Scientific dataset manipulation in Clojure: reading ByteBuffers into matrices
I'm looking to use Clojure and Incanter to process a large scientific dataset; specifically, the 0.5 degree version of this dataset (only available in binary format).
My question is: what would you recommend as an elegant way to deal with this problem in Java/Clojure? Is there a simple way to get this dataset into Incanter, or some other Java matrix package?
I managed to read the binary data into a java.nio.ByteBuffer using the following code:
(defn to-float-array [^String str]
  (-> (io/to-byte-array (io/to-file str))
      java.nio.ByteBuffer/wrap
      (.order java.nio.ByteOrder/LITTLE_ENDIAN)))
Now, I'm really struggling with how to begin manipulating this ByteBuffer as an array. I've been using Python's NumPy, which makes it very easy to manipulate these huge datasets. Here's the Python code for what I'm looking to do:
import numpy as np

# reshape the flat row vector into (time, lat_slices, lon_slices),
# then keep every other time step (indices 0, 2, ..., 22)
rain_data = np.fromfile("path/to/file", dtype="f")
rain_data = rain_data.reshape(24, 360, 720)
rain_data = rain_data[0:23:2, :, :]
After this slicing, I want to return a vector of these twelve arrays. (I need to manipulate them each separately as future function inputs.)
So, any advice on how to get this dataset into Incanter would be much appreciated.
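For the ByteBuffer-to-array step, one possible approach is to view the buffer as a FloatBuffer and copy it into a primitive float array. The sketch below assumes the file holds raw little-endian 32-bit floats with no header (which is what the NumPy call with dtype="f" suggests on a little-endian machine). The names bytes->floats and read-floats are hypothetical helpers introduced for illustration; they are not part of Incanter or any existing library.

```clojure
(import '(java.nio ByteBuffer ByteOrder)
        '(java.nio.file Files Paths))

(defn bytes->floats
  "Interprets a byte array as little-endian 32-bit floats and copies
  them into a new primitive float array. (Hypothetical helper.)"
  [^bytes raw]
  (let [fbuf (-> (ByteBuffer/wrap raw)
                 (.order ByteOrder/LITTLE_ENDIAN)
                 (.asFloatBuffer))
        ^floats out (float-array (.remaining fbuf))]
    ;; Bulk copy from the float view into the array.
    (.get fbuf out)
    out))

(defn read-floats
  "Reads a whole file into memory and decodes it with bytes->floats.
  Assumes the file fits in memory and has no header. (Hypothetical helper.)"
  [^String path]
  (bytes->floats (Files/readAllBytes (Paths/get path (make-array String 0)))))
```

Calling (read-floats "path/to/file") would then give a flat float array comparable to the result of np.fromfile, before any reshaping.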
Comments (1)
I don't know how to convert your ByteBuffer into an array, but here's an implementation of the reshape function:
(This works fine in my limited testing.)
If your data is in a vector r then you can implement as
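The reshape-and-slice step from the question can be sketched along the following lines. This is an illustrative sketch, not the answerer's implementation: it assumes the data is already a flat, row-major float array with time as the slowest-varying axis (the NumPy default), and it keeps each time step as a flat array of n-lat * n-lon values rather than building a nested structure. The names time-slices and every-other are hypothetical.

```clojure
(import 'java.util.Arrays)

(defn time-slices
  "Splits a flat, row-major float array of shape [n-times n-lat n-lon]
  into a vector of n-times flat float arrays, one per time step.
  (Hypothetical helper.)"
  [^floats data n-times n-lat n-lon]
  (let [slice-len (* n-lat n-lon)]
    (assert (= (alength data) (* n-times slice-len))
            "shape does not match data length")
    (mapv (fn [t]
            ;; Copy the contiguous block belonging to time step t.
            (Arrays/copyOfRange data
                                (int (* t slice-len))
                                (int (* (inc t) slice-len))))
          (range n-times))))

(defn every-other
  "Keeps slices 0, 2, 4, ... like the NumPy slice [0:23:2] on 24 steps."
  [slices]
  (vec (take-nth 2 slices)))
```

With the shape from the question, (every-other (time-slices data 24 360 720)) would yield a vector of twelve float arrays, each holding 360 * 720 values in row-major order. Each flat slice could then be passed to a matrix library together with its column count (720); how that maps onto Incanter's matrix constructor is not verified here.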