Mapping a matrix in a binary file into Perl
I have a 14 MB file with a matrix, in raw binary format. I would like to slurp it and have something like an array of arrays, so I can read some values. I was hoping to find some magical Perl module that would, given the size of the matrix, do all the work for me :)
But I can't find it, and I suspect I'm just missing a more obvious way of doing it. PDL::IO::FlexRaw is close to what I need, although I'm a bit confused about the warning with strange characters added by F77.
The matrix is in a binary file, in raw format, stored as 64-bit floats. The first eight bytes of the file are the first "cell" of the matrix, (1,1). The next eight bytes are the second cell, (2,1). It has no header and no footer. I know its dimensions, so I can tell the module "I have a row for every 64000 bytes".
I'm looking at Tie::MmapArray, but I don't know if I can make it work. Maybe I would be better off using lseek() back and forth to find the eight bytes I need and then unpack() them?
What is the best way of doing that?
Comments (3)
Unless you're tight on memory, just read the whole file in.
That should access (2,1). (I didn't test it, though...)
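The code for this approach is not shown here. A minimal sketch of the slurp-and-unpack idea follows, under several assumptions: the values are native-endian doubles (the `d` template), the first index varies fastest as the question describes, and the file name, the tiny dimensions, and the `cell` helper are all hypothetical. For the real file the stride would be 8000 values (64000 bytes / 8).

```perl
use strict;
use warnings;

# Hypothetical demo dimensions; the real file would use $stride = 8000.
my $stride = 4;                  # values per stride (fastest-varying index)
my $count  = 3;                  # number of strides in the file
my $file   = "matrix_demo.bin";  # hypothetical file name

# Write a small sample file so the sketch runs: cell (i,j) holds i + 10*j.
open my $out, '>:raw', $file or die "open $file: $!";
for my $j (1 .. $count) {
    print {$out} pack('d', $_ + 10 * $j) for 1 .. $stride;
}
close $out or die "close $file: $!";

# Slurp the whole file and unpack it as native-endian doubles.
open my $in, '<:raw', $file or die "open $file: $!";
my $raw = do { local $/; <$in> };
close $in;
my @flat = unpack 'd*', $raw;

# Look up cell (i, j), 1-based, with i varying fastest.
sub cell {
    my ($i, $j) = @_;
    return $flat[ ($j - 1) * $stride + ($i - 1) ];
}

printf "(2,1) = %g\n", cell(2, 1);

unlink $file;                    # remove the demo file
```

If the file was written on a machine with a different byte order, `d<` or `d>` (Perl 5.10 and later) would be needed instead of `d`.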
EDIT:
Ok, low memory version:
Just needs Sys::Mmap from CPAN.
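The low-memory code is likewise not shown here. The following is only a sketch of how Sys::Mmap might be used for this, not the answerer's actual code. It assumes native-endian doubles and the same index order as the question; the file name, dimensions, and `cell_mmap` helper are hypothetical. Passing a length of 0 asks `mmap` to map the whole file.

```perl
use strict;
use warnings;
use Sys::Mmap;                   # CPAN module, not part of core Perl

# Hypothetical demo dimensions; the real file would use $stride = 8000.
my $stride = 4;
my $count  = 3;
my $file   = "matrix_mmap_demo.bin";

# Write a small sample file: cell (i,j) holds i + 10*j.
open my $out, '>:raw', $file or die "open $file: $!";
for my $j (1 .. $count) {
    print {$out} pack('d', $_ + 10 * $j) for 1 .. $stride;
}
close $out or die "close $file: $!";

# Map the file read-only; $map then behaves like a string over the file.
open my $fh, '<', $file or die "open $file: $!";
my $map;
mmap($map, 0, PROT_READ, MAP_SHARED, $fh) or die "mmap: $!";

# Copy out only the eight bytes of the requested cell and decode them.
sub cell_mmap {
    my ($i, $j) = @_;
    my $offset = 8 * (($j - 1) * $stride + ($i - 1));
    return unpack 'd', substr($map, $offset, 8);
}

my $v21 = cell_mmap(2, 1);
my $v43 = cell_mmap(4, 3);
printf "(2,1) = %g, (4,3) = %g\n", $v21, $v43;

munmap($map) or die "munmap: $!";
close $fh;
unlink $file;                    # remove the demo file
```

Only the pages that are actually touched get read from disk, so memory use stays small even when the lookups are scattered across the file.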
A look at pack and unpack (especially unpack) might put you on the right track; look at the b format.
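As a sketch of the seek-then-unpack idea raised in the question: Perl's built-in `seek` and `read` can fetch just the eight bytes of one cell, and `unpack` can decode them. Note that `b` is the bit-string template; for a 64-bit float the `d` template (native-endian double) is the one that yields a number, which is what this sketch assumes. The file name, dimensions, and `read_cell` helper are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical demo dimensions; the real file would use $stride = 8000.
my $stride = 4;
my $count  = 3;
my $file   = "matrix_seek_demo.bin";

# Write a small sample file: cell (i,j) holds i + 10*j.
open my $out, '>:raw', $file or die "open $file: $!";
for my $j (1 .. $count) {
    print {$out} pack('d', $_ + 10 * $j) for 1 .. $stride;
}
close $out or die "close $file: $!";

open my $fh, '<:raw', $file or die "open $file: $!";

# Seek to the byte offset of cell (i, j), read eight bytes, decode a double.
sub read_cell {
    my ($handle, $i, $j) = @_;
    my $offset = 8 * (($j - 1) * $stride + ($i - 1));
    seek($handle, $offset, 0) or die "seek: $!";
    my $got = read($handle, my $buf, 8);
    die "short read at offset $offset" unless defined $got && $got == 8;
    return unpack 'd', $buf;
}

my $value = read_cell($fh, 2, 1);
my $last  = read_cell($fh, 4, 3);
printf "(2,1) = %g, (4,3) = %g\n", $value, $last;

close $fh;
unlink $file;                    # remove the demo file
```

This keeps memory use constant at the cost of one seek and one read per lookup, which is reasonable when only a few values are needed.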
Without knowing the structure of your file, how could any library hope to read it? If it's some kind of standardized matrix binary format, then you could try searching CPAN for that. Otherwise, I'm guessing you'll have to do the work yourself.
Assuming it's not a sparse matrix, it's probably just a matter of reading in the dimensions, and then reading in appropriately sized blocks.