Storing 'struct' data in a binary file

Posted on 2024-09-08 10:20:45

I need to store a binary file with a 12-byte header composed of 4 fields: sSamples (4-byte integer), sSampPeriod (4-byte integer), sSampSize (2-byte integer), and finally sParmKind (2-byte integer).
I'm using 'struct' to pack my variables into the desired fields. Now that I have them defined separately, how can I merge them all to store the 12-byte header?

import struct

sSamples        = struct.pack('i', nSamples)     # 4-byte integer
sSampPeriod     = struct.pack('i', nSampPeriod)  # 4-byte integer
sSampSize       = struct.pack('H', nSampSize)    # 2-byte integer / unsigned short
sParmKind       = struct.pack('H', 9)            # 2-byte integer / unsigned short

In addition, I have a float array npVect of dimensionality D (numpy.ndarray, float32). How can I store this vector in the same binary file, after the header?

Comments (2)

帝王念 2024-09-15 10:20:45

As Cody Brocious wrote, you can pack your entire header at once:

header = struct.pack('<iiHH', nSamples, nSampPeriod, nSampSize, nParmKind)

He also mentioned endianness, which is important if you want to pack your data so as to reliably unpack it on machines with different architectures. The < at the beginning of my format string specifies "pack this data using a little-endian convention".
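
To make the effect of the byte-order prefix concrete, here is a minimal sketch. The field values are made-up sample numbers, not taken from the question.

```python
import struct

# Hypothetical sample values, chosen only for illustration
nSamples, nSampPeriod, nSampSize, nParmKind = 100, 100000, 52, 9

little = struct.pack('<iiHH', nSamples, nSampPeriod, nSampSize, nParmKind)
big = struct.pack('>iiHH', nSamples, nSampPeriod, nSampSize, nParmKind)

# With an explicit prefix the layout is exactly 4 + 4 + 2 + 2 bytes, no padding
header_size = struct.calcsize('<iiHH')

# The same first field (100 == 0x64) is laid out in opposite byte orders
first_field_little = little[:4]
first_field_big = big[:4]

# Unpacking with the matching prefix recovers the original values
roundtrip = struct.unpack('<iiHH', little)
```

Reading the little-endian bytes back with '>' would silently produce different numbers, which is why writer and reader need to agree on the prefix.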

As for the array, you'll have to pack its length in order to determine how many values to unpack when you read it again. Doing it all in one call:

flattened = npVect.ravel()  # get a 1-D array of numbers
arrSize = len(flattened)
# pack header, count of numbers, and numbers, all in one call
packed = struct.pack('<iiHHi%df' % arrSize,
    nSamples, nSampPeriod, nSampSize, nParmKind, arrSize, *flattened)

Depending on how big your array is likely to be, you could end up with one huge string (a bytes object in Python 3) holding the entire contents of your binary file. For large arrays you might want to look into alternatives to struct that don't require building the whole file in memory.
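
One possible alternative, sketched here under assumptions: write the header with struct, then write the array's raw bytes directly from numpy, so the values are never expanded into individual Python arguments. The helper names write_record and read_record are hypothetical, and the data is assumed to be float32 stored little-endian.

```python
import io
import struct

import numpy as np

HEADER_FMT = '<iiHHi'  # four header fields plus the element count


def write_record(f, nSamples, nSampPeriod, nSampSize, nParmKind, npVect):
    """Write the header, the element count, then the raw float32 data."""
    flat = np.ascontiguousarray(npVect, dtype='<f4').ravel()
    f.write(struct.pack(HEADER_FMT, nSamples, nSampPeriod, nSampSize,
                        nParmKind, flat.size))
    f.write(flat.tobytes())


def read_record(f):
    """Read back what write_record wrote; returns (header tuple, 1-D array)."""
    raw = f.read(struct.calcsize(HEADER_FMT))
    *header, count = struct.unpack(HEADER_FMT, raw)
    data = np.frombuffer(f.read(4 * count), dtype='<f4')
    return tuple(header), data


# Usage with an in-memory buffer; a file opened in binary mode works the same way
buf = io.BytesIO()
vec = np.array([1.5, -2.0, 0.25], dtype='float32')
write_record(buf, 1, 100000, 12, 9, vec)
buf.seek(0)
header, restored = read_record(buf)
```

The array comes back as a flat vector, so the reader still has to reshape it if the original was not one-dimensional.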

Unpacking:

import struct
import numpy as np

fmt = '<iiHHi'
# unpack_from reads only the header; struct.unpack would reject the longer buffer
nSamples, nSampPeriod, nSampSize, nParmKind, arrSize = struct.unpack_from(fmt, packed)
# Start reading the floats after the packed header and count
flattened = struct.unpack_from('<%df' % arrSize, packed, struct.calcsize(fmt))
# Rebuild the array; add .reshape(...) with your own dimensions if it was not 1-D
npVect = np.array(flattened, dtype='float32')

EDIT: Oops, the array format isn't quite as simple as that :) The general idea holds, though: flatten your array into a list of numbers using any method you like, pack the number of values, then pack each value. On the other side, read the array as a flat list, then impose whatever structure you need on it.
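
A minimal sketch of that round trip, assuming a 2×3 float32 array whose shape is known to the reader by some other means (the shape itself is not stored in the packed data here):

```python
import struct

import numpy as np

original = np.arange(6, dtype='float32').reshape(2, 3)  # sample data

# Writing side: flatten, then pack the count followed by each value
flattened = original.ravel()
packed = struct.pack('<i%df' % flattened.size, flattened.size, *flattened)

# Reading side: read the count, then that many floats, as a flat tuple
(count,) = struct.unpack_from('<i', packed)
values = struct.unpack_from('<%df' % count, packed, struct.calcsize('<i'))

# Impose the structure again; the shape (2, 3) is assumed known in advance
restored = np.array(values, dtype='float32').reshape(2, 3)
```

If the shape is not known in advance, it would have to be written into the file as well, for example as extra integers ahead of the data.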

EDIT: Changed format strings to use repeat specifiers, rather than string multiplication. Thanks to John Machin for pointing it out.

EDIT: Added numpy code to flatten the array before packing and reconstruct it after unpacking.

江湖彼岸 2024-09-15 10:20:45

struct.pack returns a string (a bytes object in Python 3), so you can combine the fields simply by concatenation:

header = sSamples + sSampPeriod + sSampSize + sParmKind
assert len( header ) == 12
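
A quick check of this approach, using made-up sample values: the concatenated pieces should match a single pack call with the combined format. This sketch adds the < prefix to each format, which is not in the snippet above; without a prefix, struct uses the machine's native byte order.

```python
import struct

# Hypothetical sample values, chosen only for illustration
nSamples, nSampPeriod, nSampSize = 100, 100000, 52

sSamples = struct.pack('<i', nSamples)        # 4-byte integer
sSampPeriod = struct.pack('<i', nSampPeriod)  # 4-byte integer
sSampSize = struct.pack('<H', nSampSize)      # 2-byte unsigned short
sParmKind = struct.pack('<H', 9)              # 2-byte unsigned short

# Concatenating the individually packed fields...
header = sSamples + sSampPeriod + sSampSize + sParmKind

# ...gives the same bytes as packing all four fields in one call
combined = struct.pack('<iiHH', nSamples, nSampPeriod, nSampSize, 9)
```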