Packing into a C type and getting the binary value

Posted on 2025-02-08 22:15:20, 619 words, 4 views, 0 comments

I'm using the following code to pack an integer into an unsigned short:

import binascii
import struct

raw_data = 40

# Pack into little endian
data_packed = struct.pack('<H', raw_data)

Now I'm trying to unpack the result as follows. I use utf-16-le since the data is encoded as little-endian.

def get_bin_str(data):
    bin_asc = binascii.hexlify(data)
    result = bin(int(bin_asc.decode("utf-16-le"), 16))
    trimmed_res = result[2:]
    return trimmed_res

print(get_bin_str(data_packed))

Unfortunately, it throws the following error,

result = bin(int(bin_asc.decode("utf-16-le"), 16))
ValueError: invalid literal for int() with base 16: '㠲〰'

How do I properly decode the little-endian bytes into a binary string?


Comments (2)

百变从容 2025-02-15 22:15:20

Use unpack to reverse what you packed. The data isn't UTF-encoded so there is no reason to use UTF encodings.

>>> import struct
>>> data_packed = struct.pack('<H', 40)
>>> data_packed.hex()   # the two little-endian bytes are 0x28 (40) and 0x00 (0)
'2800'
>>> data = struct.unpack('<H',data_packed)
>>> data
(40,)

unpack returns a tuple, so index it to get the single value

>>> data = struct.unpack('<H',data_packed)[0]
>>> data
40

To print in binary, use string formatting. Either of these works well. bin() doesn't let you specify the number of binary digits to display, and the 0b prefix has to be stripped if you don't want it.

>>> format(data,'016b')
'0000000000101000'
>>> f'{data:016b}'
'0000000000101000'
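
The steps above can be folded into one small helper. This is only a sketch: the name unpack_to_bits and its width parameter are made up for illustration and are not part of the struct module.

```python
import struct

def unpack_to_bits(packed: bytes, width: int = 16) -> str:
    """Unpack a little-endian unsigned short and return a zero-padded bit string."""
    (value,) = struct.unpack('<H', packed)  # unpack always returns a tuple
    return format(value, f'0{width}b')

data_packed = struct.pack('<H', 40)
print(unpack_to_bits(data_packed))  # 0000000000101000
```

Unpacking with the same format string that was used for packing keeps the byte order consistent without any manual handling.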
难理解 2025-02-15 22:15:20

You have not said what you are trying to do, so let's assume your goal is to educate yourself. (If you are trying to pack data that will be passed to another program, the only reliable test is to check if the program reads your output correctly.)

Python does not have an "unsigned short" type, so the output of struct.pack() is a bytes object (an immutable sequence of bytes). To see what's in it, just print it:

>>> import struct
>>> data_packed = struct.pack('<H', 40)
>>> print(data_packed)
b'(\x00'

What's that? It is the character (, which is decimal 40 in the ASCII table, followed by a null byte. If you had used a number that does not map to a printable ASCII character, you'd see something less surprising:

>>> struct.pack("<H", 11)
b'\x0b\x00'

Where 0b is 11 in hex, of course. Wait, I specified "little-endian", so why is my number on the left? The answer is, it's not. Python prints the byte string left to right because that's how English is written, but that's irrelevant. If it helps, think of strings as growing upwards: From low memory locations to high memory. The least significant byte comes first, which makes this little-endian.
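
To make the byte order concrete, here is a small sketch that packs the same value both ways and reads it back with int.from_bytes. The variable names are only illustrative.

```python
import struct

value = 40
little = struct.pack('<H', value)  # least significant byte first
big = struct.pack('>H', value)     # most significant byte first

print(little)  # b'(\x00'
print(big)     # b'\x00('

# Reading the bytes back with the matching order recovers the value.
assert int.from_bytes(little, 'little') == value
assert int.from_bytes(big, 'big') == value

# Reading with the wrong order gives a different number: 0x2800 == 10240.
wrong = int.from_bytes(little, 'big')
print(wrong)  # 10240
```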

Anyway, you can also look at the bytes directly:

>>> print(data_packed[0])
40

Yup, it's still there. But what about the bits, you say? For this, use bin() on each of the bytes separately:

>>> bin(data_packed[0])
'0b101000'
>>> bin(data_packed[1])
'0b0'

The two set bits you see are worth 32 and 8 (32 + 8 = 40). Your number was less than 256, so it fits entirely in the low byte of the short you constructed.
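
If you want all 16 bits in one string, one way (a sketch, not the only option) is to pad each byte to 8 bits and write the high byte on the left, the way numbers are normally written:

```python
import struct

data_packed = struct.pack('<H', 40)
low, high = data_packed[0], data_packed[1]  # little-endian: low byte comes first

# Pad each byte to 8 bits and place the high byte on the left.
bits = f'{high:08b}{low:08b}'
print(bits)  # 0000000000101000

# The same relationship as arithmetic: the high byte is worth 256 times its value.
assert (high << 8) | low == 40
```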

What's wrong with your unpacking code?

Just for fun let's see what your sequence of transformations in get_bin_str was doing.

>>> import binascii
>>> binascii.hexlify(data_packed)
b'2800'

Um, all right. Not sure why you converted to hex digits, but now you have 4 bytes, not 2: the ASCII characters '2', '8', '0', '0'. (28 is the number 40 written in hex; the 00 is the null byte.) In the next step, you call decode and tell it that these 4 bytes are actually UTF-16. That is just enough for two Unicode characters. Let's take a look:

>>> b'2800'.decode("utf-16-le")
'㠲〰'

In the next step, int() finally notices that something is wrong, because neither character is a valid hex digit. By then it does not make much difference, since you are already far away from the number 40 you started with.
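
For comparison, here is one possible rewrite of get_bin_str that skips the hex and UTF-16 detour. It is a sketch that assumes the input holds a little-endian unsigned integer; the width parameter is an addition for illustration.

```python
import binascii
import struct

def get_bin_str(data: bytes, width: int = 16) -> str:
    """Return the bits of a little-endian unsigned integer held in data."""
    value = int.from_bytes(data, 'little')  # respects the byte order
    return format(value, f'0{width}b')

data_packed = struct.pack('<H', 40)
print(get_bin_str(data_packed))  # 0000000000101000

# Decoding the hex digits as ASCII runs without error, but int() then
# reads them in big-endian order, so the result is 0x2800, not 40.
hex_text = binascii.hexlify(data_packed).decode('ascii')  # '2800'
print(int(hex_text, 16))  # 10240
```

Note that simply swapping "utf-16-le" for "ascii" in the original function would make the error go away while still returning the wrong bits.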

To correctly read your data as a UTF-16 character, call decode directly on the byte string you packed.

>>> data_packed.decode("utf-16-le")
'('
>>> ord('(')
40
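
One caveat worth checking for yourself: the UTF-16 route only works when the packed value happens to be a valid UTF-16 code unit. The sketch below uses 0xD800, a lone surrogate, as an example of a value that does not decode.

```python
import struct

# 40 is a valid UTF-16 code unit, so this decodes to a single character.
assert struct.pack('<H', 40).decode('utf-16-le') == '('

# 0xD800 is a lone surrogate, which is not valid UTF-16 on its own.
try:
    struct.pack('<H', 0xD800).decode('utf-16-le')
    decoded = True
except UnicodeDecodeError:
    decoded = False

print(decoded)  # False
```

For arbitrary 16-bit values, struct.unpack or int.from_bytes is the safer choice.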