Pack into C types and obtain the binary value
I'm using the following code to pack an integer into an unsigned short:
import struct

raw_data = 40
# Pack into little endian
data_packed = struct.pack('<H', raw_data)
Now I'm trying to unpack the result as follows. I use utf-16-le since the data is encoded as little-endian.
import binascii

def get_bin_str(data):
    bin_asc = binascii.hexlify(data)
    result = bin(int(bin_asc.decode("utf-16-le"), 16))
    trimmed_res = result[2:]
    return trimmed_res

print(get_bin_str(data_packed))
Unfortunately, it throws the following error:
result = bin(int(bin_asc.decode("utf-16-le"), 16))
ValueError: invalid literal for int() with base 16: '㠲〰'
How do I properly decode the little-endian bytes into binary data?
Answers (2)
Use unpack to reverse what you packed. The data isn't UTF-encoded, so there is no reason to use UTF encodings.
unpack returns a tuple, so index it to get the single value.
To print in binary, use string formatting (for example format() or an f-string), which works best. bin() doesn't let you specify the number of binary digits to display, and the 0b prefix needs to be removed if not desired.
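The approach described in this answer can be sketched roughly as follows. The variable names and the width of 16 digits are assumptions chosen for illustration (an unsigned short is 16 bits); the answer's original snippets did not survive on this page.

```python
import struct

raw_data = 40
data_packed = struct.pack('<H', raw_data)

# unpack returns a tuple even for a single field, so take element 0.
value = struct.unpack('<H', data_packed)[0]

# Two equivalent ways to get a fixed-width binary string.
# The width 16 is an assumption: one digit per bit of an unsigned short.
as_format = format(value, '016b')
as_fstring = f'{value:016b}'

print(value)        # 40
print(as_format)    # 0000000000101000
print(as_fstring)   # 0000000000101000
print(bin(value))   # 0b101000 (no padding, and the 0b prefix is included)
```

Dropping the width (format(value, 'b')) gives the same digits as bin(value)[2:] without the manual slicing.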
You have not said what you are trying to do, so let's assume your goal is to educate yourself. (If you are trying to pack data that will be passed to another program, the only reliable test is to check if the program reads your output correctly.)
Python does not have an "unsigned short" type, so the output of struct.pack() is a byte string (a bytes object). To see what's in it, just print it: you get b'(\x00'.
What's that? Well, it is the character (, which is decimal 40 in the ASCII table, followed by a null byte. If you had used a number that does not map to a printable ASCII character, you'd see something less surprising: packing 11 the same way gives b'\x0b\x00', where 0b is the number 11 written in hex, of course.
Wait, I specified "little-endian", so why is my number on the left? The answer is, it's not. Python prints the byte string left to right because that's how English is written, but that's irrelevant. If it helps, think of strings as growing upwards: from low memory locations to high memory. The least significant byte comes first, which makes this little-endian.
Anyway, you can also look at the bytes directly: indexing or iterating over the byte string gives plain integers, here 40 and 0.
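As a rough reconstruction of the inspection steps described above (the original snippets are missing from this page, so the variable names here are assumptions), printing and indexing the packed bytes looks like this:

```python
import struct

data_packed = struct.pack('<H', 40)
print(data_packed)        # b'(\x00'  -> '(' is ASCII 40, then a null byte

# A value with no printable ASCII equivalent is shown as a hex escape.
other_packed = struct.pack('<H', 11)
print(other_packed)       # b'\x0b\x00'

# Iterating over (or indexing) a bytes object yields integers.
byte_values = list(data_packed)
print(byte_values)        # [40, 0]  -> low byte first, i.e. little-endian
print(data_packed[0])     # 40
```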
Yup, it's still there. But what about the bits, you say? For this, use bin() on each of the bytes separately. In the low byte, the two set bits you see are worth 32 and 8. Your number was less than 256, so it fits entirely in the low byte of the short you constructed.
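A minimal sketch of the per-byte view, assuming the same packed value of 40; the zero-padded variant is an addition for readability and not necessarily what the answer originally showed:

```python
import struct

data_packed = struct.pack('<H', 40)

# bin() on each byte separately; the low byte comes first.
bits_per_byte = [bin(b) for b in data_packed]
print(bits_per_byte)      # ['0b101000', '0b0']

# Same bytes padded to 8 digits each, which makes the bit positions easier to read.
padded_bits = [format(b, '08b') for b in data_packed]
print(padded_bits)        # ['00101000', '00000000']

# The two set bits in the low byte have the values 32 and 8.
print(0b100000 + 0b1000)  # 40
```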
What's wrong with your unpacking code?
Just for fun, let's see what your sequence of transformations in get_bin_str was doing.
First, binascii.hexlify() turns the two packed bytes into b'2800'. Um, all right. Not sure why you converted to hex digits, but now you have 4 bytes, not two. (28 is the number 40 written in hex, the 00 is for the null byte.) In the next step, you call decode and tell it that these 4 bytes are actually UTF-16; there's just enough for two Unicode characters, and the result is '㠲〰'. In the next step, int() is called and Python finally notices that something is wrong, but by then it does not make much difference because you are pretty far away from the number 40 you started with.
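The chain of transformations can be replayed step by step. This is only a sketch of what get_bin_str does internally; the intermediate variable names are assumptions, and the decoded characters are shown as code points to avoid console encoding issues:

```python
import binascii
import struct

data_packed = struct.pack('<H', 40)

# Step 1: hexlify turns 2 bytes into 4 ASCII hex digits.
hex_digits = binascii.hexlify(data_packed)
print(hex_digits)         # b'2800'

# Step 2: decoding those 4 ASCII bytes as UTF-16-LE pairs them up
# into two unrelated Unicode characters.
decoded = hex_digits.decode('utf-16-le')
code_points = [hex(ord(ch)) for ch in decoded]
print(code_points)        # ['0x3832', '0x3030']

# Step 3: int() cannot parse those characters as base-16 digits.
try:
    int(decoded, 16)
    parsed = True
except ValueError:
    parsed = False
print(parsed)             # False
```

The bytes 0x32 and 0x38 (the ASCII digits "2" and "8") fuse into the single code unit 0x3832, which is why the error message shows characters that look nothing like the input.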
To correctly read your data as a UTF-16 character, call decode directly on the byte string you packed.
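For illustration, decoding the packed bytes directly might look like the sketch below. It assumes the two bytes really are meant as one UTF-16-LE code unit, which happens to be valid for the value 40:

```python
import struct

data_packed = struct.pack('<H', 40)

# Two bytes, little-endian: exactly one UTF-16-LE code unit.
char = data_packed.decode('utf-16-le')
print(char)         # (
print(ord(char))    # 40
```

This recovers the number only by way of a character; for numeric data, struct.unpack is the more direct route.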