How to read a file's binary data in Python

Posted on 2024-12-08 11:15:18

In Python, when I try to read in an executable file with 'rb', instead of getting the binary values I expected (0010001 etc.), I'm getting a series of letters and symbols that I do not know what to do with.

Ex: ???}????l?S??????V?d?\?hG???8?O=(A).e??????B??$????????:    ???Z?C'???|lP@.\P?!??9KRI??{F?AB???5!qtWI??8???????!ᢉ?]?zъeF?̀z??/?n??

How would I access the binary numbers of a file in Python?

Any suggestions or help would be appreciated. Thank you in advance.

Comments (5)

春庭雪 2024-12-15 11:15:18

That is the binary data. The file's contents are stored as bytes, and when you print them, each byte is interpreted as an ASCII character.

You can use the bin() function and the ord() function to see the actual binary codes.

# Python 2: data is the str returned by open(path, 'rb').read()
for value in data:
    print bin(ord(value))
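
The loop above is written for Python 2, where reading in 'rb' mode returns a str. As a minimal sketch for Python 3, where the same read returns a bytes object: iterating over bytes already yields integers, so ord() is not needed. The helper name byte_binary_codes and the sample bytes are made up for illustration.

```python
def byte_binary_codes(data):
    """Return the bin() string for every byte in data (Python 3)."""
    # Iterating over a bytes object yields ints in the range 0-255.
    return [bin(value) for value in data]


# Sample bytes stand in for the result of open(path, "rb").read().
sample = b"II*"
for code in byte_binary_codes(sample):
    print(code)
```
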
扛刀软妹 2024-12-15 11:15:18

In Python 2, byte sequences are represented using strings (Python 3 has a separate bytes type). The series of letters and symbols that you see when you print out a byte sequence is merely a printable representation of the bytes that the string contains. To make use of this data, you usually manipulate it in some way to obtain a more useful representation.

You can use ord(x) or bin(x) to obtain decimal and binary representations, respectively:

>>> f = open('/tmp/IMG_5982.JPG', 'rb')
>>> data = f.read(10)
>>> data
'\x00\x00II*\x00\x08\x00\x00\x00'

>>> data[2]
'I'

>>> ord(data[2])
73

>>> hex(ord(data[2]))
'0x49'

>>> bin(ord(data[2]))
'0b1001001'

>>> f.close()

The 'b' flag that you pass to open() does not tell Python anything about how to represent the file contents. From the docs:

Append 'b' to the mode to open the file in binary mode, on systems that differentiate between binary and text files; on systems that don’t have this distinction, adding the 'b' has no effect.

Unless you just want to look at what the binary data from the file looks like, Mark Pilgrim's book, Dive Into Python, has an example of working with binary file formats. The example shows how you can read ID3v1 tags from an MP3 file. The book's website seems to be down, so I'm linking to a mirror.
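
For reference, here is a minimal sketch of that kind of parsing. It is not the code from Dive Into Python. It assumes Python 3 and the ID3v1 layout: a 128-byte block at the end of the file that starts with TAG, followed by fixed-width title, artist, album, year, and comment fields and a one-byte genre. The names parse_id3v1 and read_id3v1 are hypothetical.

```python
import os
import struct

# "<" disables alignment padding; the field widths add up to 128 bytes.
ID3V1_FORMAT = "<3s30s30s30s4s30sB"


def parse_id3v1(block):
    """Parse a 128-byte ID3v1 block; return a dict, or None if it is not a tag."""
    if len(block) != struct.calcsize(ID3V1_FORMAT) or block[:3] != b"TAG":
        return None
    _, title, artist, album, year, comment, genre = struct.unpack(ID3V1_FORMAT, block)

    def text(field):
        # Text fields are padded with NUL bytes or spaces; latin-1 can decode any byte.
        return field.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
        "comment": text(comment),
        "genre": genre,  # numeric genre code, left undecoded here
    }


def read_id3v1(path):
    """Read the last 128 bytes of the file at path and try to parse them."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() < 128:
            return None  # too short to hold a tag
        f.seek(-128, os.SEEK_END)
        return parse_id3v1(f.read(128))
```

The point is the same as in the session above: once the raw bytes are in hand, a format description (here a struct format string) turns them into a more useful representation.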

我乃一代侩神 2024-12-15 11:15:18

Each character in the string is the ASCII representation of a binary byte. If you want it as a string of zeros and ones then you can convert each byte to an integer, format it as 8 binary digits and join everything together:

>>> s = "hello world"
>>> ''.join("{0:08b}".format(ord(x)) for x in s)
'0110100001100101011011000110110001101111001000000111011101101111011100100110110001100100'
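
The one-liner above relies on Python 2 strings. A minimal Python 3 sketch of the same idea, assuming the input is a bytes object such as the result of reading a file opened with 'rb' (the function name to_bit_string is made up for illustration):

```python
def to_bit_string(data):
    """Concatenate every byte of data as 8 binary digits, most significant bit first."""
    # Each element of a bytes object is already an int, so format() applies directly.
    return "".join(format(byte, "08b") for byte in data)


print(to_bit_string(b"hello world"))
```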

If you really need to analyse or manipulate data at the bit level, an external module such as bitstring could be helpful. Check out the docs; to just get the binary interpretation, use something like:

>>> import bitstring
>>> f = open('somefile', 'rb')
>>> b = bitstring.Bits(f)
>>> b.bin
0100100101001001...
自由如风 2024-12-15 11:15:18

Use ord(x) to get the integer value of each byte.

>>> with open('settings.dat', 'rb') as file:
...     data = file.read()
...
>>> for index, value in enumerate(data):
...     print '0x%08x 0x%02x' % (index, ord(value))
...
0x00000000 0x28
0x00000001 0x64
0x00000002 0x70
0x00000003 0x30
0x00000004 0x0d
0x00000005 0x0a
0x00000006 0x53
0x00000007 0x27
0x00000008 0x4d
0x00000009 0x41
0x0000000a 0x49
0x0000000b 0x4e
0x0000000c 0x5f
0x0000000d 0x57
0x0000000e 0x49
0x0000000f 0x4e
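
The session above is Python 2. A minimal Python 3 sketch of the same offset/value dump, assuming data is a bytes object (the function name dump_lines and the sample bytes are made up for illustration):

```python
def dump_lines(data):
    """Return one 'offset value' line per byte, both formatted as hex."""
    # enumerate() over bytes yields (index, int) pairs, so ord() is not needed.
    return ["0x%08x 0x%02x" % (index, value) for index, value in enumerate(data)]


# Sample bytes stand in for the contents of a file read with 'rb'.
for line in dump_lines(b"(dp0\r\n"):
    print(line)
```
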
记忆里有你的影子 2024-12-15 11:15:18

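If you really want to convert the binary bytes to a stream of bits, you have to remove the first two characters ('0b') from the output of bin(). The code below also reverses each byte's digits and pads them to 8 with ljust(), so the bits of each byte come out least significant first: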

with open("settings.dat", "rb") as fp:
    print "".join( (bin(ord(c))[2:][::-1]).ljust(8,"0") for c in fp.read() )

If you use Python prior to 2.6, you have no bin() function.
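
As a minimal Python 3 sketch of the same bit stream, assuming the input is a bytes object: format() with '08b' pads each byte to 8 digits, and reversing that string gives the least-significant-bit-first order that the code above produces. The function name bits_lsb_first is made up for illustration.

```python
def bits_lsb_first(data):
    """Return the bits of data as a string, least significant bit of each byte first."""
    # format(byte, "08b") is most-significant-first; [::-1] flips it per byte.
    return "".join(format(byte, "08b")[::-1] for byte in data)


print(bits_lsb_first(b"\x01"))
```

Drop the [::-1] if the most significant bit should come first instead.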
