How to read in the binary of a file in Python

Posted 2024-12-08 11:15:18

In Python, when I try to read in an executable file with 'rb', instead of getting the binary values I expected (0010001 etc.), I get a series of letters and symbols that I do not know what to do with.

Ex: ???}????l?S??????V?d?\?hG???8?O=(A).e??????B??$????????:    ???Z?C'???|lP@.\P?!??9KRI??{F?AB???5!qtWI??8???????!ᢉ?]?zъeF?̀z??/?n??

How would I access the binary numbers of a file in Python?
Any suggestions or help would be appreciated. Thank you in advance.

Comments (5)

春庭雪
2024-12-15 11:15:18

That is the binary. The data is stored as bytes, and when you print it, the bytes are interpreted as ASCII characters.
You can use the bin() function and the ord() function to see the actual binary codes.
# Python 2: iterating over the data yields one-character strings
for value in data:
    print bin(ord(value))
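
On Python 3 the same idea needs no ord() call: a file opened with 'rb' returns a bytes object, and iterating over it yields integers directly. A minimal sketch, assuming Python 3; the helper name byte_values_as_binary and the sample data are made up for illustration:

```python
def byte_values_as_binary(data):
    """Return one zero-padded 8-bit binary string per byte of `data`."""
    # In Python 3, iterating over bytes yields ints in range(256),
    # so ord() is unnecessary (and would raise TypeError on an int).
    return [format(byte, "08b") for byte in data]


# Stand-in for `open(path, "rb").read()`, so the sketch runs without a file.
data = b"\x00\x00II*\x00"
for bits in byte_values_as_binary(data):
    print(bits)
```

Note that bin(73) gives '0b1001001' with no leading zeros, while format(73, "08b") pads the result to a full byte.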

扛刀软妹
2024-12-15 11:15:18

In Python 2, byte sequences are represented using strings. The series of letters and symbols that you see when you print out a byte sequence is merely a printable representation of the bytes that the string contains. To make use of this data, you usually manipulate it in some way to obtain a more useful representation.
You can use ord(x) to obtain the decimal value of a byte, and hex() or bin() on that value to obtain its hexadecimal or binary representation:
>>> f = open('/tmp/IMG_5982.JPG', 'rb')
>>> data = f.read(10)
>>> data
'\x00\x00II*\x00\x08\x00\x00\x00'

>>> data[2]
'I'

>>> ord(data[2])
73

>>> hex(ord(data[2]))
'0x49'

>>> bin(ord(data[2]))
'0b1001001'

>>> f.close()

The 'b' flag that you pass to open() does not tell Python anything about how to represent the file contents. From the docs:

Append 'b' to the mode to open the file in binary mode, on systems that differentiate between binary and text files; on systems that don’t have this distinction, adding the 'b' has no effect.

Unless you just want to look at what the binary data from the file looks like, Mark Pilgrim's book, Dive Into Python, has an example of working with binary file formats. The example shows how you can read ID3v1 tags from an MP3 file. The book's website seems to be down, so I'm linking to a mirror.
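
As a rough illustration of working with a binary file format, the sketch below parses an ID3v1 tag with the standard struct module. It is not the code from Dive Into Python. It assumes Python 3 and the usual ID3v1 layout (the last 128 bytes of the file: the marker 'TAG', 30-byte title, artist and album fields, a 4-byte year, a 30-byte comment and a 1-byte genre). The names parse_id3v1 and read_id3v1 are hypothetical, and decoding the text fields as Latin-1 is an assumption.

```python
import struct

# ID3v1 layout: 3s marker, 30s title, 30s artist, 30s album,
# 4s year, 30s comment, 1 unsigned byte genre = 128 bytes in total.
ID3V1 = struct.Struct("<3s30s30s30s4s30sB")


def parse_id3v1(block):
    """Parse a 128-byte ID3v1 block; return a dict, or None if there is no tag."""
    if len(block) != ID3V1.size:
        return None
    marker, title, artist, album, year, comment, genre = ID3V1.unpack(block)
    if marker != b"TAG":
        return None

    def text(raw):
        # Fields are padded with NUL bytes (sometimes spaces).
        # Latin-1 is assumed here; real files may use other encodings.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
        "comment": text(comment),
        "genre": genre,
    }


def read_id3v1(path):
    """Read the last 128 bytes of the file at `path` and parse them."""
    with open(path, "rb") as f:
        f.seek(0, 2)  # move to the end to learn the file size
        if f.tell() < ID3V1.size:
            return None
        f.seek(-ID3V1.size, 2)  # 128 bytes before the end
        return parse_id3v1(f.read(ID3V1.size))
```

The point of the example is that struct turns raw bytes into typed fields in one step, which is usually more useful than looking at individual bits.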

我乃一代侩神
2024-12-15 11:15:18

Each character in the string is the ASCII representation of a binary byte. If you want a string of zeros and ones, you can convert each byte to an integer, format it as 8 binary digits and join everything together:
>>> s = "hello world"
>>> ''.join("{0:08b}".format(ord(x)) for x in s)
'0110100001100101011011000110110001101111001000000111011101101111011100100110110001100100'

If you really need to analyse or manipulate data at the binary level, an external module such as bitstring could be helpful. Check out the docs; to just get the binary interpretation, use something like:
>>> import bitstring
>>> f = open('somefile', 'rb')
>>> b = bitstring.Bits(f)
>>> b.bin
0100100101001001...
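
The join-based approach carries over to Python 3 with bytes input, and the conversion can be reversed. A minimal sketch using only the standard library, assuming Python 3; the function names are invented for this example:

```python
def bytes_to_bitstring(data):
    """Concatenate every byte of `data` as 8 binary digits, most significant bit first."""
    return "".join(format(byte, "08b") for byte in data)


def bitstring_to_bytes(bits):
    """Inverse of bytes_to_bitstring; the length of `bits` must be a multiple of 8."""
    if len(bits) % 8:
        raise ValueError("bit string length must be a multiple of 8")
    # Slice the string into 8-character groups and parse each as base 2.
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


bits = bytes_to_bitstring(b"hello world")
print(bits)
print(bitstring_to_bytes(bits))
```

Padding each byte to exactly 8 digits is what makes the round trip possible; without it the byte boundaries would be lost.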

自由如风
2024-12-15 11:15:18

Use ord(x) to get the integer value of each byte.
>>> with open('settings.dat', 'rb') as file:
...     data = file.read()
...
>>> for index, value in enumerate(data):
...     print '0x%08x 0x%02x' % (index, ord(value))
...
0x00000000 0x28
0x00000001 0x64
0x00000002 0x70
0x00000003 0x30
0x00000004 0x0d
0x00000005 0x0a
0x00000006 0x53
0x00000007 0x27
0x00000008 0x4d
0x00000009 0x41
0x0000000a 0x49
0x0000000b 0x4e
0x0000000c 0x5f
0x0000000d 0x57
0x0000000e 0x49
0x0000000f 0x4e
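
For readers on Python 3, the same offset/value listing can be produced without ord(), since enumerate() over a bytes object already yields integers. A minimal sketch, assuming Python 3; the name hex_dump_lines is hypothetical and the sample bytes are the 16 values from the listing above:

```python
def hex_dump_lines(data):
    """Yield one 'offset value' line per byte, in the same format as the listing."""
    for index, value in enumerate(data):
        # `value` is already an int on Python 3, so no ord() call is needed.
        yield "0x%08x 0x%02x" % (index, value)


sample = b"(dp0\r\nS'MAIN_WIN"  # the 16 bytes shown in the listing
for line in hex_dump_lines(sample):
    print(line)
```

If only the hexadecimal digits are needed, the built-in bytes.hex() method returns them as a single string.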

记忆里有你的影子
2024-12-15 11:15:18

If you really want to convert the binary bytes to a stream of bits, you have to remove the first two characters ('0b') from the output of bin(), reverse the result (so the least significant bit of each byte comes first) and pad it to 8 digits:
with open("settings.dat", "rb") as fp:
    print "".join( (bin(ord(c))[2:][::-1]).ljust(8,"0") for c in fp.read() )

Python versions before 2.6 have no bin() function.
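
The same least-significant-bit-first stream can be sketched for Python 3, where each element of a bytes object is already an integer. This assumes Python 3, and the function name bits_lsb_first is made up for illustration:

```python
def bits_lsb_first(data):
    """Return the bits of `data` as a string, least significant bit of each byte first."""
    # format(..., "08b") pads to 8 digits, most significant bit first;
    # reversing each 8-digit group puts the least significant bit first.
    return "".join(format(byte, "08b")[::-1] for byte in data)


print(bits_lsb_first(b"\x01\x80"))  # prints 1000000000000001
```

Padding before reversing gives the same result as reversing and then right-padding with zeros, as the one-liner above does.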
