Python file read in "rb" mode returns a byte string whose bytes seem to have more than 8 bits and non-hex characters?
I was experimenting with reading a file in byte mode in Python (with an image file). Here is my simple code:
f = open("./img.jpg", "rb")
print(f.read())
The result that printed was a massive byte string. Below is just an excerpt (most likely not from the beginning of the string, since so many bytes were printed that my console window won't let me scroll any higher than the 9d@ that you see at the beginning of the line below):
9d@\xe7\x8a\xfe\xc3<6\xc8 @9\xf9Td\x1c\x8c\x91\xff\x00\xd7\xc7\xa5\x7f\x9f\xb7\xed\xd5\xe3\xfdo\xe1\xd7\xede\xfb2x\xeb\xc3\xb8\x97\xc4?\t\xbc9}\xf1w\xc3\xb6d\xed[\xbdg\xc2?\x104\xcdr\xda\xc2F\x19"\x1b\xd6\xf0\x9cVr\x0c|\xc9p\xcb\xc8$\x0f\xef\x7f\xe1g\x8b4\x1f\x1dx?\xc2\x9e6\xf0\xb5\xfcz\xa7\x85\xbcg\xe1\xbd\x07\xc5\xfe\x18\xd4\xe2`\xd1\xea>\x1d\xf1>\x95g\xaeh7\xe8W#\x13iZ\x85\x9c\x98\xcf\x1ef99\xaf;4\x8bT\xb0\x13_\xc9$\xfd}\xa5I/\xfc\x95\xb5\xe9\x16~S\xc5\x98\x1cD3,\xc7\x1dV?\xec\xb8\xfa\xdc\x94Z\xea\xf0\xd40\xd4\xeb+\xbd\xdcy\xe9\xc9\xa5\xb2\x9c{\x9e\xbf\x1f\xf0\xfd?\x98&\xad*\x06\x07?O\xe4j\x94a\x8eq\xdb\xf4\xab\xb1\xe7\xa1\xf4\xcf\xe3\xc75\xe4\xafyZ\xfa\xee~uSM<\xc5\x11(\xe8?\x1e\xff\x00\x81\xa7\x81\x81\x8aZ*\x0e^}t
Now, I've noticed something interesting. Some of these "bytes" have non-hex characters, and some have more than 2 characters (even though 1 byte in hex only has 2 characters). For example, a valid byte would be something like \x8a (which can be seen towards the beginning of the string). However, this string also has stuff like \xc3<6 or 9d@ or \xf9Td. As can be seen in these examples, they feature characters like '@' or '<' or 'T' which aren't hex characters, and these examples are also more than 2 characters long.
How am I to interpret this? Are all of these "bytes" even really supposed to be viewed as bytes? Does this have something to do with the file format? Perhaps this is not hex after all? Can someone please help me make sense of byte strings like this?
Comments (1)
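f.read() returns a bytes object. print implicitly converts it to its string representation. In that representation, every byte whose value is a printable ASCII character is shown as that character, and every other byte is shown as an escape: usually a \xNN hex escape, or a short escape such as \t for a tab. So \xc3<6 is not one oversized byte but three bytes: 0xc3, then 0x3c (shown as '<'), then 0x36 (shown as '6'). The format of the file has nothing to do with it; this is just how Python displays bytes.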
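To make this concrete, here is a small sketch using two fragments copied from the excerpt in the question (\xc3<6 and \xf9Td). The variable names are made up for illustration; the point is only that each printable character in the output stands for exactly one byte.

```python
# Two fragments from the question's output, joined into one bytes object.
chunk = b"\xc3<6\xf9Td"

# Six bytes in total: each \xNN escape is one byte, and each plain
# character ('<', '6', 'T', 'd') is also one byte.
byte_count = len(chunk)

# Iterating over bytes yields integers in the range 0..255.
values = list(chunk)

# The same data rendered purely as two-digit hex, one pair per byte.
as_hex = " ".join(f"{b:02x}" for b in chunk)

print(byte_count)  # 6
print(values)      # [195, 60, 54, 249, 84, 100]
print(as_hex)      # c3 3c 36 f9 54 64
```

If you would rather see the whole file as hex with nothing shown as ASCII, calling .hex() on the result of f.read() gives a plain hex string instead.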