Python file read in "rb" mode returns a byte string that has bytes with more than 8 bits and non-hex characters?

Posted on 2025-02-03 01:07:22

I was experimenting with reading a file in byte mode in Python (with an image file). Here is my simple code:

# Open in binary mode; the with block closes the file when done
with open("./img.jpg", "rb") as f:
    print(f.read())

The printed result was a massive byte string. Below is just an excerpt (most likely not from the beginning of the string, since so many bytes were printed that my console window won't let me scroll any higher than the 9d@ you see at the start of the line below):

9d@\xe7\x8a\xfe\xc3<6\xc8 @9\xf9Td\x1c\x8c\x91\xff\x00\xd7\xc7\xa5\x7f\x9f\xb7\xed\xd5\xe3\xfdo\xe1\xd7\xede\xfb2x\xeb\xc3\xb8\x97\xc4?\t\xbc9}\xf1w\xc3\xb6d\xed[\xbdg\xc2?\x104\xcdr\xda\xc2F\x19"\x1b\xd6\xf0\x9cVr\x0c|\xc9p\xcb\xc8$\x0f\xef\x7f\xe1g\x8b4\x1f\x1dx?\xc2\x9e6\xf0\xb5\xfcz\xa7\x85\xbcg\xe1\xbd\x07\xc5\xfe\x18\xd4\xe2`\xd1\xea>\x1d\xf1>\x95g\xaeh7\xe8W#\x13iZ\x85\x9c\x98\xcf\x1ef99\xaf;4\x8bT\xb0\x13_\xc9$\xfd}\xa5I/\xfc\x95\xb5\xe9\x16~S\xc5\x98\x1cD3,\xc7\x1dV?\xec\xb8\xfa\xdc\x94Z\xea\xf0\xd40\xd4\xeb+\xbd\xdcy\xe9\xc9\xa5\xb2\x9c{\x9e\xbf\x1f\xf0\xfd?\x98&\xad*\x06\x07?O\xe4j\x94a\x8eq\xdb\xf4\xab\xb1\xe7\xa1\xf4\xcf\xe3\xc75\xe4\xafyZ\xfa\xee~uSM<\xc5\x11(\xe8?\x1e\xff\x00\x81\xa7\x81\x81\x8aZ*\x0e^}t

Now, I've noticed something interesting. Some of these "bytes" have non-hex characters, and some have more than 2 characters (even though 1 byte in hex only has 2 characters). For example, a valid byte would be something like \x8a (which can be seen towards the beginning of the string). However, this string also has stuff like \xc3<6 or 9d@ or \xf9Td. As can be seen in these examples, they feature characters like '@' or '<' or 'T' which aren't hex characters, and these examples are also more than 2 characters long.

How am I to interpret this? Are all of these "bytes" even really supposed to be viewed as bytes? Does this have something to do with the file format? Perhaps this is not hex after all? Can someone please help me make sense of byte strings like this?

Comments (1)

假情假意假温柔 2025-02-10 01:07:22

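f.read() returns a bytes object, and print implicitly converts it to its string representation. In that representation, each byte whose value is a printable ASCII character is shown as that character, and every other byte is shown as an escape: usually \xNN in hex, or a short form such as \t. So \xc3<6 is not one oversized byte but three bytes: 0xc3, then '<' (0x3c), then '6' (0x36). Everything in the output is a byte; only the notation varies.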
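To make this concrete, here is a minimal sketch that takes the `\xc3<6` excerpt from the question and inspects it byte by byte. The variable names `chunk` and `mixed` are only illustrative and are not part of the original post.

```python
# The excerpt "\xc3<6" from the question, written as a bytes literal.
chunk = b"\xc3<6"

# It holds three separate bytes, not one byte with extra characters.
print(len(chunk))     # 3
print(list(chunk))    # [195, 60, 54]

# The same three bytes shown uniformly in hex.
print(chunk.hex())    # c33c36

# Printable ASCII bytes and \xNN escapes are two spellings of the same data.
print(b"\x3c\x36" == b"<6")   # True

# A tab, a printable letter, and a non-ASCII byte in a single repr.
mixed = bytes([0x09, 0x41, 0xFF])
print(repr(mixed))    # b'\tA\xff'
```

If a uniform view is preferred over the mixed representation, calling `.hex()` on the result of `f.read()` (or iterating over it to get integers) avoids the ASCII shorthand altogether.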
