Python file reading in "rb" mode returns a byte string containing bytes with more than 8 bits and non-hex characters?

Posted on 2025-02-03 01:07:22 | Word count: 1071 | Views: 1 | Comments: 0

I was experimenting with reading a file in byte mode in Python (with an image file). Here is my simple code:

# Open the image in binary mode; the with block closes the file afterwards.
with open("./img.jpg", "rb") as f:
    print(f.read())

The result that printed was a massive byte string. Below is just an excerpt. It is most likely not from the beginning of the string: so many bytes were printed that my console window won't let me scroll any higher than the 9d@ at the start of the line below.

9d@\xe7\x8a\xfe\xc3<6\xc8 @9\xf9Td\x1c\x8c\x91\xff\x00\xd7\xc7\xa5\x7f\x9f\xb7\xed\xd5\xe3\xfdo\xe1\xd7\xede\xfb2x\xeb\xc3\xb8\x97\xc4?\t\xbc9}\xf1w\xc3\xb6d\xed[\xbdg\xc2?\x104\xcdr\xda\xc2F\x19"\x1b\xd6\xf0\x9cVr\x0c|\xc9p\xcb\xc8$\x0f\xef\x7f\xe1g\x8b4\x1f\x1dx?\xc2\x9e6\xf0\xb5\xfcz\xa7\x85\xbcg\xe1\xbd\x07\xc5\xfe\x18\xd4\xe2`\xd1\xea>\x1d\xf1>\x95g\xaeh7\xe8W#\x13iZ\x85\x9c\x98\xcf\x1ef99\xaf;4\x8bT\xb0\x13_\xc9$\xfd}\xa5I/\xfc\x95\xb5\xe9\x16~S\xc5\x98\x1cD3,\xc7\x1dV?\xec\xb8\xfa\xdc\x94Z\xea\xf0\xd40\xd4\xeb+\xbd\xdcy\xe9\xc9\xa5\xb2\x9c{\x9e\xbf\x1f\xf0\xfd?\x98&\xad*\x06\x07?O\xe4j\x94a\x8eq\xdb\xf4\xab\xb1\xe7\xa1\xf4\xcf\xe3\xc75\xe4\xafyZ\xfa\xee~uSM<\xc5\x11(\xe8?\x1e\xff\x00\x81\xa7\x81\x81\x8aZ*\x0e^}t

Now, I've noticed something interesting. Some of these "bytes" contain non-hex characters, and some are more than 2 characters long (even though one byte written in hex is only 2 digits). For example, a valid byte would be something like \x8a (which can be seen towards the beginning of the string). However, this string also has things like \xc3<6 or 9d@ or \xf9Td. These examples contain characters like '@', '<' or 'T', which aren't hex digits, and they are also more than 2 characters long.

How am I to interpret this? Are all of these "bytes" even really supposed to be viewed as bytes? Does this have something to do with the file format? Perhaps this is not hex after all? Can someone please help me make sense of byte strings like this?
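One way to check how such output is meant to be read is to look at a few of the bytes as plain integers. The sketch below uses a short excerpt copied from the output above (the variable names are only illustrative). It relies on documented Python behaviour: every element of a bytes object is an integer from 0 to 255, and the printed form shows a byte as a literal character when it falls in the printable ASCII range and as a \xNN escape otherwise. Under that reading, 9d@ would be three separate bytes rather than one oversized byte.

```python
# A short excerpt copied from the printed output in the question.
sample = b"9d@\xe7\x8a\xfe\xc3<6\xc8 @9\xf9Td"

# Iterating over a bytes object yields one integer (0-255) per byte.
values = list(sample)

# Show every byte as two hex digits, whether or not it is printable ASCII.
hex_view = " ".join(f"{b:02x}" for b in sample)

print(len(sample))   # number of bytes in the excerpt
print(values[:3])    # integer values of '9', 'd', '@'
print(hex_view)

# The escaped and the literal spelling denote the same single byte.
print(b"\x40" == b"@")
```

In the hex view, \xc3<6 shows up as c3 3c 36, that is, three bytes whose last two happen to be printable.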
