Checksumming JPEG data (not the whole file)
Are there end-of-Exif / end-of-XMP / end-of-IPTC / start-of-data markers that I could use to compute a checksum of just the data part of a JPG/JPEG file (and of other image formats)?
I think this question is related to "Compute hash of only the core image data (excluding metadata) for an image"; https://stackoverflow.com/a/10075170/890106 gives part of an answer if you're looking for code.
It might not work with all JPEG variants, though: some of them can embed multiple images (MPF, the CIPA Multi-Picture Format; more information at http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/MPF.html), and you might still end up with some metadata. Also, some software puts a UID of the form --[0-9A-F]+-- at the end of the file, and it shouldn't be read. The safest solution is probably to checksum the pixels (though orientation, color profile, etc. can still influence the result).
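To illustrate the trailing-UID point, here is a minimal sketch that drops such a trailer before hashing. The pattern is the one quoted above; the function name `strip_trailing_uid` and the choice to tolerate trailing whitespace are my own assumptions, and this does nothing about MPF files that contain several images.

```python
import re

# Trailer of the form --HEXDIGITS-- at the very end of the file
# (pattern taken from the answer above; optional trailing whitespace is an assumption).
TRAILING_UID = re.compile(rb"--[0-9A-F]+--\s*\Z")


def strip_trailing_uid(data: bytes) -> bytes:
    """Return the file bytes without a trailing --[0-9A-F]+-- UID, if one is present."""
    return TRAILING_UID.sub(b"", data)
```

Files without such a trailer are returned unchanged, so it should be harmless to apply this before any of the checksum approaches discussed here.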
One easy way to get a hash of just the pixel data is to convert the JPEG into a 32-bit BMP, or alternatively a PNG, and calculate the hash from that. This strips all the associated information from the JPEG and would even match JPEGs with different encodings that lead to the same pixel data. You could of course also use the in-memory pixel data directly if you have it (e.g. Windows has several API functions to get it from any supported image type).
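Hashing the decoded pixels directly, without writing an intermediate BMP, could look roughly like this. It is only a sketch and assumes the third-party Pillow library is installed; the function name `pixel_hash`, the choice of SHA-256, and the normalization to RGB are my own assumptions. It does not apply the Exif orientation or the color profile, and different JPEG decoders may produce slightly different pixels for the same file.

```python
import hashlib
import io

from PIL import Image  # third-party: Pillow (assumed to be installed)


def pixel_hash(source) -> str:
    """Hash the decoded pixels of an image given a path or a binary file object."""
    with Image.open(source) as img:
        # Normalize to 8-bit RGB so the same picture hashes the same across formats.
        rgb = img.convert("RGB")
        h = hashlib.sha256()
        # Include the dimensions so a 2x8 and a 4x4 image with equal bytes differ.
        h.update(f"{rgb.width}x{rgb.height}".encode("ascii"))
        h.update(rgb.tobytes())
        return h.hexdigest()
```

Because metadata never reaches the hash, two files that differ only in Exif, XMP, or IPTC should give the same result as long as they are decoded by the same library version.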
Yes for JPEG and Exif; I don't know about the others.
The JPEG spec that I have is called JFIF (JPEG File Interchange Format). It comes from Annex B of ISO 10918-1 and, like all ISO specs, it takes careful reading to figure out how to translate the spec into data structures. I think this is much easier to follow.
The Exif format parses much like the TIFF format. Each chunk has a type and a size, so you just walk the chunks until you get to the image data chunk. It has a pointer to the image data (actually pointers to strips, but I'm pretty sure you can assume that everything from the first strip of image data to the end of the file is image data).
The Exif format has its own website.
You'll have to look at each format. For JPEG, the structure implies that you can skip the sections that start with FFEn (the APPn segments, e.g. 0xFFE1, which is where Exif and XMP live) and checksum everything else. Each marker is followed by a 2-byte big-endian length, which tells you how many bytes to skip. For more details, see here.
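Walking the marker structure could be sketched as follows. This assumes the goal is to leave metadata out of the checksum, so it skips the APPn segments (FFE0 to FFEF) and COM comments (FFFE) and hashes every other segment plus the entropy-coded scan data, stopping at the first EOI. The function names and the use of SHA-256 are my own choices. Note that skipping every APPn segment also drops ICC profiles (APP2) and the Adobe APP14 segment, which can affect how the image is rendered.

```python
import hashlib
import struct

EOI, SOS, COM, TEM = 0xD9, 0xDA, 0xFE, 0x01


def _find_scan_end(data: bytes, pos: int) -> int:
    """Return the offset of the next real marker after entropy-coded data."""
    n = len(data)
    while True:
        pos = data.find(b"\xff", pos)
        if pos == -1 or pos + 1 >= n:
            return n
        nxt = data[pos + 1]
        if nxt == 0x00 or 0xD0 <= nxt <= 0xD7:
            pos += 2  # stuffed FF00 byte or RSTn restart marker: still scan data
        elif nxt == 0xFF:
            pos += 1  # fill byte before a marker
        else:
            return pos


def jpeg_data_hash(data: bytes) -> str:
    """SHA-256 over a JPEG's segments, skipping APPn and COM, up to the first EOI."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    h = hashlib.sha256()
    pos, n = 2, len(data)
    while pos + 1 < n:
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte, keep looking for the marker code
            pos += 1
            continue
        if marker == EOI:  # anything after EOI (trailers, extra images) is ignored
            return h.hexdigest()
        if marker == TEM or 0xD0 <= marker <= 0xD7:  # markers without a length field
            pos += 2
            continue
        if pos + 4 > n:
            raise ValueError("truncated segment header")
        (length,) = struct.unpack_from(">H", data, pos + 2)  # includes its own 2 bytes
        end = pos + 2 + length
        if length < 2 or end > n:
            raise ValueError(f"bad segment length at offset {pos}")
        if not (0xE0 <= marker <= 0xEF or marker == COM):
            h.update(data[pos:end])
        pos = end
        if marker == SOS:  # compressed image data follows the SOS header
            scan_end = _find_scan_end(data, pos)
            h.update(data[pos:scan_end])
            pos = scan_end
    raise ValueError("missing EOI marker")
```

Usage would be something like `jpeg_data_hash(open("photo.jpg", "rb").read())`, where `photo.jpg` is a placeholder path. Unlike a pixel hash, this only matches files whose compressed data is byte-for-byte identical.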
Since you want to do this for various image formats, you should just use a general-purpose image decompression library and run your checksum on the uncompressed data. This will let you match identical images even if they are encoded differently on disk.
If you want to limit yourself to JPEG, you can checksum the data between SOI and EOI. This answer can be slightly adapted to do what you need.
MediaTags has checksum support for JPEG, MP3, M4A, etc.