What is the purpose of the Zip64 "end of central directory locator"?
In the Zip64 format, there is a record called the Zip64 end of central directory locator, which contains the offset of the zip64 end of central directory record. Why is this locator needed when you can search for the zip64 end of central directory record by its magic number?
EDIT: Please note that the only way to find the locator is to search for the locator's magic number. So why bother searching for the locator by its magic number in the first place, when you can search directly for the zip64 end of central directory record by its own magic number?
Comments (1)
Navigating directly to a byte offset in a file is significantly faster than searching for a magic number. Additionally, there is no guarantee that the magic number won't appear elsewhere within the data. A scan that hits such a false match would start parsing from an invalid but "assumed correct" location and read incorrect data.
After doing some additional implementation work around this myself, I think the most significant thing to note is that "special purpose data may reside in the zip64 extensible data sector field", the variable-length field at the end of the Zip64 end of central directory record. The sector may hold multiple blocks, each starting with a 2-byte header ID and a 4-byte data size, followed by the actual "special purpose data", so each block can carry up to 2^32 - 1 bytes (about 4 GB). While this may seem extreme, that much data could certainly force the archive to span disks between the "Zip64 end of central directory record" and the locator. The more data there is, the longer a scan for the signature takes, and the greater the chance of accidentally hitting the 4-byte (32-bit) "zip64 end of central directory" signature somewhere inside that data.
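To make the block layout concrete, here is a minimal sketch of walking the extensible data sector. It assumes the whole record is already in memory as `data`, that `record_pos` points at the record's signature, and that the record has 56 bytes of fixed fields with a size field that does not count the first 12 bytes (signature plus the size field itself), as I read the ZIP specification. The function name `iter_extensible_blocks` is my own, not part of any library.

```python
import struct

ZIP64_EOCD_SIG = b"PK\x06\x06"  # zip64 end of central directory record
ZIP64_EOCD_FIXED = 56           # fixed fields, including signature and size field


def iter_extensible_blocks(data: bytes, record_pos: int):
    """Yield (header_id, payload) for each block in the extensible data sector."""
    sig, size = struct.unpack_from("<4sQ", data, record_pos)
    if sig != ZIP64_EOCD_SIG:
        raise ValueError("not a zip64 end of central directory record")
    # Assumption: the size field excludes the 12 leading bytes (signature + size).
    end = record_pos + 12 + size
    pos = record_pos + ZIP64_EOCD_FIXED
    while pos + 6 <= end:
        # Each block: 2-byte header ID, 4-byte data size, then the payload.
        header_id, data_size = struct.unpack_from("<HI", data, pos)
        pos += 6
        if pos + data_size > end:
            raise ValueError("extensible data block runs past the record")
        yield header_id, data[pos:pos + data_size]
        pos += data_size
```

Nothing in a payload is constrained, so any of these blocks may happen to contain the bytes `PK\x06\x06`, which is exactly what a blind signature scan would trip over.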
"the only way to look up the locator is by looking up the magic number for locator" is not true. If it exists, it should be immediately before the "End of central directory record". Reading back 20 bytes from there, then reading the next 4 bytes should yield the "zip64 end of central dir locator signature" - which can be used as a sanity check (rather than scanning for it).