xlsx compressed by rubyzip is not readable by Excel
I am writing code that can read and write Excel xlsx files. An xlsx file is simply a zip archive of several XML files, so to test whether I could write one, I used a gem called rubyzip to unzip an xlsx file and then immediately zip it back up into a new archive, without modifying the data. When I do this, however, I cannot open the new Excel file; Excel reports it as corrupted.
Alternatively, if I use Mac OS X's Archive Utility (the native application for handling zip files) to unzip and re-zip an Excel file, the data is not corrupted and I can open the resulting file in Excel.
I have found that it is not the 'unzip' functionality of rubyzip that "corrupts" the data, but the zip process. (In fact, when I use Archive Utility on the new zip file that rubyzip creates, the file is again readable by Excel.)
I'm wondering why this happens, and what solutions there could be to zip the contents programmatically in a way which is readable by Excel.
My code for zipping:
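require 'fileutils'
require 'zip/zip' # rubyzip releases that provide Zip::ZipFile

def compress(path)
  path.sub!(%r[/$],'') # strip a trailing slash, if any
  archive = File.join(path,File.basename(path))+'.zip'
  FileUtils.rm archive, :force=>true
  Zip::ZipFile.open(archive, 'w') do |zipfile|
    # Dir[...] also yields directories, so they are added as entries too
    Dir["#{path}/**/**"].reject{|f|f==archive}.each do |file|
      temp = file
      zipfile.add(file.sub(path+'/',''),file)
    end
  end
end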
Comments (1)
The OOXML format imposes a number of constraints on the use of Zip in order for packages to be conformant. For example, the only compression method permitted in the package is DEFLATE.
You might want to check the specification for OPC packages (which .xlsx files are) in Annex C of the standard available here (Zip), and then ensure that the rubyzip library is not doing anything that is not permitted (such as using the IMPLODE compression method).
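One way to act on this suggestion is to look at what compression method each entry in the generated archive actually records. The following is only a rough sketch, not part of rubyzip: it reads the central directory of a zip file directly, using the field offsets from the ZIP format description. The names `ZIP_METHODS`, `zip_entry_methods`, and `zip_entry_report` are made up for this illustration, and the sketch assumes a plain single-disk archive without ZIP64 records.

```ruby
# A few compression method codes from the ZIP format description.
ZIP_METHODS = { 0 => 'STORED', 6 => 'IMPLODE', 8 => 'DEFLATE' }.freeze

# Hypothetical helper: returns [[entry_name, method_code], ...] read from the
# central directory of a zip archive passed in as a String of raw bytes.
def zip_entry_methods(bytes)
  bytes = bytes.b
  # The end-of-central-directory record starts with the signature PK\x05\x06.
  eocd = bytes.rindex("PK\x05\x06".b)
  raise ArgumentError, 'end of central directory record not found' unless eocd

  # Total entry count (2 bytes), directory size (4), directory offset (4).
  count, _size, offset = bytes[eocd + 10, 10].unpack('vVV')
  entries = []
  count.times do
    unless bytes[offset, 4] == "PK\x01\x02".b
      raise ArgumentError, 'bad central directory header'
    end
    method = bytes[offset + 10, 2].unpack('v').first
    name_len, extra_len, comment_len = bytes[offset + 28, 6].unpack('vvv')
    entries << [bytes[offset + 46, name_len], method]
    # Each header is 46 fixed bytes followed by three variable-length fields.
    offset += 46 + name_len + extra_len + comment_len
  end
  entries
end

# Hypothetical helper: one human-readable line per entry, marking directory
# entries (names ending in "/") so they are easy to spot.
def zip_entry_report(bytes)
  zip_entry_methods(bytes).map do |name, method|
    label = ZIP_METHODS.fetch(method, "method #{method}")
    kind = name.end_with?('/') ? ' (directory entry)' : ''
    "#{name}: #{label}#{kind}"
  end
end
```

For example, `puts zip_entry_report(File.binread('rezipped.xlsx'))` (with `rezipped.xlsx` standing in for whatever file rubyzip produced) could be run on both the rubyzip output and the Archive Utility output, and the two listings compared against what the specification permits.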