Problem reading some files with pyExcelerator

Posted 2024-08-14 00:56:01

I have a problem reading some XLS files with pyExcelerator.

I wrote some Python scripts that use this library to parse XLS files and populate a database with the information.

The templates for the files these scripts parse may vary, and I sometimes reconfigure the scripts to handle them. With one of the templates I ran into a problem: pyExcelerator just raises an exception:

Traceback (most recent call last):
  File "/home/* * */parsexls.py", line 64, in handle_label
    parser.parse()
  File "/home/* * */parsers.py", line 335, in parse
    self.contents = pyExcelerator.parse_xls(self.file_record.file, self.encoding)
  File "/usr/local/lib/python2.6/dist-packages/pyExcelerator/ImportXLS.py", line 327, in parse_xls
    ole_streams = CompoundDoc.Reader(filename).STREAMS
  File "/usr/local/lib/python2.6/dist-packages/pyExcelerator/CompoundDoc.py", line 67, in __init__
    self.__build_short_sectors_data()
  File "/usr/local/lib/python2.6/dist-packages/pyExcelerator/CompoundDoc.py", line 256, in __build_short_sectors_data
    dentry_start_sid, stream_size) = self.dir_entry_list[0]
IndexError: list index out of range

Some of the problem XLS files contained empty sheets, and removing those sheets helped, but many of the files can't be handled even without empty sheets. There's nothing extraordinary in these files: they contain no formulas or pictures, just strings, numbers and dates.

As far as I can see, pyExcelerator has been abandoned by its author :(

Any suggestions on fixing this issue are much appreciated.

桃扇骨 2024-08-21 00:56:01

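I'm the author of xlrd. It reads XLS files and is not a fork of anything. I maintain a package called xlwt, which writes XLS files and is a fork of pyExcelerator. The parse_xls functionality in pyExcelerator has been deprecated and was even removed from xlwt. Use xlrd instead.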
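
For readers who want to try the switch to xlrd, here is a minimal sketch. It assumes xlrd is installed and Python 3 is used; `load_sheets` is an illustrative name and not part of xlrd, and the returned structure is only a guess at what a parser like the one in the question might need. It uses `xlrd.open_workbook`, `Book.sheets()`, `Sheet.nrows`, `Sheet.row_values()` and `xlrd.XLRDError`.

```python
def load_sheets(path):
    """Return {sheet name: list of rows} for an XLS file, read with xlrd."""
    # Imported inside the function so that a missing xlrd only fails here.
    import xlrd

    try:
        book = xlrd.open_workbook(path)
    except xlrd.XLRDError as exc:
        # xlrd raises XLRDError for files it cannot recognise as XLS.
        raise ValueError("cannot read %s: %s" % (path, exc))

    contents = {}
    for sheet in book.sheets():
        # row_values() returns the cell values of one row as a list;
        # an empty sheet has nrows == 0 and yields an empty list.
        contents[sheet.name] = [sheet.row_values(r) for r in range(sheet.nrows)]
    return contents
```

Calling `load_sheets("report.xls")` (a hypothetical file name) on a readable workbook would give one entry per sheet. On a file xlrd rejects, the error text carried in the `ValueError` is likely to be more informative than the `IndexError` above.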

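Given the traceback that you reproduced, it looks like the file may be corrupt. The failure happens long before any worksheet data is parsed. What software produced these files? Can you open them with Excel, OpenOffice.org's Calc or Gnumeric? xlrd may give you a more meaningful error message. You may like to send me (insert_punctuation('sjmachin', 'lexicon', 'net')) copies of failing files; please include some with and some without empty sheets. By the way, what are you using to delete empty sheets? What error message do you get from pyExcelerator when processing a file with empty sheets?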
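
Since the traceback fails inside `CompoundDoc.Reader`, before any sheet is read, one quick first check is whether a failing file even starts with the OLE2 compound document signature that XLS files from Excel normally carry. The following is a rough sketch using only the standard library; `looks_like_ole2` is a hypothetical helper name. A matching signature does not prove the file is intact, it only rules out the case where something else (HTML, CSV, a raw BIFF stream) was saved with an .xls extension.

```python
# The 8-byte magic number at the start of every OLE2 compound document.
OLE2_SIGNATURE = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"


def looks_like_ole2(path):
    """Return True if the file at `path` begins with the OLE2 signature."""
    with open(path, "rb") as handle:
        header = handle.read(len(OLE2_SIGNATURE))
    return header == OLE2_SIGNATURE
```

Running this over the files that fail and the files that parse would show whether the two groups differ at the container level.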
