pyExcelerator has problems reading some files

Posted on 2024-08-14 00:56:01

I've run into a problem using pyExcelerator to read some XLS files.

I wrote some Python scripts that use this library to parse XLS files and populate a database with the info.

The templates for the files these scripts parse may vary, and I sometimes reconfigure the scripts to handle them. With one of the templates I ran into a problem: pyExcelerator just raises an exception:

Traceback (most recent call last):
  File "/home/* * */parsexls.py", line 64, in handle_label
    parser.parse()
  File "/home/* * */parsers.py", line 335, in parse
    self.contents = pyExcelerator.parse_xls(self.file_record.file, self.encoding)
  File "/usr/local/lib/python2.6/dist-packages/pyExcelerator/ImportXLS.py", line 327, in parse_xls
    ole_streams = CompoundDoc.Reader(filename).STREAMS
  File "/usr/local/lib/python2.6/dist-packages/pyExcelerator/CompoundDoc.py", line 67, in __init__
    self.__build_short_sectors_data()
  File "/usr/local/lib/python2.6/dist-packages/pyExcelerator/CompoundDoc.py", line 256, in __build_short_sectors_data
    dentry_start_sid, stream_size) = self.dir_entry_list[0]
IndexError: list index out of range

Some of the problem XLS files contained empty sheets, and removing those sheets helped, but many of the files can't be handled even without empty sheets. There's nothing extraordinary in these files: they contain no formulas or pictures, just strings, numbers, and dates.

As far as I can see, pyExcelerator has been abandoned by its author :(

Any suggestions on fixing this issue are much appreciated.

Comments (2)

桃扇骨 2024-08-21 00:56:01

I'm the author of xlrd. It reads XLS files and is not a fork of anything. I maintain a package called xlwt which writes XLS files and is a fork of pyExcelerator. The parse_xls functionality in pyExcelerator was deprecated to the point of removal from xlwt. Use xlrd instead.

Given the traceback that you reproduced, it looks like the file may be corrupted. What it is doing there happens well before the sheet data is parsed. What software produces these files? Can you open them with Excel or OpenOffice.org's Calc or Gnumeric? xlrd may give you a more meaningful error message. You may like to send me (insert_punctuation('sjmachin', 'lexicon', 'net')) copies of your failing file(s); please include some with and some without empty sheets. By the way, what are you using to remove empty sheets? What error message do you get from pyExcelerator when processing files with empty sheets?
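
A quick first check on the "may be corrupted" question can be done before sending files anywhere. XLS workbooks are normally stored inside an OLE2 compound document, which begins with a fixed 8-byte signature. The sketch below only looks at those first bytes; the function name looks_like_ole2 is made up for illustration and is not part of any library. A file that some other program exported as HTML or CSV with an .xls extension would fail this check. A file that passes can still be damaged further in, since the traceback above fails while reading the directory entries, not the header.

```python
# First 8 bytes of every OLE2 compound document (the usual container for .xls).
OLE2_SIGNATURE = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"


def looks_like_ole2(path):
    """Return True if the file at `path` starts with the OLE2 signature.

    Hypothetical helper for illustration only. It does not validate the
    rest of the container, so True means "worth trying a real parser",
    not "the file is intact".
    """
    with open(path, "rb") as handle:
        return handle.read(len(OLE2_SIGNATURE)) == OLE2_SIGNATURE
```

Running this over the failing files and the working ones would show whether the failures line up with files that are not OLE2 containers at all.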

魄砕の薆 2024-08-21 00:56:01

You might wish to give xlrd a try... it started (I believe) as a fork of pyExcelerator, so switching to it should require few code changes, and it is actively maintained:

http://pypi.python.org/pypi/xlrd

Project website

General info, release notes and history from the documentation
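
For a rough idea of what reading a workbook with xlrd might look like, here is a minimal sketch. It assumes xlrd is installed; the function name read_xls_rows and the returned dictionary layout are choices made for this example, not the shape that pyExcelerator.parse_xls returns, so the calling code in the question would still need adapting. No exceptions are caught, so whatever error xlrd raises for a bad file reaches the caller unchanged.

```python
def read_xls_rows(path):
    """Return {sheet_name: [row_values, ...]} for every sheet in an .xls file.

    Hypothetical helper for illustration. An empty sheet simply maps to
    an empty list.
    """
    # Third-party package; imported here so the dependency is only
    # needed when the function is actually called.
    import xlrd

    book = xlrd.open_workbook(path)

    contents = {}
    for sheet in book.sheets():
        # row_values(i) returns the cell values of row i as a list.
        contents[sheet.name] = [sheet.row_values(i) for i in range(sheet.nrows)]
    return contents
```

Keeping the rows grouped by sheet name makes it easy to see which sheets came back empty, which seems relevant given the empty-sheet behaviour described in the question.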
