lxml etree parse function raises IOError

Posted on 2024-11-15 20:08:45

I have logic like this:

import os
import lxml.etree

for root, dirs, files in os.walk(os.getcwd()):
    if "info.xml" in files:
        root = lxml.etree.parse("%s/info.xml" % root)
        tag = root.xpath("/info/tagname")[0].text

When parsing an info.xml that sits very deep under the current path, I get this error message:

Traceback (most recent call last):
  File "/home/work/mergefile.py", line 365, in <module>
  File "/home/work/mergefile.py", line 344, in merge_ejb_files
  File "/home/work/mergefile.py", line 63, in __init__
  File "/home/work/mergefile.py", line 78, in _parse_info2doc
  File "lxml.etree.pyx", line 2698, in lxml.etree.parse (src/lxml/lxml.etree.c:49590)
  File "parser.pxi", line 1491, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:71205)
  File "parser.pxi", line 1520, in lxml.etree._parseDocumentFromURL (src/lxml/lxml.etree.c:71488)
  File "parser.pxi", line 1420, in lxml.etree._parseDocFromFile (src/lxml/lxml.etree.c:70583)
  File "parser.pxi", line 975, in lxml.etree._BaseParser._parseDocFromFile (src/lxml/lxml.etree.c:67736)
  File "parser.pxi", line 539, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:63820)
  File "parser.pxi", line 625, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:64741)
  File "parser.pxi", line 563, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:64056)
IOError: Error reading file '/home/work/ci/case/dc_daily/dc/213577/223922/223958/792536/info.xml': failed to load external entity "/home/work/ci/case/dc_daily/dc/213577/223922/223958/792536/info.xml"

But the file "/home/work/ci/case/dc_daily/dc/213577/223922/223958/792536/info.xml" exists, and I can parse it with lxml in an IPython session.

Do you know what the problem is? If you do, please help me!
Thank you!

Comments (1)

[旋木] 2024-11-22 20:08:45

Here's my solution, as per my comment above. I open each file for reading and close it right after parsing, so I don't hit the limit of 1024 open files.

import os
import lxml.etree as etree

for root, dirs, files in os.walk(os.getcwd()):
    if "info.xml" in files:
        # the with block closes the file as soon as parsing is done
        with open('%s/info.xml' % root) as processfile:  # use 'rb' if necessary
            xml = etree.parse(processfile)
        # query the parsed tree, not the directory path held in root
        tag = xml.xpath("/info/tagname")[0].text
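
The 1024 figure is a common default soft limit on open file descriptors per process on Linux, but the value on a given machine may differ. To check whether a script is really running into it, something like the following may help. It is only a sketch, assuming a Unix-like system (the standard `resource` module is not available on Windows); the helper name `fd_usage` is made up for illustration, and the `/proc/self/fd` lookup only works on Linux.

```python
import os
import resource


def fd_usage():
    """Return (open_fds, soft_limit) for the current process.

    open_fds is None when /proc/self/fd is not available (non-Linux systems).
    """
    # Soft limit on the number of open file descriptors for this process
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        # On Linux, each entry in /proc/self/fd is one open descriptor
        open_fds = len(os.listdir("/proc/self/fd"))
    except OSError:
        open_fds = None
    return open_fds, soft


open_fds, soft = fd_usage()
print("open descriptors: %s, soft limit: %s" % (open_fds, soft))
```

Calling `fd_usage()` inside the `os.walk` loop and watching whether the first number keeps growing would show whether something in the script is leaving files open.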