Does lxml.etree.iterparse close the input file handler?

Posted on 2024-11-26 07:29:36

filterous is using iterparse to parse a simple XML StringIO object in a unit test (https://github.com/l0b0/filterous/blob/cc172cc54bd068a5f1de231e3567f4fe0bb5d2d1/tests/tests.py#L139). However, when trying to access the StringIO object afterwards, Python exits with a "ValueError: I/O operation on closed file" message. According to the iterparse documentation, "Starting with lxml 2.3, the .close() method will also be called in the error case," but I get no error message or exception from iterparse. My IO-foo is obviously not up to speed, so does anyone have suggestions?

The command and (hopefully) relevant code:

$ python2.6 setup.py test

setup.py:

from setuptools import setup
from filterous import filterous as package

setup(
    ...
    test_suite = 'tests.tests',

tests/tests.py:

from cStringIO import StringIO
import unittest

from filterous import filterous

XML = '''<posts tag="" total="3" ...'''

class TestSearch(unittest.TestCase):
    def setUp(self):
        self.xml = StringIO(XML)
        self.result = StringIO()
    ...
    def test_empty_tag_not(self):
        """Empty tag; should get N results."""
        filterous.search(
            self.xml,
            self.result,
            {'ntag': [u'']},
            ['href'],
            False)
        self.assertEqual(
            len(self.result.getvalue().splitlines()),
            self.xml.getvalue().count('<post '))

filterous/filterous.py:

from lxml import etree
...
def search(file_pointer, out, terms, includes, human_readable = True):
    ...
    context = etree.iterparse(file_pointer, tag='posts')

Traceback:

ERROR: Empty tag; should get N results.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/victor/dev/filterous/tests/tests.py", line 149, in test_empty_tag_not
    self.xml.getvalue().count('<post '))
ValueError: I/O operation on closed file

PS: The tests all ran fine on 2010-07-27.

Comments (2)

怪我入戏太深 2024-12-03 07:29:36

It seems to work fine with StringIO; try using that instead of cStringIO. No idea why it's getting closed.
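Whichever StringIO implementation is used, one defensive option is to hand the parser its own throwaway stream and to check the `.closed` attribute afterwards. The following is only a minimal sketch under stated assumptions: `parse_and_close` is a hypothetical stand-in that mimics the reported behaviour (it is not lxml's iterparse), `SAMPLE_XML` is made-up data, and it uses Python 3's `io.StringIO` rather than the Python 2 modules from the question.

```python
import io


def parse_and_close(stream):
    """Hypothetical stand-in for a parser that closes its input when done."""
    data = stream.read()
    stream.close()
    return data.count("<post ")


# Made-up sample document; the XML in the question is elided.
SAMPLE_XML = '<posts tag="" total="2"><post href="a"/><post href="b"/></posts>'

original = io.StringIO(SAMPLE_XML)

# Give the parser its own stream built from the same text.
throwaway = io.StringIO(original.getvalue())
post_count = parse_and_close(throwaway)

# Only the throwaway stream was closed; the original is still usable.
throwaway_closed = throwaway.closed
original_still_open = not original.closed
original_text = original.getvalue()
```

The cost is one extra copy of the text, which is usually acceptable in a unit test but may not be for large inputs.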

月下凄凉 2024-12-03 07:29:36

Docs-fu is the problem. What you quoted, "Starting with lxml 2.3, the .close() method will also be called in the error case," has nothing to do with iterparse. It appears on your linked page before the section on iterparse, as part of the docs for the target parser interface. It refers to the close() method of the target (output!) object, not to your StringIO. In any case, you also seem to have ignored that little word "also". Before 2.3, lxml closed the target object only if the parse was successful. Now it also closes it upon error.
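To make the distinction concrete, here is a small sketch of a parser target whose close() is called by the parser. It uses the standard library's `xml.etree.ElementTree.XMLParser` rather than lxml, on the assumption that the target interface has the same shape (start, end, data, close); the lxml 2.3 error-case behaviour described above is not demonstrated. `PostCounter` and `SAMPLE_XML` are hypothetical names and data.

```python
import io
import xml.etree.ElementTree as ET


class PostCounter:
    """Parser target: receives parse events and returns a result from close()."""

    def __init__(self):
        self.count = 0
        self.closed = False

    def start(self, tag, attrib):
        # Called for every opening tag.
        if tag == "post":
            self.count += 1

    def end(self, tag):
        pass

    def data(self, text):
        pass

    def close(self):
        # Called on the *target*, not on the input stream.
        self.closed = True
        return self.count


# Made-up sample document; the XML in the question is elided.
SAMPLE_XML = '<posts tag="" total="2"><post href="a"/><post href="b"/></posts>'

source = io.StringIO(SAMPLE_XML)
target = PostCounter()
parser = ET.XMLParser(target=target)
parser.feed(source.read())
result = parser.close()  # returns whatever target.close() returned
```

After this runs, `target.closed` is set while `source.closed` is still false: closing the target is a separate matter from closing the input stream.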

Why do you want to "access" the StringIO object after parsing has finished?

Update: By trying to access the database afterwards, do you mean all those self.xml.getvalue() calls in your tests? [Show the ferschlugginer traceback in your question so we don't need to guess!] If that's causing the problem (it does count as an I/O operation), forget getvalue(): if it were to work, wouldn't it just return the (unconventionally named) (invariant) XML?
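The suggestion to drop getvalue() could look roughly like this: the test compares against the module-level XML constant instead of reading back from the input stream. This is a sketch only. `fake_search` is a hypothetical stand-in for filterous.search that closes its input to mimic the reported behaviour, `SAMPLE_XML` is made-up data, and the code targets Python 3's `io.StringIO`.

```python
import io
import unittest

# Made-up sample document; the XML in the question is elided.
SAMPLE_XML = '<posts tag="" total="2"><post href="a"/><post href="b"/></posts>'


def fake_search(file_pointer, out):
    """Hypothetical stand-in: writes one line per post, then closes the input."""
    text = file_pointer.read()
    file_pointer.close()
    for _ in range(text.count("<post ")):
        out.write("match\n")


class TestSearch(unittest.TestCase):
    def setUp(self):
        self.xml = io.StringIO(SAMPLE_XML)
        self.result = io.StringIO()

    def test_counts_posts(self):
        fake_search(self.xml, self.result)
        # Compare against the constant; self.xml may already be closed.
        self.assertEqual(
            len(self.result.getvalue().splitlines()),
            SAMPLE_XML.count("<post "))
```

Because the constant never changes, the assertion no longer depends on whether the parser leaves the input stream open.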
