lxml.etree.iterparse closes input file handler?
filterous is using iterparse to parse a simple XML StringIO object in a unit test (https://github.com/l0b0/filterous/blob/cc172cc54bd068a5f1de231e3567f4fe0bb5d2d1/tests/tests.py#L139). However, when trying to access the StringIO object afterwards, Python exits with a "ValueError: I/O operation on closed file" message. According to the iterparse documentation, "Starting with lxml 2.3, the .close() method will also be called in the error case," but I get no error message or Exception from iterparse. My IO-foo is obviously not up to speed, so does anyone have suggestions?
The command and (hopefully) relevant code:
$ python2.6 setup.py test
setup.py:
from setuptools import setup
from filterous import filterous as package
setup(
    ...
    test_suite = 'tests.tests',
tests/tests.py:
from cStringIO import StringIO
import unittest
from filterous import filterous
XML = '''<posts tag="" total="3" ...'''
class TestSearch(unittest.TestCase):
    def setUp(self):
        self.xml = StringIO(XML)
        self.result = StringIO()
    ...
    def test_empty_tag_not(self):
        """Empty tag; should get N results."""
        filterous.search(
            self.xml,
            self.result,
            {'ntag': [u'']},
            ['href'],
            False)
        self.assertEqual(
            len(self.result.getvalue().splitlines()),
            self.xml.getvalue().count('<post '))
filterous/filterous.py:
from lxml import etree
...
def search(file_pointer, out, terms, includes, human_readable = True):
    ...
    context = etree.iterparse(file_pointer, tag='posts')
Traceback:
ERROR: Empty tag; should get N results.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/victor/dev/filterous/tests/tests.py", line 149, in test_empty_tag_not
    self.xml.getvalue().count('<post '))
ValueError: I/O operation on closed file
PS: The tests all ran fine on 2010-07-27.
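The failure itself can be reproduced in isolation. This is a minimal sketch, assuming Python 3's io.StringIO as a stand-in for cStringIO (which exists only on Python 2) and an explicit close() call as a stand-in for whatever closed the stream during parsing. It does not show who closes the stream, only that any later getvalue() call fails once it is closed.

```python
import io

# Hypothetical sample document; the real XML constant is elided in the question.
stream = io.StringIO("<posts><post /></posts>")

# Stand-in for whatever closed the stream during parsing (assumption).
stream.close()

try:
    stream.getvalue()
    error_message = None
except ValueError as exc:
    # getvalue() counts as an I/O operation, so it fails on a closed stream.
    error_message = str(exc)

print(error_message)
```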
Seems to work fine with StringIO; try using that instead of cStringIO. No idea why it's getting closed.
Docs-fu is the problem. What you quoted, "Starting with lxml 2.3, the .close() method will also be called in the error case," has nothing to do with iterparse. It appears on your linked page before the section on iterparse. It is part of the docs for the target parser interface. It refers to the close() method of the target (output!) object and has nothing to do with your StringIO. In any case, you also seem to have ignored that little word "also". Before 2.3, lxml closed the target object only if the parse was successful. Now it also closes it upon error.
Why do you want to "access" the StringIO object after parsing has finished?
Update: By trying to access the database afterwards, do you mean all those self.xml.getvalue() calls in your tests? [Show the ferschlugginer traceback in your question so we don't need to guess!] If that's causing the problem (it does count as an IO operation), forget getvalue() ... if it were to work, wouldn't it return the (unconventionally named) (invariant) XML?
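One way to act on that suggestion is to compare against the original XML string instead of calling getvalue() on the stream that was handed to the parser. The sketch below makes several assumptions: it uses the standard library's xml.etree.ElementTree.iterparse as a stand-in for lxml.etree.iterparse, a made-up sample document in place of the elided XML constant, and a hypothetical count_posts helper in place of filterous.search.

```python
import io
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree (assumption)

# Hypothetical sample document; the real XML constant is elided in the question.
XML = '<posts tag="" total="2"><post href="a" /><post href="b" /></posts>'


def count_posts(stream):
    """Hypothetical helper: count <post> elements while streaming the document."""
    count = 0
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "post":
            count += 1
    return count


stream = io.BytesIO(XML.encode("utf-8"))
parsed = count_posts(stream)

# Compare against the unchanged source string, not stream.getvalue(),
# so the check does not depend on whether the parser closed the stream.
expected = XML.count("<post ")

print(parsed, expected)
```

Because the expected count comes from the XML string itself, the assertion in the test would no longer touch the stream after parsing.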