Why does the xml.sax parser in Jython 2.5.2 turn two-character attributes into tuples?
Whenever xml.sax under Jython 2.5.2 encounters a two-character attribute in my XML stream, it converts the attribute name to a tuple. No amount of coercion of that name lets me extract the attribute's value: I tried passing the tuple as-is, and I tried converting it to a string first. Both cases result in:
Traceback (most recent call last):
  File "test.py", line 18, in startElement
    print '%s = %s' % (k, attrs.getValue(k))
  File "/usr/local/Cellar/jython/2.5.2/libexec/Lib/xml/sax/drivers2/drv_javasax.py", line 266, in getValue
    value = self._attrs.getValue(_makeJavaNsTuple(name))
TypeError: getValue(): 1st arg can't be coerced to String, int
I've got some sample code you can run that shows the problem:
import xml
from xml import sax
from xml.sax import handler
import traceback

class MyXMLHandler(handler.ContentHandler):
    def __init__(self):
        pass

    def startElement(self, name, attrs):
        for k in attrs.keys():
            print 'type(k) = %s' % type(k)
            if isinstance(k, (list, tuple)):
                k = ''.join(k)
            print 'type(k) = %s' % type(k)
            print 'k = %s' % k
            try:
                print '%s = %s' % (k, attrs.getValue(k))
            except Exception, e:
                print '\nError:'
                traceback.print_exc()
                print ''

if __name__ == '__main__':
    s = '<TAG A="0" AB="0" ABC="0"/>'
    print '%s' % s
    xml.sax.parseString(s, MyXMLHandler())
    exit(0)
When run, the AB attribute is returned as a tuple, but the A and ABC attributes are unicode strings and function properly with the get() method on the Attribute object. Under Jython 2.5.2 this outputs, for me:
> jython test.py
<TAG A="0" AB="0" ABC="0"/>
type(k) = <type 'unicode'>
type(k) = <type 'unicode'>
k = A
A = 0
type(k) = <type 'tuple'>
type(k) = <type 'unicode'>
k = AB
Error:
Traceback (most recent call last):
  File "test.py", line 18, in startElement
    print '%s = %s' % (k, attrs.getValue(k))
  File "/usr/local/Cellar/jython/2.5.2/libexec/Lib/xml/sax/drivers2/drv_javasax.py", line 266, in getValue
    value = self._attrs.getValue(_makeJavaNsTuple(name))
TypeError: getValue(): 1st arg can't be coerced to String, int
type(k) = <type 'unicode'>
type(k) = <type 'unicode'>
k = ABC
ABC = 0
This code functions correctly under Python 2.7.2 on OS X and Python 2.4.3 on CentOS 5.6. I dug around Jython bugs but couldn't find anything similar to this issue.
Is it a known Jython xml.sax handling problem? Or have I messed up something in my Handler that's 2.5.2 incompatible?
Edit: this appears to be a Jython 2.5.2 bug. I found a reference to it: http://sourceforge.net/mailarchive/message.php?msg_id=27783080 -- suggestions for a workaround welcome.
Comments (1)
So, this is a reported bug in Jython. It took some digging, but I found it in their bug archive:
http://bugs.jython.org/issue1768
The second comment on the bug provides a workaround: call getValue() on the underlying _attrs object to retrieve values from the attributes list. My rewritten code works if I change the attrs.getValue(k) call in startElement to attrs._attrs.getValue(k).
A more flexible solution that works in both Python and Jython is to build a helper.
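The helper code from the original answer is not preserved here, so the following is only a minimal sketch of what such a helper might look like. The names normalize_name, get_attr_value and CollectingHandler are hypothetical. It assumes, based on the traceback in the question, that the Jython attributes wrapper keeps the underlying Java Attributes object in _attrs. The Jython fallback branch has not been verified against a real Jython 2.5.2 install.

```python
import xml.sax
from xml.sax import handler


def normalize_name(name):
    """Hypothetical helper: turn a tuple/list attribute name back into a string."""
    # Jython 2.5.2 may return a two-character name such as AB as ('A', 'B').
    if isinstance(name, (list, tuple)):
        return ''.join(name)
    return name


def get_attr_value(attrs, name):
    """Hypothetical helper: fetch an attribute value on CPython or Jython."""
    name = normalize_name(name)
    try:
        # Normal path: works on CPython, and on Jython for unaffected names.
        return attrs.getValue(name)
    except TypeError:
        # Assumption: on Jython, attrs._attrs is the Java Attributes object,
        # whose getValue() accepts the qualified name as a plain string.
        # On CPython, _attrs is a plain dict with no getValue(), so re-raise.
        java_attrs = getattr(attrs, '_attrs', None)
        if java_attrs is None or not hasattr(java_attrs, 'getValue'):
            raise
        return java_attrs.getValue(name)


class CollectingHandler(handler.ContentHandler):
    """Stores every attribute value it sees, keyed by normalized name."""

    def __init__(self):
        handler.ContentHandler.__init__(self)
        self.values = {}

    def startElement(self, name, attrs):
        for key in attrs.keys():
            self.values[normalize_name(key)] = get_attr_value(attrs, key)
```

Passing a CollectingHandler instance to xml.sax.parseString with the sample document from the question should leave entries for A, AB and ABC in its values dictionary. Catching only TypeError keeps the fallback narrow, so a genuinely missing attribute still raises as usual.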