xml.sax (Python) is not working for me
I need to validate XML, but the XML comes in a variable (str), not from a file.
So I figured this would be easy to do with xml.sax, but I can't get it to work. Parsing a file works fine, but I get a strange error when parsing a string.
Here's my test code:
from xml.sax import make_parser, parseString
import os

filename = os.path.join('.', 'data', 'data.xml')
xmlstr = "<note>\n<to>Mary</to>\n<from>Jane</from>\n<heading>Reminder</heading>\n<body>Go to the zoo</body>\n</note>"

def parsefile(file):
    parser = make_parser()
    parser.parse(file)

def parsestr(xmlstr):
    parser = make_parser()
    parseString(xmlstr.encode('utf-8'), parser)

try:
    parsefile(filename)
    print("%s is well-formed" % filename)
except Exception as e:
    print("%s is NOT well-formed! %s" % (filename, e))

try:
    parsestr(xmlstr)
    print("%s is well-formed" % ('xml string'))
except Exception as e:
    print("%s is NOT well-formed! %s" % ('xml string', e))
When executing the script, I get this:
./data/data.xml is well-formed
xml string is NOT well-formed! 'ExpatParser' object has no attribute 'processingInstruction'
What am I missing?
Comments (1)
The second argument to parseString is supposed to be a ContentHandler, not a parser. Because you're passing in the wrong type of object, it doesn't have the expected methods.

You're expected to subclass ContentHandler and then handle the SAX events as necessary. In this case, you're not actually trying to extract any information from the document, so you could use the base ContentHandler class.
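A minimal sketch of that approach, assuming the same xmlstr as in the question. The helper name is_well_formed and the ElementNameCollector subclass are hypothetical, added only for illustration; parseString, ContentHandler, and SAXParseException come from the standard library.

```python
from xml.sax import parseString, SAXParseException
from xml.sax.handler import ContentHandler

xmlstr = (
    "<note>\n<to>Mary</to>\n<from>Jane</from>\n"
    "<heading>Reminder</heading>\n<body>Go to the zoo</body>\n</note>"
)


def is_well_formed(xml_text):
    """Return True if xml_text parses without a SAX parse error (hypothetical helper)."""
    try:
        # The second argument is a ContentHandler instance, not a parser.
        # The base class ignores every event, which is enough for a check.
        parseString(xml_text.encode("utf-8"), ContentHandler())
    except SAXParseException:
        return False
    return True


class ElementNameCollector(ContentHandler):
    """Hypothetical subclass that records element names as they open."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        # Called by the parser once for every opening tag.
        self.names.append(name)


print(is_well_formed(xmlstr))                   # True
print(is_well_formed("<note><to>Mary</note>"))  # False (mismatched tag)

collector = ElementNameCollector()
parseString(xmlstr.encode("utf-8"), collector)
print(collector.names)  # ['note', 'to', 'from', 'heading', 'body']
```

The base ContentHandler is enough when you only care whether parsing succeeds; a subclass is only needed once you want to pull data out of the document. Note that this checks well-formedness only, not validity against a DTD or schema.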