Parsing XML entities with Python xml.sax
I am parsing XML in Python with xml.sax, but my handler never sees the entities. Why are skippedEntity() and resolveEntity() not called in the following code?
    import os
    import cStringIO
    import xml.sax
    from xml.sax.handler import ContentHandler,EntityResolver,DTDHandler

    #Class to parse and run test XML files
    class TestHandler(ContentHandler,EntityResolver,DTDHandler):

        #SAX handler - Entity resolver
        def resolveEntity(self,publicID,systemID):
            print "TestHandler.resolveEntity: %s %s" % (publicID,systemID)

        def skippedEntity(self, name):
            print "TestHandler.skippedEntity: %s" % (name)

        def unparsedEntityDecl(self,publicID,systemID,ndata):
            print "TestHandler.unparsedEntityDecl: %s %s" % (publicID,systemID)

        def startElement(self,name,attrs):
            # name = string.lower(name)
            summary = '' + attrs.get('summary','')
            arg = '' + attrs.get('arg','')
            print 'TestHandler.startElement(), %s : %s (%s)' % (name,summary,arg)

    def run(xml_string):
        try:
            parser = xml.sax.make_parser()
            stream = cStringIO.StringIO(xml_string)
            curHandler = TestHandler()
            parser.setContentHandler(curHandler)
            parser.setDTDHandler( curHandler )
            parser.setEntityResolver( curHandler )
            parser.parse(stream)
            stream.close()
        except (xml.sax.SAXParseException), e:
            print "*** PARSER error: %s" % e;

    def main():
        try:
            XML = "<!DOCTYPE page[ <!ENTITY num 'foo'> ]><test summary='step: &num;'>Entity: &not;</test>"
            run(XML)
        except Exception, e:
            print 'FATAL ERROR: %s' % (str(e))

    if __name__== '__main__':
        main()
When run, all I see is:
TestHandler.startElement(), step: foo ()
*** PARSER error: <unknown>:1:36: undefined entity
Why don't I see the resolveEntity print for &num; or the skippedEntity print for &not;?
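For readers on Python 3, the behaviour described above can be reproduced with a minimal sketch along these lines. It is a port rather than the original program: the `Recorder` class and the `parse_events` helper are names introduced here for illustration, and callbacks are collected in a list instead of being printed.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver

# Same document as in the question: internal subset only.
XML = (b"<!DOCTYPE page[ <!ENTITY num 'foo'> ]>"
       b"<test summary='step: &num;'>Entity: &not;</test>")


class Recorder(ContentHandler, EntityResolver):
    """Hypothetical handler: records callbacks instead of printing them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def resolveEntity(self, publicId, systemId):
        self.events.append(("resolveEntity", publicId, systemId))
        return systemId

    def skippedEntity(self, name):
        self.events.append(("skippedEntity", name))

    def startElement(self, name, attrs):
        self.events.append(("startElement", name, attrs.get("summary", "")))


def parse_events(data):
    """Parse bytes; return (recorded events, parser error message or None)."""
    handler = Recorder()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    parser.setEntityResolver(handler)
    try:
        parser.parse(io.BytesIO(data))
    except xml.sax.SAXParseException as exc:
        return handler.events, exc.getMessage()
    return handler.events, None


events, error = parse_events(XML)
print(events)
print(error)
```

With only an internal subset, the declared entity is expanded by the parser itself and the undeclared one is reported as a parse error, so neither resolveEntity nor skippedEntity should appear in `events`.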
Comments (2)
I think resolveEntity and skippedEntity are only called for external DTDs. I got this to work by modifying the XML.
The external.dtd contains two simple entity declarations.
Also, I got rid of resolveEntity.
This outputs:
Hope this helps.
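A rough Python 3 sketch of the external-DTD approach follows. The contents of external.dtd are not shown above, so the single declaration in `EXTERNAL_DTD` is an assumption, and the DTD is served from memory by the hypothetical `DtdRecorder.resolveEntity` instead of being read from disk. Unlike the answer, the sketch keeps resolveEntity so that the call is visible. Python 3.7.1 and later leave external general entities disabled by default, so `feature_external_ges` is switched on explicitly.

```python
import io
import xml.sax
from xml.sax.handler import (ContentHandler, EntityResolver,
                             feature_external_ges)
from xml.sax.xmlreader import InputSource

# Assumed DTD content; the real external.dtd is not shown in the answer.
EXTERNAL_DTD = b"<!ENTITY num 'foo'>"

XML = (b"<!DOCTYPE test SYSTEM 'external.dtd'>"
       b"<test summary='step: &num;'>Entity: &not;</test>")


class DtdRecorder(ContentHandler, EntityResolver):
    """Hypothetical handler that serves the DTD from memory."""

    def __init__(self):
        super().__init__()
        self.events = []

    def resolveEntity(self, publicId, systemId):
        self.events.append(("resolveEntity", publicId, systemId))
        source = InputSource(systemId)
        source.setByteStream(io.BytesIO(EXTERNAL_DTD))
        return source

    def skippedEntity(self, name):
        self.events.append(("skippedEntity", name))

    def startElement(self, name, attrs):
        self.events.append(("startElement", name, attrs.get("summary", "")))


def parse_with_external_dtd(data):
    """Parse bytes with external entities enabled; return recorded events."""
    handler = DtdRecorder()
    parser = xml.sax.make_parser()
    # Off by default since Python 3.7.1; needed for the DTD to be fetched.
    parser.setFeature(feature_external_ges, True)
    parser.setContentHandler(handler)
    parser.setEntityResolver(handler)
    parser.parse(io.BytesIO(data))
    return handler.events


events = parse_with_external_dtd(XML)
print(events)
```

Enabling external entities has security implications for untrusted input, which is why an in-memory resolver is used here instead of letting the parser open arbitrary system identifiers.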
Here is a modified version of your program that I hope makes sense. It demonstrates a case where all TestHandler methods are called.

test.dtd contains:

Output:

Addition

As far as I can tell, skippedEntity is called only when an external DTD is used (at least I can't come up with a counterexample; it would be nice if the documentation were a little clearer).

Adam said in his answer that resolveEntity is called only for external DTDs. But that is not quite true. resolveEntity is also called when processing a reference to an external entity that is declared in an internal or external DTD subset. For example, consider an external entity whose system identifier is bar.txt, where the content of bar.txt could be, say, FOO. In this case it is not possible to refer to the entity in an attribute value.
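The point about external entities declared in an internal subset can be illustrated with a short Python 3 sketch. The document string, the `BarRecorder` class, and the `parse_bar` helper are assumptions made for illustration; the FOO content is supplied from memory in place of a real bar.txt, and `feature_external_ges` is enabled explicitly because Python 3.7.1 and later turn it off by default.

```python
import io
import xml.sax
from xml.sax.handler import (ContentHandler, EntityResolver,
                             feature_external_ges)
from xml.sax.xmlreader import InputSource

# Assumed example document: an external entity declared in the internal subset.
XML = (b'<!DOCTYPE test [<!ENTITY bar SYSTEM "bar.txt">]>'
       b"<test>Entity: &bar;</test>")


class BarRecorder(ContentHandler, EntityResolver):
    """Hypothetical handler that records resolver calls and text."""

    def __init__(self):
        super().__init__()
        self.resolved = []
        self.text = []

    def resolveEntity(self, publicId, systemId):
        self.resolved.append((publicId, systemId))
        # Stand-in for the contents of bar.txt.
        source = InputSource(systemId)
        source.setByteStream(io.BytesIO(b"FOO"))
        return source

    def characters(self, content):
        self.text.append(content)


def parse_bar(data):
    """Parse bytes; return (resolveEntity calls, concatenated text)."""
    handler = BarRecorder()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_external_ges, True)
    parser.setContentHandler(handler)
    parser.setEntityResolver(handler)
    parser.parse(io.BytesIO(data))
    return handler.resolved, "".join(handler.text)


resolved, text = parse_bar(XML)
print(resolved)
print(text)
```

Here no external DTD is involved, yet resolveEntity should still be recorded once, with bar.txt as the system identifier, when the reference in the element content is processed.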