How do I validate an XML file against an XSD schema using the Amara library in Python?
High bounty for the following Q:
Hello,
Here is what I tried on Ubuntu 9.10 using Python 2.6, Amara2
(by the way, test.xsd was created using xml2xsd tool):
g@spot:~$ cat test.xml; echo =====o=====; cat test.xsd; echo =====o=====; cat test.py; echo =====o=====; ./test.py; echo =====o=====
<?xml version="1.0" encoding="utf-8"?>
<test>abcde</test>
=====o=====
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified">
<xs:element name="test" type="xs:NCName"/>
</xs:schema>
=====o=====
#!/usr/bin/python2.6
# I wish to validate an xml file against an external XSD schema.
from amara import bindery, parse
source = 'test.xml'
schema = 'test.xsd'
#help(bindery.parse)
#doc = bindery.parse(source, uri=schema, validate=True) # These 2 seem to fail in the same way.
doc = parse(source, uri=schema, validate=True) # So, what is the difference anyway?
#
=====o=====
Traceback (most recent call last):
  File "./test.py", line 14, in <module>
    doc = parse(source, uri=schema, validate=True)
  File "/usr/local/lib/python2.6/dist-packages/Amara-2.0a4-py2.6-linux-x86_64.egg/amara/tree.py", line 50, in parse
    return _parse(inputsource(obj, uri), flags, entity_factory=entity_factory)
amara.ReaderError: In file:///home/g/test.xml, line 2, column 0: Missing document type declaration
g@spot:~$
=====o=====
So, why am I seeing this error? Is this functionality not supported?
How can I validate an XML file against an XSD while having the
flexibility to point to any XSD file?
Thanks, and let me know if you have questions.
Comments (2)
If you're open to using another library besides amara, try lxml. It supports what you're trying to do pretty easily:
I recommend using the noNamespaceSchemaLocation attribute to bind the XML file to the XSD schema. The root element of your XML file test.xml then carries the attribute xsi:noNamespaceSchemaLocation="test.xsd", where the file test.xsd should be placed in the same directory as test.xml. Referencing the XML schema from the XML file is a general technique, so it should work in Python too.
The advantage is that you don't need to know the schema file for every XML file in advance. The schema location travels with the XML file, so it can be looked up when the file is parsed (etree.parse).