Python: examining an XSD XML schema
I would like to examine an XSD schema in Python. Currently I'm using lxml, which does its job very well when it only has to validate a document against the schema. But I also want to know what is inside the schema and access its elements the same way lxml lets me access any other XML tree.
The schema:
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:include schemaLocation="worker_remote_base.xsd"/>
<xsd:include schemaLocation="transactions_worker_responses.xsd"/>
<xsd:include schemaLocation="transactions_worker_requests.xsd"/>
</xsd:schema>
The lxml code to load the schema is (simplified):
from lxml import etree

# Read the schema as bytes and close the file when done.
with open(self._xsd_file, 'rb') as xsd_file_handle:
    xsd_text = xsd_file_handle.read()
schema_document = etree.fromstring(xsd_text, base_url=xmlpath)
xmlschema = etree.XMLSchema(schema_document)
I'm then able to use schema_document (which is an etree._Element) to go through the schema as an XML document. But since etree.fromstring (at least it seems that way) treats its input as a plain XML document, the xsd:include elements are not processed.
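To illustrate what "going through the schema as an XML document" looks like, here is a minimal sketch. It uses the standard library's xml.etree.ElementTree rather than lxml so that it runs without extra dependencies; the namespace-prefixed findall calls work the same way on an lxml tree. The sample schema, the element name "request", the type name "responseType", and the helper summarize_schema are made up for the example and are not part of the original schema.

```python
import xml.etree.ElementTree as ET

# Prefix map for namespace-qualified lookups.
XSD_NS = {"xsd": "http://www.w3.org/2001/XMLSchema"}

# Hypothetical schema: one include plus two top-level declarations.
SAMPLE_SCHEMA = """<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:include schemaLocation="worker_remote_base.xsd"/>
  <xsd:element name="request" type="xsd:string"/>
  <xsd:complexType name="responseType"/>
</xsd:schema>"""


def summarize_schema(root):
    """List the top-level includes, elements and complex types of a schema root."""
    return {
        "includes": [e.get("schemaLocation") for e in root.findall("xsd:include", XSD_NS)],
        "elements": [e.get("name") for e in root.findall("xsd:element", XSD_NS)],
        "types": [e.get("name") for e in root.findall("xsd:complexType", XSD_NS)],
    }


summary = summarize_schema(ET.fromstring(SAMPLE_SCHEMA))
print(summary)
```

The include shows up only as an ordinary element with a schemaLocation attribute; the parser does not follow it, which is the behaviour described above.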
The problem is currently solved by parsing the first schema document, then loading the included schemas and inserting their contents into the main document by hand:
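import os
from lxml import etree

BASE_URL = "/xml/"
# xsd_text holds the bytes of the main schema file, read as shown earlier.
schema_document = etree.fromstring(xsd_text, base_url=BASE_URL)
schemas = []
# Iterate over a copy, because include elements are removed inside the loop.
for schema_child in list(schema_document):
    # Comments and processing instructions have a non-string tag; skip them.
    if not isinstance(schema_child.tag, str) or not schema_child.tag.endswith("include"):
        continue
    path = os.path.join(BASE_URL, schema_child.get("schemaLocation"))
    try:
        # The with block closes the file even if parsing fails.
        with open(path, "rb") as h:
            schemas.append(etree.fromstring(h.read(), base_url=BASE_URL))
    except Exception as ex:
        print("failed to load schema: %s" % ex)
    # remove the <xsd:include ...> element
    schema_document.remove(schema_child)
for s in schemas:
    # move everything inside the included <schema> into the main document
    for s_child in list(s):
        schema_document.append(s_child)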
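The loop above handles only one level of includes. A recursive variant could look like the sketch below. It again uses xml.etree.ElementTree from the standard library instead of lxml so it runs as-is; the function name load_flattened_schema is hypothetical. It assumes every schemaLocation is a local file path relative to the including file, and it handles xsd:include only, not xsd:import or xsd:redefine.

```python
import os
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
INCLUDE_TAG = "{%s}include" % XSD_NS


def load_flattened_schema(path, _seen=None):
    """Parse the schema at `path` and inline every xsd:include, recursively."""
    seen = set() if _seen is None else _seen
    path = os.path.abspath(path)
    seen.add(path)

    root = ET.parse(path).getroot()
    base_dir = os.path.dirname(path)

    flattened = []
    for child in list(root):
        if child.tag != INCLUDE_TAG:
            flattened.append(child)
            continue
        location = child.get("schemaLocation")
        if location is None:
            continue  # malformed include; nothing to load
        included_path = os.path.abspath(os.path.join(base_dir, location))
        if included_path in seen:
            continue  # already inlined; also guards against include cycles
        included_root = load_flattened_schema(included_path, seen)
        # Put the included declarations where the include element was.
        flattened.extend(list(included_root))

    # Replace the children of the root with the flattened list.
    for child in list(root):
        root.remove(child)
    root.extend(flattened)
    return root
```

Calling load_flattened_schema on the path of the main schema file should return a single schema element without any xsd:include children, which can then be walked like any other tree. The set of visited paths keeps a file that is included from several places from being inlined twice.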
What I'm asking for is a more standard way to solve this problem. I've already searched for other schema parsers in Python, but so far I haven't found one that fits this case.
Greetings,
Comments (1)
PyXB can process xsd:include. I used PyXB for Amazon.com's huge product schema files, where included files in turn include further XSD files at multiple levels. Highly recommended.