Using XML catalogs with Python's lxml?
When I parse an XML document using lxml, is there a way to validate that document against its DTD using an external catalog file? I need to be able to work with the fixed attributes defined in the document's DTD.
Comments (3)
Can you give an example? According to the lxml validation docs, lxml can handle DTD validation (with the DTD specified either in the XML document or externally in code) as well as system catalogs, which covers most cases I can think of.
It seems that lxml does not expose this libxml2 feature; grepping the source only turns up some #defines for error handling.
Judging from the catalog implementation page in the libxml2 documentation, the "transparent" handling through a catalog installed at /etc/xml/catalog may still work in lxml. If you need more than that, you can always abandon lxml and use the default libxml2 Python bindings, which do expose the catalog functions.
You can add the catalog to the XML_CATALOG_FILES environment variable: see this thread. Note that entries in XML_CATALOG_FILES are space-separated URLs. You can use Python's pathname2url and urljoin (with file:) to generate the URL from a pathname.
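A minimal sketch of that approach, assuming Python 3 (where pathname2url lives in urllib.request and urljoin in urllib.parse). The helper names catalog_url, add_catalog and parse_with_dtd are made up for illustration and are not part of lxml; the parser options dtd_validation and attribute_defaults are the ones lxml documents for DTD validation and for filling in default (including fixed) attributes.

```python
import os
from urllib.parse import urljoin
from urllib.request import pathname2url


def catalog_url(path):
    """Turn a filesystem path into a file: URL (hypothetical helper).

    pathname2url percent-encodes spaces, which matters because
    XML_CATALOG_FILES entries are separated by spaces.
    """
    return urljoin("file:", pathname2url(os.path.abspath(path)))


def add_catalog(path, environ=os.environ):
    """Append a catalog URL to XML_CATALOG_FILES (hypothetical helper).

    Existing entries are kept, and the same URL is not added twice.
    """
    entries = environ.get("XML_CATALOG_FILES", "").split()
    url = catalog_url(path)
    if url not in entries:
        entries.append(url)
    environ["XML_CATALOG_FILES"] = " ".join(entries)
    return environ["XML_CATALOG_FILES"]


def parse_with_dtd(xml_path):
    """Parse with DTD validation and DTD attribute defaults applied.

    lxml is imported here so that the helpers above work without it.
    """
    from lxml import etree

    parser = etree.XMLParser(dtd_validation=True, attribute_defaults=True)
    return etree.parse(xml_path, parser)
```

Typical use would be to call add_catalog("catalog.xml") and then parse_with_dtd("document.xml"). It is an assumption here that libxml2 reads the variable when it first initializes its catalogs, so the safe order is to set it before parsing anything, or even before importing lxml.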