How can I iterate over XML in Python to test whether a child node exists (using xml.dom.minidom)?
I am using Python and xml.dom.minidom to iterate over an exported Excel spreadsheet, outputting an HTML table for our dining hall menu with various calls to .write. The difficulty is that the XML Excel outputs isn't structured. To compensate, I have set up a number of variables (day, previousDay, meal, etc.) that get set when I encounter child nodes whose nodeValue matches a value I am testing against. I have a bunch of if statements to determine when to start a new table (for each day of the week), a new row (when day != previousDay), and so on.
I am having difficulty figuring out how to ignore particular nodes, though. A handful of the nodes Excel outputs need to be ignored, and I can identify them by their child nodes having particular values, but I can't figure out how to implement it.
Basically, I need the following if statement in my main for loop:
for node in dome.getElementsByTagName('data'):
    if node contains childNode with nodeValue == 'test':
        do something
Comments (3)
My quick inclination is to have a nested for-loop with a get-out-of-node-free-card (um, exception) like the following.
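A minimal sketch of what such a nested loop with an exception could look like. The sample XML, the `cell` element name, and the `SkipNode`, `text_of`, and `kept_rows` names are all assumptions made for illustration; the real Excel export will have a different layout.

```python
from xml.dom import minidom

# Hypothetical stand-in for the Excel export; element names are assumptions.
SAMPLE = """<rows>
  <data><cell>Monday</cell><cell>Breakfast</cell></data>
  <data><cell>test</cell><cell>ignore me</cell></data>
  <data><cell>Tuesday</cell><cell>Lunch</cell></data>
</rows>"""


class SkipNode(Exception):
    """Raised to abandon the <data> node currently being processed."""


def text_of(element):
    # Concatenate the element's direct text-node children.
    return "".join(c.data for c in element.childNodes
                   if c.nodeType == c.TEXT_NODE)


def kept_rows(xml_text, marker="test"):
    dom = minidom.parseString(xml_text)
    rows = []
    for node in dom.getElementsByTagName("data"):
        try:
            values = []
            for child in node.childNodes:
                if child.nodeType != child.ELEMENT_NODE:
                    continue  # skip whitespace text nodes between elements
                value = text_of(child)
                if value == marker:
                    raise SkipNode  # leave this <data> node entirely
                values.append(value)
        except SkipNode:
            continue  # move on to the next <data> node
        rows.append(values)
    return rows


print(kept_rows(SAMPLE))
```

Raising the exception discards whatever was collected for the current node, so a marker found in any position causes the whole `data` node to be ignored.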
Do you have to use xml.dom.minidom? Because this is the kind of thing that XPath shines at. Using lxml.etree, for instance, this finds all of the elements you want:
The W3C's DOM is really hard to use for real-world problems, because it doesn't include simple things like an attribute returning an element's value. (XPath declares that an element's value is all of its descendant text nodes concatenated together, which is why the above pattern works.)
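As a rough illustration of the XPath-style selection described above, here is a sketch that swaps in the standard library's xml.etree.ElementTree rather than lxml.etree, so it runs without extra packages. ElementTree supports only a limited XPath subset, so the child tag has to be named explicitly; the `cell` tag and the sample document are assumptions, not the real export.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the Excel export; element names are assumptions.
SAMPLE = """<rows>
  <data><cell>Monday</cell></data>
  <data><cell>test</cell></data>
  <data><cell>Tuesday</cell><cell>test</cell></data>
</rows>"""

root = ET.fromstring(SAMPLE)

# The predicate [cell='test'] selects <data> elements that have a <cell>
# child whose complete text content equals 'test'.
flagged = root.findall(".//data[cell='test']")

# Everything that was not flagged is a node worth processing.
kept = [d for d in root.iter("data") if d not in flagged]

print(len(flagged), len(kept))
```

With lxml, which implements full XPath 1.0, the query would go through the element's `xpath()` method and could match a child of any name rather than a fixed tag.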
You'll need to implement a helper function for that sort of thing, e.g.:
This makes it easier to build a filter function, e.g.:
and get your elements like this:
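The three steps above (a value helper, a filter function, and the final selection) might be sketched with minidom as follows. The names `element_value` and `has_child_with_value`, the `cell` tag, and the sample XML are hypothetical and chosen only for illustration.

```python
from xml.dom import minidom

# Hypothetical stand-in for the Excel export; element names are assumptions.
SAMPLE = """<rows>
  <data><cell>Monday</cell><cell>Breakfast</cell></data>
  <data><cell>test</cell><cell>ignore me</cell></data>
  <data><cell>Tuesday</cell><cell>Lunch</cell></data>
</rows>"""


def element_value(node):
    """Return the node's text in the XPath sense: all descendant text joined."""
    if node.nodeType in (node.TEXT_NODE, node.CDATA_SECTION_NODE):
        return node.data
    if node.nodeType == node.ELEMENT_NODE:
        return "".join(element_value(c) for c in node.childNodes)
    return ""  # comments, processing instructions, etc. contribute nothing


def has_child_with_value(node, value):
    """True if any direct child element of node has exactly this text value."""
    return any(child.nodeType == child.ELEMENT_NODE
               and element_value(child) == value
               for child in node.childNodes)


dom = minidom.parseString(SAMPLE)

# Keep only the <data> nodes that do NOT contain the marker value.
wanted = [n for n in dom.getElementsByTagName("data")
          if not has_child_with_value(n, "test")]

for n in wanted:
    print([element_value(c) for c in n.getElementsByTagName("cell")])
```

Keeping the predicate in its own function means the main loop only ever sees nodes that should be rendered, which keeps the day/meal bookkeeping separate from the filtering.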
Have you considered using a SAX parser instead? SAX parsers process the XML tree structure in the order in which the nodes appear (depth first) and let you handle each node's value at the point of parsing it.
xml.sax.xmlreader.XMLReader
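A minimal sketch of the SAX approach, assuming each `data` element holds one level of child elements containing plain text. The `DataHandler` class, the `cell` tag, and the sample XML are hypothetical; only `xml.sax.parseString` and `xml.sax.ContentHandler` come from the standard library.

```python
import xml.sax

# Hypothetical stand-in for the Excel export; element names are assumptions.
SAMPLE = """<rows>
  <data><cell>Monday</cell><cell>Breakfast</cell></data>
  <data><cell>test</cell><cell>ignore me</cell></data>
  <data><cell>Tuesday</cell><cell>Lunch</cell></data>
</rows>"""


class DataHandler(xml.sax.ContentHandler):
    """Collect the child values of each <data>, dropping rows with the marker."""

    def __init__(self, marker="test"):
        super().__init__()
        self.marker = marker
        self.rows = []
        self._row = None    # values of the current <data>, None when outside one
        self._skip = False  # set once the marker is seen inside the current <data>
        self._text = None   # text buffer for the child element being read

    def startElement(self, name, attrs):
        if name == "data":
            self._row, self._skip = [], False
        elif self._row is not None:
            self._text = []

    def characters(self, content):
        # characters() may be called several times per element, so buffer it.
        if self._text is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "data":
            if not self._skip:
                self.rows.append(self._row)
            self._row = None
        elif self._row is not None and self._text is not None:
            value = "".join(self._text)
            if value == self.marker:
                self._skip = True
            self._row.append(value)
            self._text = None


handler = DataHandler()
xml.sax.parseString(SAMPLE.encode("utf-8"), handler)
print(handler.rows)
```

Because the handler sees events in document order, a row can only be accepted or rejected when its closing tag arrives, which is why the values are buffered until `endElement` fires for `data`.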