Using lxml to add attributes to existing elements, remove elements, etc.

Posted on 2024-09-08 20:11:01

I parse in the XML using

from lxml import etree

tree = etree.parse('test.xml', etree.XMLParser())

Now I want to work on the parsed XML. I'm having trouble removing elements with namespaces or just elements in general such as

<rdf:description><dc:title>Example</dc:title></rdf:description>

and I want to remove that entire element as well as everything within the tags. I also want to add attributes to existing elements. The methods I need are in the Element class, but I have no idea how to use them with the ElementTree object here. Any pointers would definitely be appreciated, thanks.

Comments (2)

扬花落满肩 2024-09-15 20:11:01

You can get to the root element via this call: root=tree.getroot()

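Using that root element, you can use findall() and remove elements that match your criteria. Note that findall("title") only matches direct children of root, so use ".//title" to search the whole subtree, and remove each match from its own parent rather than from root:

deleteThese = root.findall(".//title")
for element in deleteThese:
    # remove() only works on the direct parent of the element
    element.getparent().remove(element)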

Finally, you can see what your new tree looks like with this: etree.tostring(root, pretty_print=True)

Here is some info about how find/findall work:
http://infohost.nmt.edu/tcc/help/pubs/pylxml/class-ElementTree.html#ElementTree-find
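
For the namespaced element in the question, the tag has to be matched together with its namespace. The following is a minimal sketch, not a definitive solution: the question does not show the namespace declarations, so the RDF and Dublin Core URIs, the wrapping <root> element, and the <keep> element are assumptions made for illustration.

```python
from lxml import etree

# Assumed prefix-to-URI mapping; adjust to the declarations in your own file.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical document wrapping the snippet from the question.
xml = (
    '<root xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<rdf:description><dc:title>Example</dc:title></rdf:description>"
    "<keep>me</keep>"
    "</root>"
)
root = etree.fromstring(xml)

# findall() returns a list, so removing while looping over it is safe.
for element in root.findall(".//rdf:description", namespaces=NS):
    # Removing the element also removes everything inside it.
    element.getparent().remove(element)

result = etree.tostring(root, encoding="unicode")
print(result)
```

With a parsed file, root would come from tree.getroot() instead of etree.fromstring(). The same match can be written without a prefix map by using Clark notation, for example ".//{http://www.w3.org/1999/02/22-rdf-syntax-ns#}description".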

To add an attribute to an element, try something like this:

root.attrib['myNewAttribute']='hello world'
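
The same idea works for any element, not just the root. Here is a small sketch, assuming a sample document and made-up attribute names (lang, source, and the http://example.com/ns namespace are all hypothetical), that finds an element first and then sets attributes on it with either set() or the attrib mapping:

```python
from lxml import etree

# Sample document for illustration; with a parsed file use tree.getroot().
root = etree.fromstring(
    "<Root><Description><Title>foo</Title></Description></Root>"
)

description = root.find("Description")

# set() and the attrib mapping are two ways to write the same attribute store.
description.set("lang", "en")           # hypothetical attribute name
description.attrib["source"] = "demo"   # hypothetical attribute name

# A namespaced attribute uses Clark notation: "{namespace-uri}localname".
description.set("{http://example.com/ns}flag", "yes")

# get() reads an attribute back, returning None if it is missing.
print(description.get("lang"))
print(etree.tostring(root, pretty_print=True, encoding="unicode"))
```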
一曲爱恨情仇 2024-09-15 20:11:01

The remove method should do what you want:

>>> from io import BytesIO
>>> from lxml import etree

>>> s = b'<Root><Description><Title>foo</Title></Description></Root>'
>>> tree = etree.parse(BytesIO(s))

>>> print(etree.tostring(tree.getroot(), encoding='unicode'))
<Root><Description><Title>foo</Title></Description></Root>

>>> title = tree.find('.//Title')  # relative path; a leading '//' is not a valid ElementPath
>>> title.getparent().remove(title)
>>> etree.tostring(tree.getroot())  # bytes by default on Python 3
b'<Root><Description/></Root>'

>>> print(etree.tostring(tree.getroot(), encoding='unicode'))
<Root><Description/></Root>
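
One caveat with getparent().remove(): in lxml, text that directly follows an element's closing tag is stored on that element as its tail, so removing the element drops that text too. If the text should survive, it can be moved first. The helper below is a sketch of one way to do that; remove_keep_tail is a hypothetical name, not part of lxml, and the sample document is made up.

```python
from lxml import etree


def remove_keep_tail(element):
    """Remove element from its parent but keep the text that follows it."""
    parent = element.getparent()
    if parent is None:
        raise ValueError("cannot remove the root element")
    if element.tail:
        previous = element.getprevious()
        if previous is not None:
            # Append the tail to the preceding sibling's tail.
            previous.tail = (previous.tail or "") + element.tail
        else:
            # No preceding sibling: the tail becomes part of the parent's text.
            parent.text = (parent.text or "") + element.tail
    parent.remove(element)


root = etree.fromstring(
    "<Root><Description>before<Title>foo</Title>after</Description></Root>"
)
remove_keep_tail(root.find(".//Title"))

result = etree.tostring(root, encoding="unicode")
print(result)  # <Root><Description>beforeafter</Description></Root>
```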