Tips for finding prefixed tags in Python lxml?

Posted on 2024-12-08 14:49:53

I am trying to use lxml's etree module to find a specific tag in my XML document.
The tag looks like this:

<text:ageInformation>
    <text:statedAge>12</text:statedAge>
</text:ageInformation>

I was hoping to use etree.find('text:statedAge'), but that method does not accept the 'text' prefix.
The error says that I should add 'text' to the prefix map, but I am not sure how to do that. Any tips?

Edit:
I want to be able to write to the hr4e-prefixed tags.
Here are the relevant parts of the document:

<?xml version="1.0" encoding="utf-8"?>
<greenCCD xmlns="AlschulerAssociates::GreenCDA" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:hr4e="hr4e::patientdata" xsi:schemaLocation="AlschulerAssociates::GreenCDA green_ccd.xsd">
  <header>
    <documentID root="18c41e51-5f4d-4d15-993e-2a932fed720a" />
    <title>Health Records for Everyone Continuity of Care Document</title>
    <version>
  <number>1</number>
</version>
<confidentiality codeSystem="2.16.840.1.113883.5.25" code="N" />
<documentTimestamp value="201105300211+0800" />
<personalInformation>
  <patientInformation>
    <personID root="2.16.840.1.113883.3.881.PI13023911" />
    <personAddress>
      <streetAddressLine nullFlavor="NI" />
      <city>Santa Cruz</city>
      <state nullFlavor="NI" />
      <postalCode nullFlavor="NI" />
    </personAddress>
    <personPhone nullFlavor="NI" />
    <personInformation>
      <personName>
        <given>Benjamin</given>
        <family>Keidan</family>
      </personName>
      <gender codeSystem="2.16.840.1.113883.5.1" code="M" />
      <personDateOfBirth value="NI" />
      <hr4e:ageInformation>
        <hr4e:statedAge>9424</hr4e:statedAge>
        <hr4e:estimatedAge>0912</hr4e:estimatedAge>
        <hr4e:yearInSchool>1</hr4e:yearInSchool>
        <hr4e:statusInSchool>attending</hr4e:statusInSchool>
      </hr4e:ageInformation>
    </personInformation>
    <hr4e:livingSituation>
      <hr4e:homeVillage>Putney</hr4e:homeVillage>
      <hr4e:tribe>Oromo</hr4e:tribe>
    </hr4e:livingSituation>
  </patientInformation>
</personalInformation>

Comments (2)

优雅的叶子 2024-12-15 14:49:53

The namespace prefix must be declared (mapped to a URI) in the XML document. You can then use the {URI}localname notation to find text:statedAge and other elements. Something like this:

from lxml import etree

XML = """
<root xmlns:text="http://example.com">
 <text:ageInformation>
   <text:statedAge>12</text:statedAge>
 </text:ageInformation>
</root>"""

root = etree.fromstring(XML)

# {URI}localname notation: the namespace URI in braces, then the local name
ageinfo = root.find("{http://example.com}ageInformation")
age = ageinfo.find("{http://example.com}statedAge")
print(age.text)

This will print "12".
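
One thing worth noting about the snippet above: find() with a bare tag only looks at the direct children of the element it is called on, which is why it goes through ageInformation first. As a minimal sketch, the `.//` descendant path and findtext() can reach the nested element in one call. The fallback import is an assumption for environments without lxml; the standard library's ElementTree accepts the same paths.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the stdlib API, which accepts the same paths.
    import xml.etree.ElementTree as etree

XML = """
<root xmlns:text="http://example.com">
 <text:ageInformation>
   <text:statedAge>12</text:statedAge>
 </text:ageInformation>
</root>"""

root = etree.fromstring(XML)

# A bare tag only matches direct children of root, so this returns None.
direct = root.find("{http://example.com}statedAge")

# ".//" searches all descendants of root.
nested = root.find(".//{http://example.com}statedAge")

# findtext() returns the text of the first match directly.
age_text = root.findtext(".//{http://example.com}statedAge")

print(direct, nested.text, age_text)  # None 12 12
```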

Another way of doing it is to keep the prefix in the path and pass the prefix-to-URI mapping through the namespaces argument:

ageinfo = root.find("text:ageInformation",
                    namespaces={"text": "http://example.com"})
age = ageinfo.find("text:statedAge",
                   namespaces={"text": "http://example.com"})
print(age.text)
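
The prefix in the namespaces mapping is only a label for the query; what has to match the document is the URI. As a rough sketch, the mapping can be defined once and reused, here with a hypothetical prefix "t" that does not appear in the document at all. The fallback import is again an assumption for environments without lxml.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the stdlib API, which takes the same arguments.
    import xml.etree.ElementTree as etree

XML = """
<root xmlns:text="http://example.com">
 <text:ageInformation>
   <text:statedAge>12</text:statedAge>
 </text:ageInformation>
</root>"""

root = etree.fromstring(XML)

# Hypothetical prefix "t": only the URI needs to match the document.
NS = {"t": "http://example.com"}

# One mapping, reused for a multi-step path and for a descendant search.
age = root.find("t:ageInformation/t:statedAge", namespaces=NS)
ages = root.findall(".//t:statedAge", namespaces=NS)

print(age.text, len(ages))  # 12 1
```

With lxml, the prefixes declared in the document are also available as root.nsmap, so the mapping does not have to be typed by hand.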

You can also use XPath:

# xpath() returns a list of matches, so take the first one
age = root.xpath("//text:statedAge",
                 namespaces={"text": "http://example.com"})[0]
print(age.text)
尴尬癌患者 2024-12-15 14:49:53

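I ended up having to qualify each step of the path with its own namespace, because personInformation is in the document's default namespace while the hr4e elements are in a different one: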

from lxml import etree

XML = """
<greenCCD xmlns="AlschulerAssociates::GreenCDA" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:hr4e="hr4e::patientdata"  xsi:schemaLocation="AlschulerAssociates::GreenCDA green_ccd.xsd">
<personInformation>
 <hr4e:ageInformation>
   <hr4e:statedAge>12</hr4e:statedAge>
 </hr4e:ageInformation>
</personInformation>
</greenCCD>"""

root = etree.fromstring(XML)
#root = etree.parse("hr4e_patient.xml")

# Each path step carries its own namespace URI in braces
ageinfo = root.find("{AlschulerAssociates::GreenCDA}personInformation/{hr4e::patientdata}ageInformation")
age = ageinfo.find("{hr4e::patientdata}statedAge")
print(age.text)
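
Since the question also asks about writing to the hr4e tags, here is a minimal sketch of that, not a definitive recipe. It assumes a trimmed copy of the document (the xsi attributes are left out), a hypothetical prefix "ccd" for the default namespace, and made-up replacement values "13" and "14". The fallback import is an assumption for environments without lxml.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the stdlib API for the calls used here.
    import xml.etree.ElementTree as etree

# Trimmed copy of the document from the question (xsi attributes omitted).
XML = """<greenCCD xmlns="AlschulerAssociates::GreenCDA" xmlns:hr4e="hr4e::patientdata">
<personInformation>
 <hr4e:ageInformation>
   <hr4e:statedAge>12</hr4e:statedAge>
 </hr4e:ageInformation>
</personInformation>
</greenCCD>"""

root = etree.fromstring(XML)

# "ccd" is a hypothetical label for the default namespace; "hr4e" mirrors the document.
NS = {"ccd": "AlschulerAssociates::GreenCDA", "hr4e": "hr4e::patientdata"}

# Overwrite the text of an existing hr4e element.
age = root.find("ccd:personInformation/hr4e:ageInformation/hr4e:statedAge",
                namespaces=NS)
age.text = "13"

# Add a new hr4e element by giving SubElement the {URI}localname form of the tag.
ageinfo = root.find(".//hr4e:ageInformation", namespaces=NS)
estimated = etree.SubElement(ageinfo, "{hr4e::patientdata}estimatedAge")
estimated.text = "14"

# Read both values back from the tree, then serialize it.
stated_text = root.findtext(".//hr4e:statedAge", namespaces=NS)
estimated_text = root.findtext(".//hr4e:estimatedAge", namespaces=NS)
print(stated_text, estimated_text)  # 13 14
print(etree.tostring(root, encoding="unicode"))
```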