How to get XML without the root node in Python

Posted 2024-12-04 15:11:10

Given the following data:

<rdf:RDF
    xmlns="http://purl.org/rss/1.0/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
    xmlns:dc="http://purl.org/dc/elements/1.
<channel rdf:about="http://www.gmanews.tv/">
        <title>GMANews.TV</title>
        <description> GMA News.tv bring you the latest news from GMA News teams and highlights of your favorite shows. Subscribe now and stay up-to-date with GMA News.tv.</description>
        <link>http://www.gmanews.tv/</link>
</channel>

<item rdf:about="http://www.gmanews.tv/story/232365/world/magnitude-59-quake-hits-chilean-coast-no-damage">
        <dc:format>text/html</dc:format>
        <dc:date>2011-09-14T16:39:22+08:00</dc:date>
        <dc:source>http://www.gmanews.tv/story/232365/world/magnitude-59-quake-hits-chilean-coast-no-damage </dc:source>
                <title><![CDATA[Magnitude-5.9 quake hits Chilean coast, no damage]]></title>
        <link>http://www.gmanews.tv/story/232365/world/magnitude-59-quake-hits-chilean-coast-no-damage </link>
        <description><![CDATA[SANTIAGO - A magnitude 5.9 quake hit just off the coast of central Chile early on Wednesday, but the state emergency office said there were no reports of damage.]]></description>
    </item>
        <item rdf:about="http://www.gmanews.tv/story/232362/nation/house-minority-blames-pnoys-advisers-for-legal-setbacks">
        <dc:format>text/html</dc:format>
        <dc:date>2011-09-14T16:04:51+08:00</dc:date>
        <dc:source>http://www.gmanews.tv/story/232362/nation/house-minority-blames-pnoys-advisers-for-legal-setbacks </dc:source>
                <title><![CDATA[House minority blames PNoy's advisers for legal 'setbacks']]></title>
        <link>http://www.gmanews.tv/story/232362/nation/house-minority-blames-pnoys-advisers-for-legal-setbacks </link>
        <description><![CDATA[Members of the opposition at the House of Representatives on Wednesday blamed President Benigno Aquino III's advisers for the various legal "setbacks" suffered by his administration and advised him to consider replacing some of his advisers.]]></description>
    </item>
        <item rdf:about="http://www.gmanews.tv/story/232356/nation/ex-sharia-judge-20-others-may-testify-in-poll-fraud-probe">
        <dc:format>text/html</dc:format>
        <dc:date>2011-09-14T15:19:45+08:00</dc:date>
        <dc:source>http://www.gmanews.tv/story/232356/nation/ex-sharia-judge-20-others-may-testify-in-poll-fraud-probe </dc:source>
                <title><![CDATA[Ex-Shari'a judge, 20 others may testify in poll fraud probe]]></title>
        <link>http://www.gmanews.tv/story/232356/nation/ex-sharia-judge-20-others-may-testify-in-poll-fraud-probe </link>
        <description><![CDATA[The former Shari'a court judge who claimed to have helped Gloria Macapagal-Arroyo cheat in the 2004 presidential elections and at least 20 others may serve as witnesses in the joint investigation by the Commission on Elections and Department of Justice on the alleged poll fraud, Comelec chief Sixto Brillantes Jr. said Wednesday.]]></description>
    </item>
</rdf:RDF>

Now I want to get the details of all the elements inside each <item> tag. This is probably trivial, but I am new to Python, and I am not sure how to parse the RDF and then extract every <item> inside it.

Edit:
I cannot use any third-party libraries, as my script is going to run on an embedded system.


Comments (2)

潜移默化 2024-12-11 15:11:10

lxml provides a nice way to handle all things XML. An example for the XML you posted:

from lxml import etree

document = etree.parse('your-example-xml.rdf')
root = document.getroot()

# Namespace shortcuts: the default (RSS 1.0) namespace and the rdf namespace
ns = root.nsmap.get(None)
rdf = root.nsmap.get('rdf')

for item in root.xpath('purl:item', namespaces={'purl': ns}):
    # The rdf:about attribute of the item
    print(item.attrib.get('{%s}about' % rdf))
    # xpath() returns a list of the matching text nodes
    print(item.xpath('purl:description/text()', namespaces={'purl': ns}))
    print()

However, if you are only parsing RDF, there may be RDF-specific libraries available.
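
Where lxml cannot be installed, a similar prefix-map lookup can be sketched with the standard library's xml.etree.ElementTree, whose findall() and findtext() accept a namespaces dictionary. The sample below is a shortened, hypothetical version of the feed (the item URL is abbreviated), and the dc namespace URI is assumed to be the usual Dublin Core one, because the copy in the question is cut off. The helper name item_summaries is made up for this illustration.

```python
from xml.etree import ElementTree as ET

# Shortened, hypothetical sample. The dc namespace URI is an assumption:
# the declaration in the question is truncated.
SAMPLE = """<rdf:RDF
    xmlns="http://purl.org/rss/1.0/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel rdf:about="http://www.gmanews.tv/">
    <title>GMANews.TV</title>
  </channel>
  <item rdf:about="http://www.gmanews.tv/story/232365/">
    <dc:date>2011-09-14T16:39:22+08:00</dc:date>
    <title><![CDATA[Magnitude-5.9 quake hits Chilean coast, no damage]]></title>
  </item>
</rdf:RDF>"""

# These prefixes are local to the script; they do not have to match
# the prefixes used inside the document.
NS = {
    "rss": "http://purl.org/rss/1.0/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def item_summaries(xml_text):
    """Return (about, title, date) tuples for each <item> under the root."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.findall("rss:item", NS):
        # Attributes are looked up with the '{uri}name' form.
        about = item.get("{%s}about" % NS["rdf"])
        title = item.findtext("rss:title", default="", namespaces=NS)
        date = item.findtext("dc:date", default="", namespaces=NS)
        rows.append((about, title, date))
    return rows


for row in item_summaries(SAMPLE):
    print(row)
```

The <channel> element is skipped because only direct children named item in the RSS namespace are matched.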

幸福不弃 2024-12-11 15:11:10

Since third-party libraries are not an option, here's the same code done with Python's built-in ElementTree:

from xml.etree import ElementTree as etree

# parse() accepts a filename directly, so no file handle is left open
document = etree.parse('your-example-xml.rdf')
root = document.getroot()

ns_purl = 'http://purl.org/rss/1.0/'
ns_rdf = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'

# ElementTree addresses namespaced names as '{namespace-uri}localname'
for item in root.findall('{%s}item' % ns_purl):
    print(item.attrib.get('{%s}about' % ns_rdf))
    print(item.find('{%s}description' % ns_purl).text)
    print()
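
The question asks for every element inside each <item>, not only the description. One way to collect them all with the standard library is sketched below: loop over the children of each item and key them by local tag name. The sample feed is a shortened, hypothetical one (abbreviated URL, assumed Dublin Core namespace URI, since the question's declaration is truncated), and local_name and items_as_dicts are names invented for this sketch.

```python
from xml.etree import ElementTree as ET

RSS_NS = "http://purl.org/rss/1.0/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Shortened, hypothetical sample; the dc namespace URI is an assumption.
SAMPLE = """<rdf:RDF
    xmlns="http://purl.org/rss/1.0/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <item rdf:about="http://www.gmanews.tv/story/232365/">
    <dc:format>text/html</dc:format>
    <title><![CDATA[Magnitude-5.9 quake hits Chilean coast, no damage]]></title>
    <link>http://www.gmanews.tv/story/232365/ </link>
  </item>
</rdf:RDF>"""


def local_name(tag):
    """Turn '{uri}name' into 'name'; un-namespaced tags pass through."""
    return tag.rsplit("}", 1)[-1]


def items_as_dicts(xml_text):
    """Return one dict per <item>, mapping child tag names to their text."""
    root = ET.fromstring(xml_text)
    records = []
    for item in root.iter("{%s}item" % RSS_NS):
        record = {"about": item.get("{%s}about" % RDF_NS)}
        for child in item:
            # Dropping the namespace would merge same-named tags from
            # different namespaces; the feed in the question has none.
            record[local_name(child.tag)] = (child.text or "").strip()
        records.append(record)
    return records


for record in items_as_dicts(SAMPLE):
    print(record)
```

The strip() call removes the trailing space that the feed leaves inside its <link> and <dc:source> values.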