当前位置：文江博客话题详情

XML xpath nokogiri

根节点的xpath属性

发布于 2025-01-06 21:29:29 字数 3048 浏览 0 评论 0 原文

我想获取根元素中的 ID、LASTEDITED、EXPIRESS 属性。我正在使用 xpath、ruby 和 nokogiri。但这不起作用，有什么想法吗？

xPath 查询：

  doc.xpath('/educationProvider/@id').each do |id_node| 
    puts node.content
  end

  doc.xpath('/educationProvider/@lastEdited').each do |lastedited_node|
    puts lastedited_node.content
  end

  doc.xpath('/educationProvider/@expires').each do |expires_node|
    puts expires_node.content
  end

我的 XML 如下所示：

<?xml version="1.0" encoding="UTF-8"?>
<p:educationProvider xmlns:p="http://skolverket.se/education/provider/1.0" xmlns="http://skolverket.se/education/commontypes/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" expires="2015-01-31" id="provider.uh.msb" lastEdited="2012-11-01T12:51:37" xsi:schemaLocation="http://skolverket.se/education/provider/1.0 educationProvider.xsd">
        <p:vCard>
            <VERSION/> 
            <FN/> 
            <N/> 
            <ADR>
                <LOCALITY>KARLSTAD</LOCALITY> 
                <PCODE>651 81</PCODE> 
            </ADR>
            <TEL>
                <NUMBER>0771-240240</NUMBER> 
            </TEL>
            <EMAIL>
                <USERID>[email protected]</USERID> 
            </EMAIL>
            <ORG>
                <ORGNAME>Myndigheten för samhällsskydd och beredskap</ORGNAME> 
            </ORG>
            <URL>http://www.msbmyndigheten.se</URL>
        </p:vCard>
    </p:educationProvider>

这是我的 RUBY 脚本：

require 'rubygems'
require 'nokogiri'
require 'open-uri'

# parse the HTML document with all the links to the XML files.
doc = Nokogiri::HTML(open('http://testnavet.skolverket.se/SusaNavExport/EmilExporter?GetEvent&EMILVersion=1.1&NotExpired&EIAcademicType=UoH&SelectEP'))
# URLS - array
@urls = Array.new 
#Get all XML-urls and save them in urls-array
doc.xpath('//a/@href').each do |links|
  @urls << links.content
end

@id = Array.new
@lastedited = Array.new
@expires = Array.new

# loop all the url of the XML files
@urls.each do |url|
  doc = Nokogiri::HTML(open(url))
  # grab the content I want
  doc.xpath('/educationProvider/@id').each do |id_node| 
    id_node.content
  end

  doc.xpath('/educationProvider/@lastEdited').each do |lastedited_node|
    @lastedited << lastedited_node.content
  end

  doc.xpath('/educationProvider/@expires').each do |expires_node|
    @expires << expires_node.content
  end
end

#print it out
([email protected] - 1).each do |index|
  puts "ID: #{@id[index]}"
  puts "Lastedited: #{@lastedited[index]}"
  puts "Expiress: #{@expires[index]}"
end

原文

I wan to fetch the ID, LASTEDITED, EXPIRESS attributes in the root element. I am using xpath, ruby and nokogiri. But it dosent work, any ideas?

xPath querys:

  doc.xpath('/educationProvider/@id').each do |id_node| 
    puts node.content
  end

  doc.xpath('/educationProvider/@lastEdited').each do |lastedited_node|
    puts lastedited_node.content
  end

  doc.xpath('/educationProvider/@expires').each do |expires_node|
    puts expires_node.content
  end

This how my XML looks like:

<?xml version="1.0" encoding="UTF-8"?>
<p:educationProvider xmlns:p="http://skolverket.se/education/provider/1.0" xmlns="http://skolverket.se/education/commontypes/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" expires="2015-01-31" id="provider.uh.msb" lastEdited="2012-11-01T12:51:37" xsi:schemaLocation="http://skolverket.se/education/provider/1.0 educationProvider.xsd">
        <p:vCard>
            <VERSION/> 
            <FN/> 
            <N/> 
            <ADR>
                <LOCALITY>KARLSTAD</LOCALITY> 
                <PCODE>651 81</PCODE> 
            </ADR>
            <TEL>
                <NUMBER>0771-240240</NUMBER> 
            </TEL>
            <EMAIL>
                <USERID>[email protected]</USERID> 
            </EMAIL>
            <ORG>
                <ORGNAME>Myndigheten för samhällsskydd och beredskap</ORGNAME> 
            </ORG>
            <URL>http://www.msbmyndigheten.se</URL>
        </p:vCard>
    </p:educationProvider>

HERE IS MY RUBY-SCRIPT:

require 'rubygems'
require 'nokogiri'
require 'open-uri'

# parse the HTML document with all the links to the XML files.
doc = Nokogiri::HTML(open('http://testnavet.skolverket.se/SusaNavExport/EmilExporter?GetEvent&EMILVersion=1.1&NotExpired&EIAcademicType=UoH&SelectEP'))
# URLS - array
@urls = Array.new 
#Get all XML-urls and save them in urls-array
doc.xpath('//a/@href').each do |links|
  @urls << links.content
end

@id = Array.new
@lastedited = Array.new
@expires = Array.new

# loop all the url of the XML files
@urls.each do |url|
  doc = Nokogiri::HTML(open(url))
  # grab the content I want
  doc.xpath('/educationProvider/@id').each do |id_node| 
    id_node.content
  end

  doc.xpath('/educationProvider/@lastEdited').each do |lastedited_node|
    @lastedited << lastedited_node.content
  end

  doc.xpath('/educationProvider/@expires').each do |expires_node|
    @expires << expires_node.content
  end
end

#print it out
([email protected] - 1).each do |index|
  puts "ID: #{@id[index]}"
  puts "Lastedited: #{@lastedited[index]}"
  puts "Expiress: #{@expires[index]}"
end

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

夏花。依旧 2025-01-13 21:29:29

我想获取根目录中的 ID、LASTEDITED、EXPIRESS 属性
元素。

Just use：

/*/@id

选择 XML 文档顶部元素的 id 属性。

/*/@lastEdited

这将选择 XML 文档顶部元素的 lastEdited 属性。

/*/@expires

这将选择 XML 文档顶部元素的 expires 属性。

或者，可以使用单个 XPath 表达式选择所有这三个属性：

/*/@*[contains('|id|lastEdited|expires|', 
               concat('|', name(), '|')
               )
     ]

基于 XSLT 的验证：

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
  <xsl:for-each select=
  "/*/@*[contains('|id|lastEdited|expires|',
                  concat('|', name(), '|')
                  )
         ]">
   <xsl:value-of select=
   "concat('
',
           name(),
           ' = ',
           .
          )"/>
  </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>

当此 XSLT 转换应用于提供的 XML 文档时 strong>：

<p:educationProvider xmlns:p="http://skolverket.se/education/provider/1.0" xmlns="http://skolverket.se/education/commontypes/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" expires="2015-01-31" id="provider.uh.msb" lastEdited="2012-11-01T12:51:37" xsi:schemaLocation="http://skolverket.se/education/provider/1.0 educationProvider.xsd">
    <p:vCard>
        <VERSION/>
        <FN/>
        <N/>
        <ADR>
            <LOCALITY>KARLSTAD</LOCALITY>
            <PCODE>651 81</PCODE>
        </ADR>
        <TEL>
            <NUMBER>0771-240240</NUMBER>
        </TEL>
        <EMAIL>
            <USERID>[email protected]</USERID>
        </EMAIL>
        <ORG>
            <ORGNAME>Myndigheten för samhällsskydd och beredskap</ORGNAME>
        </ORG>
        <URL>http://www.msbmyndigheten.se</URL>
    </p:vCard>
</p:educationProvider>

计算 Xpath 表达式，并为每个选定的属性输出其名称和值：

expires = 2015-01-31
id = provider.uh.msb
lastEdited = 2012-11-01T12:51:37

I wan to fetch the ID, LASTEDITED, EXPIRESS attributes in the root
element.

Just use:

/*/@id

This selects the id attribute of the top element of the XML document.

/*/@lastEdited

This selects the lastEdited attribute of the top element of the XML document.

/*/@expires

This selects the expires attribute of the top element of the XML document.

Alternatively, all these three attributes can be selected with a single XPath expression:

/*/@*[contains('|id|lastEdited|expires|', 
               concat('|', name(), '|')
               )
     ]

XSLT - based verification:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
  <xsl:for-each select=
  "/*/@*[contains('|id|lastEdited|expires|',
                  concat('|', name(), '|')
                  )
         ]">
   <xsl:value-of select=
   "concat('
',
           name(),
           ' = ',
           .
          )"/>
  </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>

when this XSLT transformation is applied on the provided XML document:

<p:educationProvider xmlns:p="http://skolverket.se/education/provider/1.0" xmlns="http://skolverket.se/education/commontypes/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" expires="2015-01-31" id="provider.uh.msb" lastEdited="2012-11-01T12:51:37" xsi:schemaLocation="http://skolverket.se/education/provider/1.0 educationProvider.xsd">
    <p:vCard>
        <VERSION/>
        <FN/>
        <N/>
        <ADR>
            <LOCALITY>KARLSTAD</LOCALITY>
            <PCODE>651 81</PCODE>
        </ADR>
        <TEL>
            <NUMBER>0771-240240</NUMBER>
        </TEL>
        <EMAIL>
            <USERID>[email protected]</USERID>
        </EMAIL>
        <ORG>
            <ORGNAME>Myndigheten för samhällsskydd och beredskap</ORGNAME>
        </ORG>
        <URL>http://www.msbmyndigheten.se</URL>
    </p:vCard>
</p:educationProvider>

the Xpath expression is evaluated and for each of the selected attributes their name and value are output:

expires = 2015-01-31
id = provider.uh.msb
lastEdited = 2012-11-01T12:51:37

回复收藏 0 原文

提笔书几行 2025-01-13 21:29:29

如果只想访问文档中的根节点，可以这样做：

root = doc.root
root_id = root['id']
last_edited = root['lastEdited']

如果需要用XPath查找，则需要使用正确的命名空间。您的根节点的命名空间为“p”，因此您必须执行以下操作：

doc.xpath('/p:educationProvider/@id').first.value

注意节点名称前面的 p:。

If you just want to access the root node in the document, you can do this:

root = doc.root
root_id = root['id']
last_edited = root['lastEdited']

If you need to find it with XPath, you need to use the correct namespace. Your root node has a namespace of "p", so you must do this:

doc.xpath('/p:educationProvider/@id').first.value

Notice the p: that is preceding your node name.

回复收藏 0 原文

~没有更多了~

关于作者

仙女山的月亮

暂无简介

文章

25 人气

关注发私信

卷耳

文章 0 评论 0

关注

佚名

文章 0 评论 0

关注

℉服软

文章 0 评论 0

关注

qq_2gSKZM

文章 0 评论 0

关注

凉宸

文章 0 评论 0

关注

gyhjy

文章 0 评论 0

友情链接

文江博客

根节点的xpath属性

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（2）

关于作者

相关话题

热门标签

推荐作者

卷耳

佚名

℉服软

qq_2gSKZM

凉宸

gyhjy

友情链接

根节点的xpath属性

如果你对这篇内容有疑问，欢迎到本站社区发帖提问 参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（2）

关于作者

相关话题

热门标签

推荐作者

卷耳

佚名

℉服软

qq_2gSKZM

凉宸

gyhjy

友情链接

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。