XPath for querying multiple selectors

Posted on 2024-12-29 15:29:34

I want to get the values and attributes of the elements matched by a selector, and then get the attributes and values of their children based on a query.

Allow me to give an example.

This is the structure:

<div class='message'>
   <div>
   <a href='http://www.whatever.com'>Text</a>
   </div>

   <div>
    <img src='image_link.jpg' />
   </div>

</div>

<div class='message'>
   <div>
   <a href='http://www.whatever2.com'>Text2</a>
   </div>

   <div>
    <img src='image_link2.jpg' />
   </div>

</div>

So I would like to make a single query that matches all of those at once.

Something like this:

// $dom is a DOMDocument that has already loaded the HTML with $dom->loadHTML($html);
$dom_xpath = new DOMXPath($dom);
$elements = $dom_xpath->query('//div[@class="message"], //div[@class="message"] //a, //div[@class="message"] //img');

foreach($elements as $ele){
   echo $ele[0]->getAttribute('class'); //it should return 'message'
   echo $ele[1]->getAttribute('href'); //it should return 'http://www.whatever.com' in the 1st loop, and 'http://www.whatever2.com' in the second loop
   echo $ele[2]->getAttribute('src'); //it should return image_link.jpg in the 1st loop and 'image_link2.jpg' in the second loop
}

Is there a way to do that with multiple XPath selectors, as I did in the example, so that I avoid running separate queries all the time and save some CPU?


Comments (2)

没企图 2025-01-05 15:29:34

Use the union operator (|) in a single expression like this:

//div[@class="message"]|//div[@class="message"]//a|//div[@class="message"]//img

Note that this returns a flattened result set, so to speak. In other words, you won't get the elements in groups of three as your example assumes. Instead, you simply iterate over everything the expression matched, in document order. For this reason, it may be smarter to iterate over the nodes returned by //div[@class="message"] and use DOM methods to reach their descendants (the a and img elements).
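As a rough illustration of both points, here is a minimal PHP sketch, assuming the sample markup from the question. The variable names ($flat, $groups, $message) are made up for this example. It first runs the union expression to show the flat result, then iterates the message divs and runs relative queries (starting with a dot) against each one, using the context-node argument of DOMXPath::query in place of plain DOM traversal.

```php
<?php
// Sample markup from the question; loadHTML() wraps it in <html><body>.
$html = <<<HTML
<div class='message'>
   <div><a href='http://www.whatever.com'>Text</a></div>
   <div><img src='image_link.jpg' /></div>
</div>
<div class='message'>
   <div><a href='http://www.whatever2.com'>Text2</a></div>
   <div><img src='image_link2.jpg' /></div>
</div>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// 1) Union: one flat node list, not groups of three.
$union = '//div[@class="message"]|//div[@class="message"]//a|//div[@class="message"]//img';
$flat = [];
foreach ($xpath->query($union) as $node) {
    $flat[] = $node->nodeName;
}

// 2) Grouped: one query for the containers, then relative queries
//    evaluated with each container as the context node.
$groups = [];
foreach ($xpath->query('//div[@class="message"]') as $message) {
    $a   = $xpath->query('.//a', $message)->item(0);
    $img = $xpath->query('.//img', $message)->item(0);
    $groups[] = [
        'class' => $message->getAttribute('class'),
        'href'  => $a ? $a->getAttribute('href') : null,
        'src'   => $img ? $img->getAttribute('src') : null,
    ];
}

print_r($flat);
print_r($groups);
```

The relative queries only search inside the current message div, so each loop iteration yields the class, href, and src that belong together.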

↘紸啶 2025-01-05 15:29:34

Use:

(//div[@class='message'])[$k]//@*

This selects all three attributes that belong to the $k-th div in the document whose class attribute has the string value "message", or to any of that div's descendants.

You can evaluate N such XPath expressions, for $k from 1 to N, where N is the total count of //div[@class='message'].

XSLT-based verification:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
  <xsl:for-each select="//div[@class='message']">
    <xsl:variable name="vPos" select="position()"/>

    <xsl:apply-templates select=
    "(//div[@class='message'])[0+$vPos]//@*"/>
 ================
  </xsl:for-each>
 </xsl:template>

 <xsl:template match="@*">
  <xsl:value-of select=
  "concat('name = ', name(), ' value = ', ., '&#xA;')"/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document (wrapped in a single top element to make it well-formed):

<html>
    <div class='message'>
        <div>
            <a href='http://www.whatever.com'>Text</a>
        </div>
        <div>
            <img src='image_link.jpg' />
        </div>
    </div>
    <div class='message'>
        <div>
            <a href='http://www.whatever2.com'>Text2</a>
        </div>
        <div>
            <img src='image_link2.jpg' />
        </div>
    </div>
</html>

The XPath expression is evaluated twice and the selected attributes are formatted and output:

name = class value = message
name = href value = http://www.whatever.com
name = src value = image_link.jpg

 ================
name = class value = message
name = href value = http://www.whatever2.com
name = src value = image_link2.jpg

 ================
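
The same per-$k evaluation could be sketched in PHP roughly as follows. This is an illustration under assumptions, not part of the answer above: PHP's DOMXPath has no variable binding, so the position is formatted into the expression string with sprintf, and the names $n, $k, and $result are invented for the example.

```php
<?php
// Sample markup from the question; loadHTML() wraps it in <html><body>.
$html = <<<HTML
<div class='message'>
   <div><a href='http://www.whatever.com'>Text</a></div>
   <div><img src='image_link.jpg' /></div>
</div>
<div class='message'>
   <div><a href='http://www.whatever2.com'>Text2</a></div>
   <div><img src='image_link2.jpg' /></div>
</div>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// N = number of message divs; evaluate() returns count() as a float.
$n = (int) $xpath->evaluate("count(//div[@class='message'])");

$result = [];
for ($k = 1; $k <= $n; $k++) {
    // Substitute the 1-based position for $k in the expression.
    $expr = sprintf("(//div[@class='message'])[%d]//@*", $k);
    $attrs = [];
    foreach ($xpath->query($expr) as $attr) {
        $attrs[$attr->nodeName] = $attr->nodeValue;
    }
    $result[] = $attrs;
}

print_r($result);
```

Each iteration evaluates a fresh expression against the whole document, so this is N + 1 evaluations in total rather than a single query.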