How to select the following sibling/XML tag using XPath

Posted on 2024-09-07 06:30:46 | 1354 words | 1 view | 0 comments

I have an HTML file (from Newegg) whose markup is organized as shown below. All of the data in the specifications table is in cells with class 'desc', while the title of each row is in a cell with class 'name'. Below are two examples of data from Newegg pages.

<tr>
    <td class="name">Brand</td>
    <td class="desc">Intel</td>
</tr>
<tr>
    <td class="name">Series</td>
    <td class="desc">Core i5</td>
</tr>
<tr>
    <td class="name">Cores</td>
    <td class="desc">4</td>
</tr>
<tr>
    <td class="name">Socket</td>
    <td class="desc">LGA 1156</td>
</tr>

<tr>
    <td class="name">Brand</td>
    <td class="desc">AMD</td>
</tr>
<tr>
    <td class="name">Series</td>
    <td class="desc">Phenom II X4</td>
</tr>
<tr>
    <td class="name">Cores</td>
    <td class="desc">4</td>
</tr>
<tr>
    <td class="name">Socket</td>
    <td class="desc">Socket AM3</td>
</tr>

In the end I would like to have a CPU class (which is already set up) with Brand, Series, Cores, and Socket fields to store each of these values. This is the only way I can think of to go about doing this:

if(parsedDocument.xpath(tr/td[@class="name"])=='Brand'):
    CPU.brand = parsedDocument.xpath(tr/td[@class="name"]/nextsibling?).text

I would then do the same for the rest of the values. How would I accomplish the nextsibling part, and is there an easier way of doing this?

Comments (3)

╭ゆ眷念 2024-09-14 06:30:46

"How would I accomplish the nextsibling and is there an easier way of doing this?"

You may use:

tr/td[@class='name']/following-sibling::td

but I'd rather use directly:

tr[td[@class='name'] ='Brand']/td[@class='desc']

This assumes that:

  1. The context node against which the XPath expression is evaluated is the parent of all tr elements (not shown in your question).

  2. Each tr element has only one td with class attribute valued 'name' and only one td with class attribute valued 'desc'.
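To make the direct-lookup idea concrete, here is a minimal sketch using only Python's standard library. Several things in it are assumptions rather than part of the question: the rows are wrapped in a `<table>` element so there is a context node, the fragment is well-formed XML (a real Newegg page would likely need an HTML parser such as lxml), and the `CPU` dataclass and the `spec` and `parse_cpu` helpers are hypothetical stand-ins. Because `xml.etree.ElementTree` implements only a subset of XPath and has no nested predicates, the sketch uses the simpler `tr[td='Brand']` in place of `tr[td[@class='name']='Brand']`; with a full XPath 1.0 engine the expression from the answer can be used unchanged.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Sample rows copied from the question, wrapped in a <table> (an assumption)
# so that the parent of all tr elements can serve as the context node.
HTML = """
<table>
  <tr><td class="name">Brand</td><td class="desc">Intel</td></tr>
  <tr><td class="name">Series</td><td class="desc">Core i5</td></tr>
  <tr><td class="name">Cores</td><td class="desc">4</td></tr>
  <tr><td class="name">Socket</td><td class="desc">LGA 1156</td></tr>
</table>
"""


@dataclass
class CPU:
    """Hypothetical stand-in for the asker's existing CPU class."""
    brand: str
    series: str
    cores: int
    socket: str


def spec(table, name):
    """Return the 'desc' text of the row labelled `name`, or None if absent."""
    # ElementTree cannot express tr[td[@class='name']='Brand'], so this uses
    # tr[td='Brand'], which matches a row if ANY td child has that text.
    return table.findtext(f"tr[td='{name}']/td[@class='desc']")


def parse_cpu(markup):
    """Build a CPU from a well-formed specification table."""
    table = ET.fromstring(markup)
    return CPU(
        brand=spec(table, "Brand"),
        series=spec(table, "Series"),
        cores=int(spec(table, "Cores")),
        socket=spec(table, "Socket"),
    )


cpu = parse_cpu(HTML)
print(cpu)
```

Selecting the row by its label and then reading the 'desc' cell avoids the if-chain from the question: each field becomes one lookup, and a missing row shows up as `None` instead of a silently skipped branch.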

吐个泡泡 2024-09-14 06:30:46

Try the following-sibling axis (following-sibling::td).

像你 2024-09-14 06:30:46

For completeness, adding to the accepted answer above: if you are interested in any following sibling regardless of its element type, you can use this variation:

following-sibling::*
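The difference between `following-sibling::td` and `following-sibling::*` can be illustrated with a small sketch. Python's standard `xml.etree.ElementTree` does not support the following-sibling axis, so the code below emulates it in plain Python rather than evaluating the XPath itself. The `following_siblings` helper and the extra `<th>` cell in the sample row are hypothetical, added only to show a sibling that is not a `td`.

```python
import xml.etree.ElementTree as ET

# One row with a hypothetical <th> cell between the two td cells, so that
# the "any element" and "td only" variants return different results.
ROW = ET.fromstring(
    '<tr><td class="name">Brand</td><th>note</th><td class="desc">Intel</td></tr>'
)


def following_siblings(parent, node, tag=None):
    """Return the children of `parent` that come after `node`.

    tag=None mirrors following-sibling::*; tag="td" mirrors following-sibling::td.
    """
    children = list(parent)
    start = children.index(node) + 1
    return [c for c in children[start:] if tag is None or c.tag == tag]


name_cell = ROW.find("td[@class='name']")
any_sibling = following_siblings(ROW, name_cell)           # like following-sibling::*
td_sibling = following_siblings(ROW, name_cell, tag="td")  # like following-sibling::td

print([c.tag for c in any_sibling])
print([c.text for c in td_sibling])
```

With a library that implements full XPath 1.0, the axis expressions from the answers can be passed to the engine directly and no helper like this is needed.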
