Need an XPath query to find all <tr> elements that contain 7 <td> elements

Posted on 2024-11-16 04:27:52

Hello and hopefully thanks for the help.

Honestly, I am not very experienced with XPath, and I am hoping a guru out there will have a quick answer for me.

I am scraping a web page for data. The defining aspect of the data I want is that it is contained in a row <tr> that has 7 <td> elements. Each <td> element has one of the pieces of data I need to import. I am using the HTML Agility Pack on CodePlex to grab the data, but I can't seem to figure out how to define the query.

Contained in the web page is a section like this:

<table border="0" cellpadding="3" cellspacing="1" width="100%">
  <tr class="bgWhite" xmlns:msxsl="urn:schemas-microsoft-com:xslt">
    <td class="dataHdrText02" valign="top" width="50" align="center"><nobr>SYMBOL</nobr></td>
    <td class="dataHdrText02" valign="top" align="center">PERIOD</td>
    <td class="dataHdrText02" valign="top" align="center" width="*">EVENT TITLE</td>
    <td class="dataHdrText02" valign="top" align="center">EPS ESTIMATE</td>
    <td class="dataHdrText02" valign="top" align="center">EPS ACTUAL</td>
    <td class="dataHdrText02" valign="top" align="center">PREV. YEAR ACTUAL</td>
    <td class="dataHdrText02" valign="top" align="center"><nobr>DATE/TIME (ET)</nobr></td>
  </tr>
  <tr class="bgWhite">
    <td align="center" width="50"><nobr>CSCO </nobr></td>
    <td align="center">Q4 2011</td>
    <td align="left" width="*">Q4 2011 CISCO Systems Inc Earnings Release</td>
    <td align="center">$ 0.38 </td>
    <td align="center">n/a </td>
    <td align="center">$ 0.43 </td>
    <td align="center"><nobr>10-Aug-11</nobr></td>
  </tr>
  <tr class="bgWhite">
    <td align="center" width="50"><nobr>CSCO  </nobr></td>
    <td align="center">Q3 2011</td>
    <td align="left" width="*">Q3 2011 Cisco Systems Earnings Release</td>
    <td align="center">$ 0.37 </td>
    <td align="center">$ 0.42 </td>
    <td align="center">$ 0.42 </td>
    <td align="center"><nobr>11-May-11 AMC</nobr></td>
  </tr>
  <tr class="bgWhite" xmlns:msxsl="urn:schemas-microsoft-com:xslt">
     <td align="center" colspan="7"><img src="/format/cb/images/spacer.gif" width="1" height="4"></td>
  </tr>
</table>

My goal is to grab the earnings event data and place it into a database for analysis. My original thought was to grab all <tr> elements with 7 <td> elements then work with that data. Any advice or alternative suggestions would be welcome.

Comments (1)

走走停停 2024-11-23 04:27:52

This should do it for you.

//tr[count(td)=7]
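
One thing to watch for with the sample above: the header row also has 7 <td> children, so this query returns it along with the data rows, while the spacer row (a single <td colspan="7">) is skipped. If the header is not wanted, a predicate such as //tr[count(td)=7 and not(td[@class='dataHdrText02'])] should exclude it, assuming the header cells always carry that class as they do in the sample.

To make the filtering logic concrete, here is a minimal sketch in Python using only the standard library. It does not use HTML Agility Pack or XPath; it just applies the same "rows with exactly 7 cells" rule by hand. The names RowCollector and seven_cell_rows are made up for illustration, the sample HTML is a trimmed copy of the table in the question, and nested tables are not handled.

```python
from html.parser import HTMLParser

# Trimmed copy of the table from the question (attributes mostly dropped).
SAMPLE = """
<table border="0">
  <tr class="bgWhite">
    <td class="dataHdrText02"><nobr>SYMBOL</nobr></td>
    <td class="dataHdrText02">PERIOD</td>
    <td class="dataHdrText02">EVENT TITLE</td>
    <td class="dataHdrText02">EPS ESTIMATE</td>
    <td class="dataHdrText02">EPS ACTUAL</td>
    <td class="dataHdrText02">PREV. YEAR ACTUAL</td>
    <td class="dataHdrText02"><nobr>DATE/TIME (ET)</nobr></td>
  </tr>
  <tr class="bgWhite">
    <td align="center"><nobr>CSCO&#160;</nobr></td>
    <td align="center">Q4&#160;2011</td>
    <td align="left">Q4 2011 CISCO Systems Inc Earnings Release</td>
    <td align="center">$ 0.38&#160;</td>
    <td align="center">n/a&#160;</td>
    <td align="center">$ 0.43&#160;</td>
    <td align="center"><nobr>10-Aug-11</nobr></td>
  </tr>
  <tr class="bgWhite">
    <td align="center"><nobr>CSCO &#160;</nobr></td>
    <td align="center">Q3&#160;2011</td>
    <td align="left">Q3 2011 Cisco Systems Earnings Release</td>
    <td align="center">$ 0.37&#160;</td>
    <td align="center">$ 0.42&#160;</td>
    <td align="center">$ 0.42&#160;</td>
    <td align="center"><nobr>11-May-11 AMC</nobr></td>
  </tr>
  <tr class="bgWhite">
    <td align="center" colspan="7"><img src="spacer.gif" width="1" height="4"></td>
  </tr>
</table>
"""


class RowCollector(HTMLParser):
    """Collect the text of every <td>, grouped by its enclosing <tr>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.rows = []        # list of (is_header, [cell text, ...])
        self._cells = None    # cells of the <tr> currently open, if any
        self._text = None     # text pieces of the <td> currently open, if any
        self._header = False  # True once a header-class cell is seen in this row

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells, self._header = [], False
        elif tag == "td" and self._cells is not None:
            self._text = []
            if dict(attrs).get("class") == "dataHdrText02":
                self._header = True

    def handle_data(self, data):
        # Only keep text that appears inside a <td>.
        if self._text is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._text is not None:
            # &#160; arrives as a non-breaking space; normalise and trim it.
            cell = "".join(self._text).replace("\xa0", " ").strip()
            self._cells.append(cell)
            self._text = None
        elif tag == "tr" and self._cells is not None:
            self.rows.append((self._header, self._cells))
            self._cells = None


def seven_cell_rows(html, skip_header=True):
    """Return the cell lists of rows that have exactly 7 <td> cells."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    return [
        cells
        for is_header, cells in parser.rows
        if len(cells) == 7 and not (skip_header and is_header)
    ]


if __name__ == "__main__":
    for cells in seven_cell_rows(SAMPLE):
        print(cells)
```

Each returned row is a plain list of 7 strings in column order (symbol, period, event title, EPS estimate, EPS actual, previous year actual, date/time), which should be straightforward to map onto database columns. Passing skip_header=False keeps the header row, which matches what the bare //tr[count(td)=7] query selects.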