XPath for all A elements inside a TD that also contains an H3 with the inner text "Directs"
I'm scraping a website. There's a TD whose first child node is an H3 with the inner text "Directs". The TD's other children (besides the H3) are links. I know XPath is perfectly capable of returning just the A children of a TD that also contains an H3 with the inner text "Directs", but I can't seem to get it right. The ugly workaround I came up with is below, but I want to learn the best XPath method:
For Each thisH3 As HtmlNode In Doc.SelectNodes("//h3")
    If thisH3.InnerText = "Directs" Then
        For Each nChild As HtmlNode In thisH3.ParentNode.ChildNodes
            If nChild.Name = "a" Then
                Debug.Print(nChild.InnerText)
            End If
        Next
    End If
Next
Use this XPath to retrieve all a elements in a td that has an h3 with the value Directs:
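The expression itself is not shown in the answer as preserved here. As a minimal sketch of one that fits the description, assuming the h3 and the links are direct children of the td (as the question states) and that HtmlAgilityPack is the parser in use, something like //td[h3[normalize-space()='Directs']]/a should work. The sample markup and the module name below are hypothetical, made up only to exercise the expression:

```vbnet
Imports System
Imports HtmlAgilityPack

Module DirectsLinks
    Sub Main()
        ' Hypothetical sample markup standing in for the scraped page.
        Dim html As String =
            "<table><tr>" &
            "<td><h3>Directs</h3><a href='/a'>Alice</a><a href='/b'>Bob</a></td>" &
            "<td><h3>Other</h3><a href='/c'>Carol</a></td>" &
            "</tr></table>"

        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        ' Pick every td that has an h3 child whose trimmed text is "Directs",
        ' then step to the a elements that are direct children of that td.
        Dim links As HtmlNodeCollection =
            doc.DocumentNode.SelectNodes("//td[h3[normalize-space()='Directs']]/a")

        ' SelectNodes returns Nothing (not an empty collection) when nothing matches.
        If links IsNot Nothing Then
            For Each link As HtmlNode In links
                Console.WriteLine(link.InnerText)
            Next
        End If
    End Sub
End Module
```

The predicate in square brackets filters the td without changing what is selected, so the H3 lookup and the child loop from the workaround collapse into a single expression. If the links might be nested deeper inside the td, replacing the final /a with //a would widen the match to all descendants.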