How to select specific table cells using HTML Agility Pack
I have to pull particular fields out of cells in an HTML table. Using Firebug, I was able to get the exact XPath to the cells I need (unfortunately, the cells don't have an id attribute). I thought I could use DocumentNode.SelectSingleNode and pass in that path, but it doesn't seem to be working right. What am I doing wrong? Or is there a better approach than the one I'm taking? Unfortunately, I have no experience with XPath, so this is turning out to be harder than I expected. Here's what I have so far (I know the HTML is particularly messy, but that's not in my control to change):
Dim page As New HtmlAgilityPack.HtmlDocument
Dim node As HtmlAgilityPack.HtmlNode
page.LoadHtml(fileContents)
node = page.DocumentNode.SelectSingleNode("/html/body/form/div[6]/table/tbody/tr/td/table/tbody/tr/td/table/tbody/tr/td/table/tbody/tr/td[2]")
Much appreciated.
Comments (1)
Firebug may have repaired broken HTML tags, so the path it reports can differ from the raw markup that HTML Agility Pack parses.
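One common case of this is the tbody element: browsers add an implicit tbody to tables in the DOM even when the source markup has none, while HTML Agility Pack works from the markup as written. The following is a minimal sketch of that mismatch, not a diagnosis of the asker's page. The markup string and the module name are made up for illustration.

```vbnet
Imports System
Imports HtmlAgilityPack

Module TbodyCheck
    Sub Main()
        ' Hypothetical markup: the source contains no <tbody> element.
        Dim html As String = "<html><body><table><tr><td>a</td><td>b</td></tr></table></body></html>"

        Dim page As New HtmlDocument()
        page.LoadHtml(html)

        ' Path as a browser tool would report it, including the tbody step.
        Dim fromBrowser As HtmlNode =
            page.DocumentNode.SelectSingleNode("/html/body/table/tbody/tr/td[2]")

        ' Same path with the tbody step removed, matching the raw markup.
        Dim fromSource As HtmlNode =
            page.DocumentNode.SelectSingleNode("/html/body/table/tr/td[2]")

        ' SelectSingleNode returns Nothing when no node matches.
        Console.WriteLine("Browser path matched: " & (fromBrowser IsNot Nothing).ToString())
        Console.WriteLine("Source path matched: " & (fromSource IsNot Nothing).ToString())
    End Sub
End Module
```

If the copied path returns Nothing, removing the tbody steps is a cheap first thing to try before rewriting the expression.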
If you want to pick an HTML node, it is recommended to use a class or id.
For example, shorten the path and use a class or id selector.
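The original example appears to be missing here, so this is only a rough sketch of what a shortened, class-based path could look like. The markup and the class name "price" are assumptions, not taken from the asker's page.

```vbnet
Imports System
Imports HtmlAgilityPack

Module ShortPath
    Sub Main()
        ' Hypothetical markup; the class name "price" is an assumption.
        Dim html As String =
            "<div><table><tr><td>Label</td><td class=""price"">42</td></tr></table></div>"

        Dim page As New HtmlDocument()
        page.LoadHtml(html)

        ' "//" searches at any depth, so the long chain of ancestors is not needed.
        Dim node As HtmlNode = page.DocumentNode.SelectSingleNode("//td[@class='price']")

        ' Guard against Nothing, which is what SelectSingleNode returns on no match.
        If node IsNot Nothing Then
            Console.WriteLine(node.InnerText.Trim())
        Else
            Console.WriteLine("No matching cell found.")
        End If
    End Sub
End Module
```

A short path like this tends to survive layout changes better than an absolute path that counts every nested table.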
If the table has its own id, you can anchor the XPath on that id.
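The expression the answer had in mind is not preserved, so the following is a sketch under assumptions: the id "results" and the markup are hypothetical. It selects the second cell of the first matching row inside the table with that id.

```vbnet
Imports System
Imports HtmlAgilityPack

Module TableById
    Sub Main()
        ' Hypothetical markup; the id "results" is an assumption.
        Dim html As String =
            "<div><table id=""results""><tr><td>Name</td><td>Alice</td></tr></table></div>"

        Dim page As New HtmlDocument()
        page.LoadHtml(html)

        ' "//tr" below the table matches rows whether or not the source has a tbody.
        Dim xpath As String = "//table[@id='results']//tr/td[2]"
        Dim node As HtmlNode = page.DocumentNode.SelectSingleNode(xpath)

        If node IsNot Nothing Then
            Console.WriteLine(node.InnerText.Trim())
        Else
            Console.WriteLine("No matching cell found.")
        End If
    End Sub
End Module
```

When several rows match, SelectSingleNode returns only the first in document order; SelectNodes with the same expression returns all of them, or Nothing if there are none.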
Try it; you will find that XPath is interesting.