Html Agility Pack: problem retrieving data
I am trying to parse data from the web page http://www.bbb.org/kitchener/accredited-business-directory?letter=a
I want to get all the categories, such as
Accountants - Certified Public (2)
Accounting Services (1)
and so on. The problem is that when I navigate to the node, the a tags are null. I don't know why, but Html Agility Pack does not pick up these tags. In the watch window, the div contains only the commented-out line-break tags and none of the a tags, even though the a tags are present in the page source:
doc.DocumentNode.SelectNodes("//tr/td/table/tr/td/div/div")[0].OuterHtml "<div style=\"font-size: 12px;line-height: 16px;\"><!--<br />-->\r\n<!--<br />-->\r\n</div>"
Here is the start of that div in the page source.
Note: I have included only 2 of the a tags from the HTML.
<div style="float: left; width: 305px;">
<h5 style="margin: 0px; margin-bottom: 5px; border-bottom: 1px solid #cccccc; padding-bottom: 5px; font-size: 12px;">Categories Starting with letter 'a'</h5>
<div style="font-size: 12px;line-height: 16px;">
<!--<br />-->
<!--<br />-->
<a class="listingName" href="/kitchener/accredited-business-directory/accountants">Accountants (11)</a><br />
<a class="listingName" href="/kitchener/accredited-business-directory/accountants-certified-public">Accountants - Certified Public (2)</a><br />
</div>
</div>
How can I get the data?
Even selecting every link with the following does not reveal them:
foreach (var test in doc.DocumentNode.SelectNodes("//a[@href]"))
{ MessageBox.Show(test.InnerText+"\n"+test.InnerHtml); }
Comments (1)
This worked fine for me using the following sample:
Output (shortened):
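The answerer's sample and its output are not reproduced above. As a stand-in, here is a minimal sketch (not the original sample), assuming the HtmlAgilityPack NuGet package is referenced. It loads the HTML fragment quoted in the question from a string, with the h5 style attribute dropped for brevity, and selects the links by their listingName class. The Program class name and the message text are illustrative only.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // HTML fragment taken from the question (only two links included).
        const string html = @"<div style=""float: left; width: 305px;"">
<h5>Categories Starting with letter 'a'</h5>
<div style=""font-size: 12px;line-height: 16px;"">
<!--<br />-->
<!--<br />-->
<a class=""listingName"" href=""/kitchener/accredited-business-directory/accountants"">Accountants (11)</a><br />
<a class=""listingName"" href=""/kitchener/accredited-business-directory/accountants-certified-public"">Accountants - Certified Public (2)</a><br />
</div>
</div>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null, not an empty collection, when nothing matches,
        // so check before iterating to avoid a NullReferenceException.
        var links = doc.DocumentNode.SelectNodes("//a[@class='listingName']");
        if (links == null)
        {
            Console.WriteLine("No matching links found.");
            return;
        }

        foreach (var link in links)
        {
            // Print the link text followed by its href attribute.
            Console.WriteLine(link.InnerText.Trim() + " -> " + link.GetAttributeValue("href", ""));
        }
    }
}
```

If a test like this finds the links in the fragment but not in the live page, the HTML the program downloads may differ from what the browser shows. Writing doc.DocumentNode.OuterHtml to a file and inspecting it is one way to check.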