How do I find a link in an HTML document? (C#)

Posted on 2024-12-11 16:36:56


I have a C# form with a WebBrowser control.
The control contains an HTML document.
There is a link in that document that has no markers (no id and no name).
How can I access this element?

I tried to use this:

webBrowser1.Document.GetElementsByTagName("a")[n]

But this is not very useful, because if a new link is added to the page, the index changes and I have to rebuild the whole program.

I also cannot loop through the document, or take a substring of Document.ToString(), because then I cannot click the link.

It would be great if you could give me some advice.


你是年少的欢喜 2024-12-18 16:36:56

In this kind of situation, the best idea is to find an "anchor", meaning a place in the document that never changes.

Let's say that

<a href="http://site.com">dada</a>

doesn't have an ID or name. The closest you can get is to check whether the parent of the element you're looking for has an ID.

<div id="parentDiv">
      Some text
      Some other stuff
      <a href="http://site.com">The link you're looking for</a>
</div>

That way you can get parentDiv, which you know doesn't change, and then the A tag inside that parent. This should stay stable unless the website completely changes its structure, which is one of the problems with parsing external HTML pages.
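
As a rough illustration of the anchor idea with the WebBrowser control, the sketch below looks up the parent by id and then searches only its descendants for the link. The class name LinkFinder, the method FindLinkInside, and the optional href filter are hypothetical names introduced here; "parentDiv" comes from the sample markup above.

```csharp
using System.Windows.Forms;

static class LinkFinder
{
    // Returns the first <a> inside the element with the given id, or null if none is found.
    // hrefContains is an optional filter (an assumption, not part of the original answer).
    public static HtmlElement FindLinkInside(HtmlDocument document, string parentId, string hrefContains = null)
    {
        HtmlElement parent = document?.GetElementById(parentId);
        if (parent == null)
        {
            return null; // the anchor element is missing, so the page structure has probably changed
        }

        // Search only below the anchor, so links added elsewhere on the page do not shift the result.
        foreach (HtmlElement link in parent.GetElementsByTagName("a"))
        {
            string href = link.GetAttribute("href") ?? string.Empty;
            if (hrefContains == null || href.Contains(hrefContains))
            {
                return link;
            }
        }
        return null;
    }
}
```

Once the page has finished loading (for example in the DocumentCompleted handler), something like LinkFinder.FindLinkInside(webBrowser1.Document, "parentDiv", "site.com")?.InvokeMember("click") should click the link, assuming the element exists.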

Shai.


把人绕傻吧 2024-12-18 16:36:56

You can use Html Agility Pack and select links with XPath:

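 // Requires: using HtmlAgilityPack;
 HtmlWeb htmlWeb = new HtmlWeb();
 HtmlDocument doc = htmlWeb.Load(/* url */);
 // DocumentNode is the document root; SelectNodes returns null when nothing matches.
 HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
 if (links != null)
 {
   foreach (HtmlNode link in links)
   {
     // do stuff
   }
 }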
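
The XPath can also be narrowed to a stable parent so that unrelated links are ignored. The following is a minimal sketch under stated assumptions: the HTML string, the id "parentDiv", and the other.example URL are made-up sample data, and the markup is loaded from a string with LoadHtml instead of a URL so that the example is self-contained.

```csharp
using System;
using HtmlAgilityPack; // third-party NuGet package

static class Program
{
    static void Main()
    {
        // Sample markup (an assumption for illustration only).
        const string html =
            "<div id=\"parentDiv\">Some text <a href=\"http://site.com\">The link you're looking for</a></div>" +
            "<a href=\"http://other.example\">Unrelated link</a>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Select only <a> elements with an href that sit below the element with id="parentDiv".
        HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//div[@id='parentDiv']//a[@href]");
        if (links == null)
        {
            return; // SelectNodes returns null when nothing matches
        }

        foreach (HtmlNode link in links)
        {
            // Print the href attribute, falling back to an empty string if it is absent.
            Console.WriteLine(link.GetAttributeValue("href", string.Empty));
        }
    }
}
```

Note that Html Agility Pack parses its own copy of the markup, so these nodes cannot be clicked. The href found this way would still have to be matched against the elements in the WebBrowser control, or passed to webBrowser1.Navigate.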
