How do I find a link in an HTML document? (C#)
I have a C# form with a WebBrowser control.
The control holds an HTML document.
That document contains a link with no markers (no id and no name).
How can I access this element?
I tried this:
webBrowser1.Document.GetElementsByTagName("a")[n]
But that is not very useful: if a new link is added to the page, the index changes and I have to rebuild the whole program.
I also cannot loop through the document or take a substring of Document.ToString(), because then I cannot click the link.
It would be great if you could give me some advice.
Comments (3)
In this kind of situation, the best idea is always to find an "anchor", meaning a place in the document that never changes.
Let's say the link doesn't have an ID or name, so the closest you can get is to check whether the parent of the element you're looking for has an ID.
That way you can get the parentDiv, which you know doesn't change, and then the A tag inside that parent. This should stay stable unless the website completely changes its structure, which is one of the problems with parsing external HTML pages.
Shai.
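A minimal sketch of the parent-anchor idea, assuming the page has finished loading (for example inside the DocumentCompleted handler) and that the link's container has the id "parentDiv". The class name, method names, and the id are hypothetical and only illustrate the approach:

```csharp
using System.Windows.Forms;

static class ParentAnchorLinks
{
    // Returns the first <a> inside the element with the given id,
    // or null when the parent or the link cannot be found.
    public static HtmlElement FindLinkUnderParent(WebBrowser browser, string parentId)
    {
        if (browser.Document == null) return null;

        HtmlElement parent = browser.Document.GetElementById(parentId);
        if (parent == null) return null;

        // Search only inside the stable parent, not the whole page,
        // so links added elsewhere do not shift the result.
        HtmlElementCollection links = parent.GetElementsByTagName("a");
        return links.Count > 0 ? links[0] : null;
    }

    // Simulates a user click on the link, if it was found.
    public static bool ClickLinkUnderParent(WebBrowser browser, string parentId)
    {
        HtmlElement link = FindLinkUnderParent(browser, parentId);
        if (link == null) return false;
        link.InvokeMember("click");
        return true;
    }
}
```

It could be called as ParentAnchorLinks.ClickLinkUnderParent(webBrowser1, "parentDiv"). If the parent holds several links, the index inside the parent still matters, but it is far less likely to shift than an index over the whole page.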
You can use Html Agility Pack and select links by XPath.
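A rough sketch of the XPath route, assuming the HtmlAgilityPack NuGet package is referenced. Html Agility Pack parses its own copy of the markup, so it cannot click anything in the WebBrowser; this sketch instead reads the link's href and navigates to it. The class name, method name, and the XPath expression in the usage note are hypothetical:

```csharp
using System;
using System.Windows.Forms;

static class XPathLinkNavigator
{
    // Finds the first node matching the XPath expression and navigates
    // the browser to its href. Returns false if nothing usable is found.
    public static bool FollowLink(WebBrowser browser, string xpath)
    {
        if (browser.Url == null) return false;

        // Fully qualified to avoid a clash with System.Windows.Forms.HtmlDocument.
        var doc = new HtmlAgilityPack.HtmlDocument();
        doc.LoadHtml(browser.DocumentText);

        HtmlAgilityPack.HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
        if (node == null) return false;

        // Attribute values may contain entities such as &amp;.
        string href = HtmlAgilityPack.HtmlEntity.DeEntitize(
            node.GetAttributeValue("href", ""));
        if (string.IsNullOrEmpty(href)) return false;

        // Resolve relative links against the current page address.
        browser.Navigate(new Uri(browser.Url, href));
        return true;
    }
}
```

For example, XPathLinkNavigator.FollowLink(webBrowser1, "//div[@id='parentDiv']//a") would follow the first link under that div. Note that this does not help when the link relies on a script click handler rather than a plain href; in that case clicking the element through the WebBrowser document is the safer choice.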
You need some information that identifies the link. It may be the id, the name, or the text. If the text is always the same, check the inner text of each link.
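One way this could look, as a sketch that assumes the link's visible text is stable and the page has finished loading. Looping over the HtmlElement objects of the live document (rather than over a string copy of the HTML) keeps the link clickable. The class and method names are hypothetical:

```csharp
using System;
using System.Windows.Forms;

static class LinkTextClicker
{
    // Clicks the first <a> whose visible text equals the given text
    // (ignoring case and surrounding whitespace). Returns false if none matches.
    public static bool ClickLinkByText(WebBrowser browser, string text)
    {
        if (browser.Document == null) return false;

        foreach (HtmlElement link in browser.Document.GetElementsByTagName("a"))
        {
            // InnerText can be null, e.g. for links that only wrap an image.
            string inner = link.InnerText;
            if (inner != null &&
                string.Equals(inner.Trim(), text, StringComparison.OrdinalIgnoreCase))
            {
                link.InvokeMember("click");
                return true;
            }
        }
        return false;
    }
}
```

The same loop could compare link.GetAttribute("href") instead of the inner text if the target address is the more stable piece of information.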