Why is HTML Agility Pack HtmlDocument.DocumentNode null?
I'm using this code to change the href attribute of the links in an HTML stream.
First, I download the full HTML page with this code (URL is the web page address):
    HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create(URL);
    HttpWebResponse myHttpWebResponse =
        (HttpWebResponse)myHttpWebRequest.GetResponse();
    Stream s = myHttpWebResponse.GetResponseStream();
Then I process it:
    HtmlDocument doc = new HtmlDocument();
    doc.Load(s);
    foreach (HtmlNode link in doc.DocumentNode.SelectNodes("/a"))
    {
        string att = link.Attributes["href"].Value;
        link.Attributes["href"].Value = "http://ahmadalli.somee.com/default.aspx?url=" + att;
    }
    doc.Save(s);
Here s is the HTML stream.
But I get an exception that says doc.DocumentNode is null!
I tried many sites, but doc.DocumentNode is null for all of them too.
Answers (5)
This works for me.
Also see HttpUtility.UrlEncode so that the original URL can be recovered correctly. Otherwise, some parameters in the original URL may cause problems. Use HttpUtility.UrlDecode to decode it.
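
To illustrate the encoding advice above, here is a minimal sketch of wrapping a URL in the proxy address from the question and recovering it again. The class and method names (ProxyUrl, Wrap, Unwrap) are hypothetical and only for illustration; HttpUtility lives in System.Web.

```csharp
using System;
using System.Web; // HttpUtility.UrlEncode / HttpUtility.UrlDecode

public static class ProxyUrl
{
    // Proxy prefix copied from the question.
    private const string Prefix = "http://ahmadalli.somee.com/default.aspx?url=";

    // Encode the original URL so its own '?', '&' and '=' characters
    // are not mistaken for parameters of the proxy page.
    public static string Wrap(string originalUrl)
    {
        return Prefix + HttpUtility.UrlEncode(originalUrl);
    }

    // Reverse of Wrap: strip the prefix and decode the remainder.
    public static string Unwrap(string wrappedUrl)
    {
        if (!wrappedUrl.StartsWith(Prefix, StringComparison.Ordinal))
            throw new ArgumentException("Not a wrapped URL.", nameof(wrappedUrl));
        return HttpUtility.UrlDecode(wrappedUrl.Substring(Prefix.Length));
    }

    public static void Main()
    {
        string original = "http://example.com/search?q=html agility&page=2";
        string wrapped = Wrap(original);
        Console.WriteLine(wrapped);
        Console.WriteLine(Unwrap(wrapped) == original);
    }
}
```

Without the encoding step, the `&page=2` part of the example URL would be read as a second parameter of default.aspx rather than as part of the url value.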
Try using //a instead of /a. In XPath, //a basically means "give me all the links in the document", whereas /a means "give me all the links that are direct children of the document root".
Update:
The following code works fine:
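
The snippet this answer refers to is not included in this copy of the thread. As a rough, independent sketch (not the answerer's code) of the difference between the two expressions, the following counts the matches for each XPath on a small inline document. It assumes the HtmlAgilityPack NuGet package is referenced; XPathDemo and CountMatches are hypothetical names.

```csharp
using System;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class XPathDemo
{
    // Returns how many nodes the XPath selects, treating "no match" as 0.
    public static int CountMatches(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes yields null rather than an empty collection
        // when the expression matches nothing.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        return nodes == null ? 0 : nodes.Count;
    }

    public static void Main()
    {
        const string html =
            "<html><body><p><a href=\"/one\">one</a></p>" +
            "<a href=\"/two\">two</a></body></html>";

        // "/a" only looks for <a> directly under the document root.
        Console.WriteLine("/a  -> " + CountMatches(html, "/a"));
        // "//a" looks for <a> at any depth.
        Console.WriteLine("//a -> " + CountMatches(html, "//a"));
    }
}
```

In the question's code, the null returned for "/a" is what the foreach loop trips over, which would explain the exception even though DocumentNode itself is populated.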
Here is your answer: HTML Agility Pack Null Reference.
Try using the code below:
I am using HTML Agility Pack version 1.4.0.
Did this solve your problem? If not, please comment; otherwise, mark it as the answer.
The anchor tag reference is an incorrectly escaped string:
The original code fails to select any nodes, so SelectNodes evaluates to null. You should check for this to avoid failing on, say, a document that contains no links at all (however unlikely that is :)
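
The null check described above could look roughly like the sketch below. It is an assumption-laden illustration, not the answerer's code: LinkRewriter and RewriteLinks are hypothetical names, the HTML is passed in as a string instead of being downloaded, and the result is written to a StringWriter because the stream returned by GetResponseStream is read-only and cannot be the target of doc.Save.

```csharp
using System;
using System.IO;
using System.Web;      // HttpUtility.UrlEncode
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class LinkRewriter
{
    // Rewrites every href so that it points at proxyPrefix + encoded original URL.
    public static string RewriteLinks(string html, string proxyPrefix)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Guard against null: SelectNodes returns null when no node matches.
        HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
        if (links != null)
        {
            foreach (HtmlNode link in links)
            {
                string original = link.GetAttributeValue("href", string.Empty);
                link.SetAttributeValue("href", proxyPrefix + HttpUtility.UrlEncode(original));
            }
        }

        // Save to a separate writer rather than back into the input stream.
        using (var writer = new StringWriter())
        {
            doc.Save(writer);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        const string prefix = "http://ahmadalli.somee.com/default.aspx?url=";
        Console.WriteLine(RewriteLinks("<p>no links here</p>", prefix));
        Console.WriteLine(RewriteLinks("<a href=\"http://example.com/\">x</a>", prefix));
    }
}
```

The `//a[@href]` expression also skips anchors that have no href attribute, which avoids a second null reference on `link.Attributes["href"]`.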