Using HtmlAgilityPack to select all <a> elements from a node's children
I've got the following code that I'm using to get an HTML page, make the URLs absolute, and then give the links rel="nofollow" and make them open in a new window/tab. My issue is with adding the attributes to the <a> elements.
string url = "http://www.mysite.com/";
string strResult = "";
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
if ((request.HaveResponse) && (response.StatusCode == HttpStatusCode.OK)) {
    using (StreamReader sr = new StreamReader(response.GetResponseStream())) {
        strResult = sr.ReadToEnd();
        sr.Close();
    }
}
HtmlDocument ContentHTML = new HtmlDocument();
ContentHTML.LoadHtml(strResult);
HtmlNode ContentNode = ContentHTML.GetElementbyId("content");
foreach (HtmlNode node in ContentNode.SelectNodes("/a")) {
    node.Attributes.Append("rel", "nofollow");
    node.Attributes.Append("target", "_blank");
}
return ContentNode.WriteTo();
Can anyone see what I'm doing wrong? I've been trying for a while with no luck. The code fails with an error saying that ContentNode.SelectNodes("/a") isn't set to an instance of an object. I thought about trying to set the stream position to 0?
Cheers,
Denis
Is ContentNode null? You might need to select a single node (SelectSingleNode) with the query "//*[@id='content']". For info, "/a" means all anchors at the root. Does "descendant::a" work? There is also HtmlElement.GetElementsByTagName, which might be easier, i.e. yourElement.GetElementsByTagName("a").
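The suggestions above can be put together roughly as follows. This is only a sketch, not the asker's final code: the inline HTML string is a made-up stand-in for the downloaded page, and it assumes the HtmlAgilityPack package is referenced. It looks up the container with SelectSingleNode, uses an XPath that is relative to that node (".//a", equivalent in effect to "descendant::a" here), and guards against SelectNodes returning null when nothing matches.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical sample markup standing in for the downloaded page.
        string html =
            "<html><body>" +
            "<div id='content'><p><a href='/one'>One</a></p><a href='/two'>Two</a></div>" +
            "<a href='/outside'>Outside</a>" +
            "</body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Find the container by id via XPath, as suggested in the answer.
        HtmlNode contentNode = doc.DocumentNode.SelectSingleNode("//*[@id='content']");
        if (contentNode == null)
        {
            Console.WriteLine("No element with id 'content' was found.");
            return;
        }

        // ".//a" is evaluated relative to contentNode.
        // "/a" would instead look for <a> directly under the document root.
        HtmlNodeCollection links = contentNode.SelectNodes(".//a");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        if (links != null)
        {
            foreach (HtmlNode link in links)
            {
                // SetAttributeValue adds the attribute or overwrites an existing one.
                link.SetAttributeValue("rel", "nofollow");
                link.SetAttributeValue("target", "_blank");
            }
        }

        Console.WriteLine(contentNode.OuterHtml);
    }
}
```

If XPath is not needed, HtmlNode also exposes Descendants("a"), which enumerates matching descendants and yields an empty sequence rather than null when there are none.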