Removing elements by class name with HtmlAgilityPack in C#
I'm using the HTML Agility Pack to read the contents of my HTML document into a string. Once that is done, I would like to remove certain elements from that content by their class, but I have run into a problem.
My HTML looks like this:
<div id="wrapper">
<div class="maincolumn" >
<div class="breadCrumbContainer">
<div class="breadCrumbs">
</div>
</div>
<div class="seo_list">
<div class="seo_head">Header</div>
</div>
Content goes here...
</div>
</div>
Now, I have used an XPath selector to get all the content within the <div id="wrapper"> element and read its InnerHtml property like so:
node = doc.DocumentNode.SelectSingleNode("//div[@id='wrapper']");
if (node != null)
{
pageContent = node.InnerHtml;
}
From this point, I would like to remove the div with the class "breadCrumbContainer". However, when I use the code below, I get the error: "Node "" was not found in the collection"
node = doc.DocumentNode.SelectSingleNode("//div[@id='wrapper']");
node = node.RemoveChild(node.SelectSingleNode("//div[@class='breadCrumbContainer']"));
if (node != null)
{
pageContent = node.InnerHtml;
}
Can anyone shed some light on this please? I'm quite new to XPath, and really new to the HtmlAgilityPack library.
Thanks,
Dave
Comments (2)
It's because RemoveChild can only remove a direct child, not a grandchild. Try this instead:
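A minimal sketch of that idea, assuming the markup from the question (condensed, with the wrapper div closed) and the HtmlAgilityPack NuGet package. The node is removed through its own ParentNode rather than through the wrapper, and the relative ".//" XPath keeps the search below the wrapper. The variable names are illustrative only.

```csharp
using System;
using HtmlAgilityPack;

// Markup condensed from the question; the closing wrapper div is an assumption.
const string html = @"
<div id=""wrapper"">
  <div class=""maincolumn"">
    <div class=""breadCrumbContainer""><div class=""breadCrumbs""></div></div>
    <div class=""seo_list""><div class=""seo_head"">Header</div></div>
    Content goes here...
  </div>
</div>";

var doc = new HtmlDocument();
doc.LoadHtml(html);

HtmlNode wrapper = doc.DocumentNode.SelectSingleNode("//div[@id='wrapper']");
if (wrapper == null)
{
    throw new InvalidOperationException("wrapper div not found");
}

// ".//" searches only below wrapper; a leading "//" searches the whole document.
HtmlNode crumbs = wrapper.SelectSingleNode(".//div[@class='breadCrumbContainer']");

if (crumbs != null)
{
    // The breadcrumb div is a child of maincolumn, not of wrapper,
    // so the removal goes through the node's actual parent.
    crumbs.ParentNode.RemoveChild(crumbs);
}

// Read from wrapper itself; RemoveChild returns the node that was removed.
string pageContent = wrapper.InnerHtml;
Console.WriteLine(pageContent);
```

Note that `@class='breadCrumbContainer'` only matches when the attribute value is exactly that string; an element carrying several classes would need a different predicate.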
This is a super-simple task for XSLT.
When the transformation is applied to the provided XML document (with another <div> added and the whole thing wrapped in an <html> top element to make it more challenging and realistic), it produces the wanted, correct result.
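As a hedged illustration of the XSLT route, the sketch below runs a stylesheet from C# with XslCompiledTransform. The stylesheet is an assumption, not the one from this answer: it uses the common identity-transform pattern plus one empty template that swallows the breadcrumb div. The input document is also hypothetical, modelled on the question's markup, and it has to be well-formed XML for this to work.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Assumed stylesheet: copy everything through unchanged, except div elements
// whose class attribute is exactly "breadCrumbContainer", which produce no output.
const string xslt = @"
<xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:output method=""xml"" omit-xml-declaration=""yes""/>
  <xsl:template match=""node()|@*"">
    <xsl:copy>
      <xsl:apply-templates select=""node()|@*""/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match=""div[@class='breadCrumbContainer']""/>
</xsl:stylesheet>";

// Hypothetical well-formed input based on the question's markup.
const string input =
    @"<html><div id=""wrapper""><div class=""maincolumn"">" +
    @"<div class=""breadCrumbContainer""><div class=""breadCrumbs""/></div>" +
    @"<div class=""seo_list""><div class=""seo_head"">Header</div></div>" +
    @"Content goes here...</div></div></html>";

var transform = new XslCompiledTransform();
using (var styleReader = XmlReader.Create(new StringReader(xslt)))
{
    transform.Load(styleReader);
}

var output = new StringWriter();
using (var inputReader = XmlReader.Create(new StringReader(input)))
{
    // No stylesheet parameters are needed, so the argument list is null.
    transform.Transform(inputReader, null, output);
}

string result = output.ToString();
Console.WriteLine(result);
```

Real-world HTML is often not well-formed XML, in which case the input would need to be cleaned up first before a transform like this can read it.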