How to comment out all script tags in an HTML document using HTML Agility Pack
I would like to comment out all script tags in an HtmlDocument. That way, when I render the document, the scripts are not executed, but we can still see what was there. Unfortunately, my current approach is failing:
foreach (var scriptTag in htmlDocument.DocumentNode.SelectNodes("//script"))
{
    var commentedScript = new HtmlNode(HtmlNodeType.Comment, htmlDocument, 0) { InnerHtml = scriptTag.ToString() };
    scriptTag.ParentNode.AppendChild(commentedScript);
    scriptTag.Remove();
}
Note that I could do this with plain string replacement on the HTML, but I do not think it would be as robust:
domHtml = domHtml.Replace("<script", "<!-- <script");
domHtml = domHtml.Replace("</script>", "</script> -->");
Comments (2)
Try this:
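The following is only a sketch of one way to do what the question asks, not the answerer's own code. It assumes three things about the question's snippet: `scriptTag.ToString()` does not yield the tag's markup (so `OuterHtml` is used instead), `AppendChild` moves the comment to the end of the parent (so `ReplaceChild` is used to keep the position), and `SelectNodes` returns null when nothing matches. The class name `ScriptCommenter` and the handling of an embedded `-->` are illustrative choices.

```csharp
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper; the name and signature are not from the original thread.
public static class ScriptCommenter
{
    public static string CommentOutScripts(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null, not an empty collection, when there is no match.
        var scripts = doc.DocumentNode.SelectNodes("//script");
        if (scripts == null)
        {
            return doc.DocumentNode.OuterHtml;
        }

        // Snapshot the matches so that replacing nodes cannot disturb the loop.
        foreach (var script in scripts.ToList())
        {
            // Assumption: a literal "-->" inside the script would end the comment
            // early, so it is broken up here. This alters the preserved text slightly.
            var markup = script.OuterHtml.Replace("-->", "-- >");

            // Parsing the comment markup yields a comment node that takes the
            // script's place in the tree.
            var comment = HtmlNode.CreateNode("<!--" + markup + "-->");
            script.ParentNode.ReplaceChild(comment, script);
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

Calling `ScriptCommenter.CommentOutScripts(domHtml)` returns the document markup with each script element wrapped in a comment at its original position.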
See this SO post for a very clean solution that uses the LINQ query support of the HTML Agility Pack:
htmlagilitypack - remove script and style?
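The linked post is about removing script and style elements. As a rough sketch, the same LINQ-style traversal could be adapted to comment scripts out instead of deleting them. The class name `LinqScriptCommenter` is hypothetical, and the sketch assumes the scripts contain no literal `-->`.

```csharp
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper; not taken from the linked post.
public static class LinqScriptCommenter
{
    public static void CommentOutScripts(HtmlDocument doc)
    {
        // Descendants is evaluated lazily, so take a snapshot before changing the tree.
        var scripts = doc.DocumentNode.Descendants("script").ToList();

        foreach (var script in scripts)
        {
            // Assumes the script body contains no "-->", which would end the comment early.
            var comment = HtmlNode.CreateNode("<!--" + script.OuterHtml + "-->");
            script.ParentNode.ReplaceChild(comment, script);
        }
    }
}
```

Unlike the XPath call in the question, `Descendants` yields an empty sequence rather than null when the document has no script elements, so no null check is needed.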