HTML sanitizer for .NET

Posted on 2024-07-09 15:40:00

I'm starting a public-facing project using ASP.NET MVC. I know there are about a billion PHP, Python, and Ruby HTML sanitizers out there, but does anyone have pointers to anything good in .NET? What are your experiences with what is out there? I know Stack Overflow is a site built with ASP.NET that allows free-form HTML; what does it use?

Comments (6)

墨洒年华 2024-07-16 15:40:00

HtmlSanitizer

Source: https://github.com/mganss/HtmlSanitizer

A fairly robust sanitizer. It understands and can clean inline styles, but doesn't have a parser that can deal with <style> blocks, so it strips them. It's certainly up to and probably beyond the level that Microsoft's AntiXSS was at, before it was abandoned.

等风来 2024-07-16 15:40:00

HtmlRuleSanitizer

Based on your question I have the following suggestions:

  • You want to allow free-form HTML, so you need a solution that lets you specify which tags, attributes and/or CSS classes are allowed.
  • By allowing free-form HTML, you will likely also have to deal with malformed HTML, because users make errors (deliberate or not). You thus need a solution built on a tolerant parser such as the Html Agility Pack.
  • You'll want to take a whitelisting approach, because a blacklisting sanitizer does not protect you from anything introduced by new HTML specifications. In addition, given the size of the HTML specification, it is very hard to guarantee that a blacklist covers all cases in the first place.

I faced the same problem and built HtmlRuleSanitizer, a whitelist rule-based HTML sanitizer on top of the Html Agility Pack.
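
To make the whitelisting idea concrete, here is a minimal sketch in C#. It is not HtmlRuleSanitizer's API or implementation. The class name WhitelistSketch, the rule table, and the IsSafeUrl helper are all hypothetical. To stay runnable with only the base class library it uses System.Xml.Linq as a stand-in parser, so it only accepts well-formed markup and throws on anything else. That is exactly the limitation the second point warns about; a real sanitizer would put the same rules on top of a tolerant parser such as the Html Agility Pack.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Illustrative only: a whitelist filter over WELL-FORMED markup.
// Not suitable for production input.
public static class WhitelistSketch
{
    // Hypothetical rule set: allowed tag name -> attributes allowed on it.
    private static readonly Dictionary<string, HashSet<string>> Allowed =
        new Dictionary<string, HashSet<string>>(StringComparer.OrdinalIgnoreCase)
        {
            ["p"] = new HashSet<string>(),
            ["b"] = new HashSet<string>(),
            ["i"] = new HashSet<string>(),
            ["a"] = new HashSet<string> { "href" },
        };

    // Accept only absolute http/https URLs (rejects javascript:, data:, etc.).
    private static bool IsSafeUrl(string value)
    {
        return Uri.TryCreate(value, UriKind.Absolute, out var uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
    }

    public static string Sanitize(string wellFormedFragment)
    {
        // Wrap the fragment so that several top-level nodes parse as one document.
        var root = XElement.Parse("<root>" + wellFormedFragment + "</root>");

        // Snapshot the elements first, because the tree is modified while iterating.
        foreach (var element in root.Descendants().ToList())
        {
            if (!Allowed.TryGetValue(element.Name.LocalName, out var allowedAttributes))
            {
                // Unknown tag: drop it together with everything inside it.
                element.Remove();
                continue;
            }

            foreach (var attribute in element.Attributes().ToList())
            {
                var name = attribute.Name.LocalName;
                var keep = allowedAttributes.Contains(name)
                    && (name != "href" || IsSafeUrl(attribute.Value));
                if (!keep)
                {
                    attribute.Remove();
                }
            }
        }

        // Serialize the children of the wrapper, not the wrapper itself.
        return string.Concat(
            root.Nodes().Select(node => node.ToString(SaveOptions.DisableFormatting)));
    }
}
```

Calling WhitelistSketch.Sanitize on a paragraph that carries an onclick handler, a script element and a javascript: link keeps the paragraph and the link text, and drops the handler, the script and the href. Nothing in the code names those dangerous constructs; they disappear simply because they are not on the list, which is why a whitelist stays safe when the HTML specification grows.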

屌丝范 2024-07-16 15:40:00

There is a C# version here.

一直在等你来 2024-07-16 15:40:00

Here is one built by Microsoft: http://wpl.codeplex.com/

var cleanHtml = Sanitizer.GetSafeHtml(unsafeHtml);

dawn曙光 2024-07-16 15:40:00

We can also use

AntiXss.GetSafeHtmlFragment (called as Sanitizer.GetSafeHtmlFragment in the sample below)

It sanitizes input by parsing the HTML fragment. Use this sanitizer for rich content to ensure that it does not contain any harmful script and is safe to display in the browser. For plain-text input (not rich content), use AntiXss.HtmlEncode or any other equivalent HTML encoder. Here is a sample for rich content.

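    using System;
    using Microsoft.Security.Application;
    // Deliberately malformed markup carrying inline event handlers.
    string mal = "<IMG NAME = 'myPic' SRC = 'images / myPic.gif' onerror='alert(1)' onerror='alert(1) ><div bottommargin = 150 ondblclick = 'alert('double clicked!')' >< p > Double - click anywhere in the page.</p> </div> ";
    // Sanitize the fragment before it is rendered.
    var cleanHtml = Sanitizer.GetSafeHtmlFragment(mal);
    Console.Write(cleanHtml);
    Console.Read(); // keep the console window open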

Note: Install the AntiXSS library from the NuGet package manager and include the namespace Microsoft.Security.Application in the source code.
