.NET HTML whitelisting (anti-XSS / cross-site scripting)

Posted on 2024-07-30 05:14:12

I've got the common situation where I've got user input that uses a subset of HTML (input with tinyMCE). I need to have some server-side protection against XSS attacks and am looking for a well-tested tool that people are using to do this. On the PHP side I'm seeing lots of libraries like HTMLPurifier that do the job, but I can't seem to find anything in .NET.

I'm basically looking for a library that filters input down to a whitelist of tags and of attributes on those tags, and that does the right thing with "difficult" attributes like a:href and img:src.

I've seen Jeff Atwood's post at http://refactormycode.com/codes/333-sanitize-html, but I don't know how up-to-date it is. Does it have any bearing at all to what the site is currently using? And in any case I'm not sure I'm comfortable with that strategy of trying to regexp out valid input.

This blog post lays out what seems to be a much more compelling strategy:

http://blog.bvsoftware.com/post/2009/01/08/How-to-filter-Html-Input-to-Prevent-Cross-Site-Scripting-but-Still-Allow-Design.aspx

This method is to actually parse the HTML into a DOM, validate that, then rebuild valid HTML from it. If the HTML parsing can handle malformed HTML sensibly, then great. If not, no big deal -- I can demand well-formed HTML since the users should be using the tinyMCE editor. In either case I'm rewriting what I know is safe, well-formed HTML.

The problem is that's just a description, without a link to any library that actually executes that algorithm.
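
As a rough illustration of the parse, validate, rebuild idea (this is not the blog post's code and not a .NET library), the sketch below uses Python's standard-library `html.parser` to re-emit only whitelisted tags and attributes. The `ALLOWED` table, the `DROP_CONTENT` set, and the `sanitize` helper are hypothetical names chosen for this example. URL-valued attributes such as href and src are deliberately left out of the whitelist because they need extra validation, and the sketch does not rebalance unclosed tags.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: tag name -> set of permitted attribute names.
# URL attributes (href, src) are intentionally absent; they need a scheme check.
ALLOWED = {
    "p": set(),
    "b": set(),
    "i": set(),
    "a": {"title"},
}

# Tags whose text content should be discarded along with the tag itself.
DROP_CONTENT = {"script", "style"}


class WhitelistRebuilder(HTMLParser):
    """Rebuilds HTML from parser events, keeping only whitelisted markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self._skip = 0  # > 0 while inside a DROP_CONTENT element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self._skip += 1
            return
        if tag not in ALLOWED:
            return  # unknown tag: drop the tag, keep its text content
        kept = [(k, v) for k, v in attrs if k in ALLOWED[tag] and v is not None]
        rendered = "".join(f' {k}="{escape(v, quote=True)}"' for k, v in kept)
        self.out.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            if self._skip > 0:
                self._skip -= 1
            return
        if tag in ALLOWED:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self._skip == 0:
            # Text is always re-escaped, so it can never turn back into markup.
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = WhitelistRebuilder()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Because the output is rebuilt from parser events rather than edited in place, anything the whitelist does not name (event-handler attributes, comments, unknown tags) simply never gets written. A .NET version would follow the same shape on top of an HTML parser for that platform.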

Does such a library exist? If not, what would be a good .NET HTML parsing engine? And what regular expressions should be used to perform extra validation on a:href and img:src? Am I missing something else important here?
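
On the a:href / img:src point, one common alternative to a large regular expression is to extract the URL scheme and compare it against a short allowlist. The sketch below is a minimal, conservative version of that idea in Python; `ALLOWED_SCHEMES` and `is_safe_url` are hypothetical names, the scheme list is an assumption, and the value is assumed to have already been entity-decoded by the HTML parser.

```python
import re

# Assumed allowlist; adjust to what the application actually needs.
ALLOWED_SCHEMES = {"http", "https", "mailto"}


def is_safe_url(value):
    """Return True if value is a relative URL or uses an allowed scheme."""
    # Browsers ignore whitespace and control characters in places where a
    # filter might not expect them, so remove them before inspecting.
    cleaned = "".join(ch for ch in value if ch > " " and ch != "\x7f")

    # Everything before the first '/', '?' or '#' is where a scheme would be.
    head = re.split(r"[/?#]", cleaned, maxsplit=1)[0]
    if ":" not in head:
        return True  # no scheme at all: treat as a relative URL

    scheme = head.split(":", 1)[0].lower()
    return scheme in ALLOWED_SCHEMES
```

For example, `is_safe_url("javascript:alert(1)")` and `is_safe_url("data:text/html,...")` are rejected, while `is_safe_url("/images/a.png")` and `is_safe_url("https://example.com/a.png")` are accepted. Checking against a list of allowed schemes, rather than a list of forbidden ones, is the same whitelist principle applied to URLs.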

I don't want to re-implement a buggy wheel here. Surely there are some commonly used libraries out there. Any ideas?

Comments (6)

空心空情空意 2024-08-06 05:14:12

Well, if you want to parse and you're worried about invalid (X)HTML coming in, then the HTML Agility Pack is probably the best thing to use for parsing. Remember, though, that it's not just elements you need to allow but also the attributes on those elements. (Of course, you should work from a whitelist of allowed elements and their attributes rather than try to strip things that might be dodgy via a blacklist.)

There's also the OWASP AntiSamy Project, which is an ongoing work in progress. They also have a test site you can try to XSS.

Regex for this is probably too risky IMO.
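
To make the risk concrete, here is a small Python sketch of a blacklist-style regex filter and an input that defeats it. The `naive_strip_scripts` function is a hypothetical example written for this illustration, not code from any of the libraries mentioned.

```python
import re


def naive_strip_scripts(html_text):
    # Blacklist approach: delete anything that looks like a <script> tag.
    return re.sub(r"</?script[^>]*>", "", html_text, flags=re.IGNORECASE)


# The inner tags are removed, and the leftover pieces join up into the
# very tag the filter was supposed to eliminate.
payload = "<scr<script>ipt>alert(1)</scr</script>ipt>"
result = naive_strip_scripts(payload)

# Markup the blacklist never anticipated passes through untouched.
untouched = naive_strip_scripts("<img src=x onerror=alert(1)>")
```

After a single pass, `result` is `<script>alert(1)</script>`, and the `onerror` handler in the second input survives unchanged. A parser-based whitelist avoids both problems because it only ever writes out markup it has explicitly approved.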

空心空情空意 2024-08-06 05:14:12

Microsoft has an open-source library to protect against XSS: AntiXSS.

空‖城人不在 2024-08-06 05:14:12

http://www.microsoft.com/en-us/download/details.aspx?id=28589
You can download a version here, but I linked it for the useful DOCX file. My preferred method is to use the NuGet package manager to get the latest AntiXSS package.

You can use the HtmlSanitizationLibrary assembly found in the 4.x AntiXss library. Note that GetSafeHtml() is in the HtmlSanitizationLibrary, under Microsoft.Security.Application.Sanitizer.

咿呀咿呀哟 2024-08-06 05:14:12

We are using the HtmlSanitizer .Net library, which:

Also on NuGet

萌︼了一个春 2024-08-06 05:14:12

https://github.com/Vereyon/HtmlRuleSanitizer exactly solves this problem.

I had this challenge when integrating the wysihtml5 editor in an ASP.NET MVC application. I noted that it had a very nice yet simple whitelist-based sanitizer which used rules to allow a subset of HTML to pass through. I implemented a server-side version of it which depends on the HTML Agility Pack for parsing.

Microsoft Web Protection Library (formerly AntiXSS) seems to simply rip out almost all HTML tags, and from what I read you cannot easily tailor the rules to the HTML subset you want to use. So that was not an option for me.

This HTML sanitizer also looks very promising and would be my second choice.

薆情海 2024-08-06 05:14:12

I had the exact same problem a few years back when I was using TinyMCE.

There still don't seem to be any decent XSS / HTML white-listing solutions for .NET, so I've uploaded a solution I created and have been using for a few years.

http://www.codeproject.com/KB/aspnet/html-white-listing.aspx

The white list definition is based on TinyMCE's valid-elements.

Take Two:
Looking around, Microsoft have recently released a white-list based Anti-XSS Library (V3.0), check that out:

The Microsoft Anti-Cross Site Scripting Library V3.0 (Anti-XSS V3.0) is an encoding library designed to help developers protect their ASP.NET web-based applications from XSS attacks. It differs from most encoding libraries in that it uses the white-listing technique -- sometimes referred to as the principle of inclusions -- to provide protection against XSS attacks. This approach works by first defining a valid or allowable set of characters, and encodes anything outside this set (invalid characters or potential attacks). The white-listing approach provides several advantages over other encoding schemes. New features in this version of the Microsoft Anti-Cross Site Scripting Library include:

- An expanded white list that supports more languages
- Performance improvements
- Performance data sheets (in the online help)
- Support for Shift_JIS encoding for mobile browsers
- A sample application
- Security Runtime Engine (SRE) HTTP module
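
The "define an allowable set of characters, encode everything else" principle in that description can be sketched in a few lines of Python. This only illustrates the idea; `SAFE_CHARS` and `whitelist_encode` are hypothetical names, and the character set here is an assumption, not the set the Anti-XSS library actually uses.

```python
import string

# Assumed safe set for illustration: ASCII letters, digits, and a little punctuation.
SAFE_CHARS = set(string.ascii_letters + string.digits + " .,-_")


def whitelist_encode(text):
    """Encode every character outside SAFE_CHARS as a numeric HTML entity."""
    return "".join(
        ch if ch in SAFE_CHARS else f"&#{ord(ch)};"
        for ch in text
    )
```

With this set, `whitelist_encode("<script>")` produces `&#60;script&#62;`, while plain text such as `Hello, world` passes through unchanged. Note that this is output encoding: it neutralizes all markup, so on its own it does not let a chosen subset of HTML through the way a tag whitelist does.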
