Whitelisting, preventing XSS with the WMD control in C#
Are there any problems with what I am doing here? This is my first time dealing with something like this, and I just want to make sure I understand all the risks of the different methods.
I am using WMD to get user input, and I am displaying it with a Literal control.
Since the content cannot be edited once entered, I will store the HTML rather than the Markdown:
    string input = Server.HtmlEncode(stringThatComesFromWMDTextArea);
I then run something like the following for the tags I want users to be able to use.
    // Unescape whitelisted tags.
    string output = input.Replace("&lt;b&gt;", "<b>").Replace("&lt;/b&gt;", "</b>")
                         .Replace("&lt;i&gt;", "<i>").Replace("&lt;/i&gt;", "</i>");
Edit: Here is what I am doing currently:
    // Needs: using System.Web; (for HttpUtility)
    public static string EncodeAndWhitelist(string html)
    {
        string[] whiteList = { "b", "i", "strong", "img", "ul", "li" };
        // Encode everything first, then restore only the bare whitelisted tags.
        string encodedHTML = HttpUtility.HtmlEncode(html);
        foreach (string wl in whiteList)
            encodedHTML = encodedHTML.Replace("&lt;" + wl + "&gt;", "<" + wl + ">").Replace("&lt;/" + wl + "&gt;", "</" + wl + ">");
        return encodedHTML;
    }
- Will what I am doing here keep me protected from XSS?
- Are there any other considerations that should be made?
- Is there a good list of normal tags to whitelist?
Comments (1)
If your requirements really are so basic that you can do such simple string replacements then yes, this is ‘safe’ against XSS. (However, it's still possible to submit non-well-formed content where <i> and <b> are mis-nested or unclosed, which could potentially mess up the page the content ends up inserted into.)
But this is rarely enough. For example, <a href="..."> and <img src="..." /> are currently not allowed. If you wanted to allow these or other markup with attribute values, you'd have a whole lot more work to do. You might then approach it with regex, but that gives you endless problems with accidental nesting and replacement of already-replaced content, because regex can't parse HTML.
To solve both problems, the usual approach is to use an [X][HT]ML parser on the input, then walk the DOM removing all but known-good elements and attributes, then finally re-serialise to [X]HTML. The result is then guaranteed well-formed and contains only safe content.
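The parse, walk and re-serialise approach could be sketched roughly as below. This is only an illustration under stated assumptions, not a vetted sanitizer: it uses System.Xml.Linq as the parser, so it only accepts input that is a well-formed XHTML fragment (tag soup, or entities such as &nbsp;, make the parse fail, and the sketch then falls back to encoding everything). For real-world HTML you would swap in a proper HTML parser. The names WhitelistSanitizer, Allowed and IsHttpUrl, the particular whitelist, and the rule that href and src must be absolute http or https URLs are all hypothetical choices made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Xml;
using System.Xml.Linq;

public static class WhitelistSanitizer
{
    // Hypothetical whitelist: element name -> attribute names allowed on it.
    private static readonly Dictionary<string, string[]> Allowed =
        new Dictionary<string, string[]>
        {
            { "b", new string[0] }, { "i", new string[0] }, { "strong", new string[0] },
            { "ul", new string[0] }, { "li", new string[0] },
            { "a", new[] { "href" } },
            { "img", new[] { "src", "alt" } },
        };

    public static string Sanitize(string html)
    {
        XElement root;
        try
        {
            // Wrap the fragment so that the parser sees a single root element.
            root = XElement.Parse("<root>" + html + "</root>", LoadOptions.PreserveWhitespace);
        }
        catch (XmlException)
        {
            // Not well-formed: display the input as plain text instead.
            return WebUtility.HtmlEncode(html);
        }
        Clean(root);
        // Serialise the children only; text nodes are re-escaped by the XML writer.
        return string.Concat(root.Nodes().Select(n => n.ToString(SaveOptions.DisableFormatting)));
    }

    private static void Clean(XElement parent)
    {
        foreach (XNode node in parent.Nodes().ToList())
        {
            // CDATA would be written back verbatim, so turn it into ordinary text.
            XCData cdata = node as XCData;
            if (cdata != null) { cdata.ReplaceWith(new XText(cdata.Value)); continue; }
            if (node is XText) continue;

            XElement el = node as XElement;
            string[] attrs;
            if (el == null || el.Name.Namespace != XNamespace.None
                || !Allowed.TryGetValue(el.Name.LocalName, out attrs))
            {
                node.Remove(); // comments, processing instructions, unknown elements
                continue;
            }

            foreach (XAttribute attr in el.Attributes().ToList())
            {
                string name = attr.Name.LocalName;
                bool isUrl = name == "href" || name == "src";
                if (attr.Name.Namespace != XNamespace.None || !attrs.Contains(name)
                    || (isUrl && !IsHttpUrl(attr.Value)))
                    attr.Remove();
            }

            Clean(el);
            // Write <b></b> rather than <b />, which an HTML parser reads as an open tag.
            if (el.IsEmpty && el.Name.LocalName != "img") el.Value = "";
        }
    }

    // Accept only absolute http/https URLs, which rejects javascript: and similar schemes.
    private static bool IsHttpUrl(string value)
    {
        Uri uri;
        return Uri.TryCreate(value, UriKind.Absolute, out uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
    }
}
```

With this sketch, WhitelistSanitizer.Sanitize("<b>hi</b><script>alert(1)</script>") should come back as <b>hi</b>, and an <a> element carrying a javascript: href or an onclick attribute should keep its text but lose those attributes. Removing unknown elements together with their content is a design choice; unwrapping them and keeping the text is an equally common alternative.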