Apache Commons Text StringEscapeUtils vs. JSoup for XSS prevention?
I want to clean user input to help prevent XSS attacks, and we don't necessarily care about having an HTML whitelist, as our users shouldn't need to post any HTML / CSS.
Eyeing the alternatives out there, which would be better? Apache Commons Text's StringEscapeUtils or JSoup Cleaner?
Update:
I went with JSoup after writing some unit tests for both it and Apache Commons Text.
I like how JSoup won't mess with single quotation marks (i.e. "Alan's mom" is left unchanged, whereas Apache Commons Text turns it into "Alan\'s mom").
And the whitelist wasn't a problem at all. It didn't require any configuration; rather, JSoup ships some built-in options, which may come in handy if we choose to allow some subset of HTML tags.
Comments (2)
"Better"? I don't think it matters. Cleaner has Whitelist.none(), which strips all HTML; the escape utils will escape everything.
It depends on how you want the "cleaned" input to render: do you want just the text nodes, or do you want the escaped HTML to show up?
I would love to see Cuga's test cases, because if you are using escapeHtml in Apache Commons Lang 2.6 or escapeHtml4 in 3+, it does not add slashes. It simply converts characters to HTML entities, which is clearly stated in the documentation.
I even have a public example to test this out:
https://gist.github.com/croucha/2e2925264890886cbf4d
So please, prove me wrong; otherwise, the part about the escaping adding slashes is incorrect. If you still want to display these unsafe characters but avoid having them executed in the browser, then your best option is Apache Commons. As far as I can tell, Jsoup completely omits the markup, including its contents, even when it is safe.
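To make the entity-conversion point concrete, here is a minimal sketch of what HTML entity escaping does to a string. It is a hand-written stand-in, not the Commons implementation: the class `EscapeSketch` and its `escapeHtml` method are hypothetical names, and it only handles the four basic characters (`&`, `<`, `>`, `"`), whereas the library's `escapeHtml4` also covers further named entities. With the library on the classpath, the equivalent call would be `StringEscapeUtils.escapeHtml4(input)`.

```java
// Hypothetical stand-in for entity-based escaping; NOT the Apache Commons code.
class EscapeSketch {

    // Replace the four basic HTML metacharacters with named entities.
    // The apostrophe is deliberately left alone, and no backslashes are added.
    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: Alan's mom
        System.out.println(escapeHtml("Alan's mom"));
        // prints: &lt;script&gt;alert(1)&lt;/script&gt;
        System.out.println(escapeHtml("<script>alert(1)</script>"));
    }
}
```

Under these assumptions the escaped markup stays visible to the reader as text but is not interpreted by the browser, which is the behavior described above for the escape utilities.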