Rails 'raw' helper
Is it safe to use the 'raw' helper in a Rails app that may have many users?
I will be integrating TinyMCE with my app, for users to add HTML content to some form of post. Is it a security issue to use 'raw' to display their content?
Or is there a more proper way of doing things?
Thanks!
From the fine manual:
So, yes, using 'raw' can be a bit of a security issue (for your users) unless you properly sanitize the HTML that comes in. You shouldn't trust the client. Even if you've set up TinyMCE with a limited set of tags, you have no guarantee that the HTML your server receives actually came from TinyMCE, or that someone hasn't worked around TinyMCE in some way.
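To make the risk concrete, here is a minimal sketch that uses only Ruby's standard library. The post body is a made-up example, and ERB::Util.html_escape stands in for the escaping a Rails view applies by default; passing a string through unchanged stands in for what 'raw' does.

```ruby
require "erb"

# Hypothetical post body submitted by a user (or by someone bypassing TinyMCE).
user_html = %q(<p>Hello</p><script>alert("xss")</script>)

# Stand-in for default view output: markup is escaped, so the browser
# displays it as text instead of interpreting it.
escaped = ERB::Util.html_escape(user_html)

# Stand-in for 'raw': the string goes out unchanged, so the <script>
# element would execute in every reader's browser.
unescaped = user_html

puts escaped
puts unescaped
```

The second line of output still contains a live script element, which is why unsanitized user HTML should not be rendered this way.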
If you're accepting HTML from users, then you need to whitelist both the tags and attributes before you store it.
You can use Nokogiri to parse the incoming HTML tag by tag: if a tag is on your whitelist, let it through; if you aren't explicitly allowing it (i.e. it isn't on your whitelist), throw it away. You'll also want to check the attributes on the tags you let through, so that only the attributes and attribute values you want get through. Any tags, attributes, or attribute values that aren't on your whitelists get thrown away. Once you've scrubbed the incoming HTML, you can store it and safely present it to your users using the 'raw' helper. This added complexity is one reason that a lot of sites use Markdown, BB-Code, or some other markup language that generates HTML.