Should raw HTML be sanitized on output or on input?

Posted 2024-12-12 02:07:57

CakePHP's page on Data Sanitization states that one should store possibly raw HTML from user input in the database and sanitize it at the time of output:

For sanitization against XSS its generally better to save raw HTML in database without modification and sanitize at the time of output/display.

Why would it be preferable to store (potentially dangerous) HTML in one's database and only sanitize it for output? Wouldn't sanitizing first result in smaller storage while yielding the same function?

The only reason I can see for storing raw HTML like this is if some pages sanitize the output while other pages either do not sanitize it at all or apply stricter or looser rules.

Comments (2)

冷︶言冷语的世界 2024-12-19 02:07:57

You want to keep the original data on hand, in its original state, so that an overly aggressive cleanup script cannot accidentally remove parts of it for good.

Using CakePHP you should use the h() shortcut on everything that was entered by a user in the system when echoing it in the view.
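As a language-neutral illustration of escaping at echo time, here is a minimal sketch in Python rather than PHP. The `h` function below is a stand-in defined for this example (CakePHP's `h()` wraps PHP's `htmlspecialchars()`), and `stored_comment` is a hypothetical value read from the database.

```python
import html


def h(text: str) -> str:
    """Stand-in for CakePHP's h(): escape HTML special characters."""
    return html.escape(text, quote=True)


# Hypothetical raw value, stored in the database exactly as the user typed it.
stored_comment = '<script>alert("xss")</script>'

# Escape only at the moment the value is written into the page.
rendered = "<p>" + h(stored_comment) + "</p>"
print(rendered)
```

The stored value is never modified; only the rendered string contains the escaped form.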

If you're using the Sanitize class, I would suggest creating a method that sanitizes a single record, calling it from the model's afterFind() callback, and applying it to each record that is returned. If that's not desired, you can still call your sanitize method on the data as needed.
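The idea of sanitizing every record right after it is fetched can be sketched as follows. This is Python, not CakePHP code: `after_find` and `sanitize_record` are hypothetical names that only mimic the role of a model's afterFind() callback, and plain HTML escaping stands in for whatever the Sanitize class would do.

```python
import html


def sanitize_record(record: dict) -> dict:
    """Return a copy of the record with every string field HTML-escaped."""
    return {
        key: html.escape(value) if isinstance(value, str) else value
        for key, value in record.items()
    }


def after_find(records: list[dict]) -> list[dict]:
    """Hypothetical post-fetch hook: sanitize each record a query returned."""
    return [sanitize_record(record) for record in records]


# Hypothetical rows as they come back from the database, unmodified.
rows = [{"id": 1, "body": "<b>hi</b>"}, {"id": 2, "body": "a & b"}]
clean = after_find(rows)
print(clean)
```

Because the hook returns copies, the fetched rows stay raw, so code that needs the unsanitized data can skip the hook and call `sanitize_record` only where it is wanted.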

云淡月浅 2024-12-19 02:07:57

One big reason that comes to mind is accidental damage to the data. If you applied an overly aggressive filter to incoming HTML, the content would be permanently damaged, and you would have to have all of it entered again to recover it. If you sanitize on output, you always have "the original" and can adjust the filtering as appropriate.
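A small Python sketch of that failure mode, under stated assumptions: `strip_all_tags` is a deliberately naive filter invented for this example, and `submitted` is a made-up user post. It is meant only to show how a filter applied on input loses data that a filter applied on output would not.

```python
import html
import re


def strip_all_tags(text: str) -> str:
    """An overly aggressive filter: deletes anything that looks like a tag."""
    return re.sub(r"<[^>]*>", "", text)


# Hypothetical user post that mixes markup with a plain comparison.
submitted = "Use <em>h()</em> when x < 10 and y > 3"

# Sanitizing on input: only the filtered text is stored. The filter also
# treats "< 10 and y >" as a tag, so that part of the post is gone for good.
stored_filtered = strip_all_tags(submitted)

# Sanitizing on output: the raw text is stored and filtered per display,
# so switching to a gentler filter later needs no re-entry of content.
stored_raw = submitted
shown_strict = strip_all_tags(stored_raw)
shown_escaped = html.escape(stored_raw)

print(stored_filtered)
print(shown_escaped)
```

With the raw text kept, the same row can be shown through either filter, and the escaped version still round-trips to what the user wrote.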
