User-entered ampersand characters break my site's W3C validation
My social networking site is valid W3C XHTML. However, users can post blog reports and other content, and sometimes they enter ampersand characters, which in turn break my validation. How can I fix this, and are there any other single characters I need to look out for that could break my validation?
Comments (3)
When displaying user-produced content, run it through the htmlspecialchars() function.
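As a rough illustration of that advice, the sketch below escapes a sample string before output. The helper name escape_for_xhtml and the sample input are hypothetical, and UTF-8 is assumed as the page encoding. With ENT_QUOTES, htmlspecialchars() converts the five characters that matter for validation: &, <, >, the double quote, and the single quote.

```php
<?php
// Hypothetical helper: escape text for output inside XHTML content or attributes.
// Assumes the page is served as UTF-8.
function escape_for_xhtml(string $text): string
{
    // ENT_QUOTES escapes single quotes in addition to &, <, > and double quotes.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Sample input a user might post (made up for this example).
$userInput = 'Tom & Jerry <b>"quoted"</b> it\'s';

echo escape_for_xhtml($userInput), "\n";
// prints: Tom &amp; Jerry &lt;b&gt;&quot;quoted&quot;&lt;/b&gt; it&#039;s
```

Escaping at output time, rather than when the content is stored, keeps the original text intact in the database.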
As a matter of general principle, it's a mistake to include user-submitted (or indeed any external) content in your page directly without validation or filtering. Besides causing validation errors, it can also cause "broken pages" and large security holes (cross-site scripting attacks).
Whenever you get data from anywhere that isn't 100% trusted, you need to make it safe in some way. You can do this by doing some or all of:
If your user input is meant to be interpreted as text then you're mostly looking at option 1; if you're letting the users use HTML then you're looking at options 2 and 3. A fourth option is to have the users use some more restrictive non-HTML markup such as Markdown or bbCode, translating between that markup and HTML using a library that (hopefully) doesn't allow the injection of security holes, page-breaking constructs, or other scary things.
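The restricted-markup option could look roughly like the sketch below. It is not a real bbCode library: the function name render_bold_bbcode is hypothetical, only the [b] tag is handled, and UTF-8 is assumed. The point is the order of operations: escape everything first, then translate the small set of allowed markup into HTML.

```php
<?php
// Hypothetical, deliberately tiny bbCode-style renderer supporting only [b]...[/b].
function render_bold_bbcode(string $text): string
{
    // Step 1: escape all user input so no raw HTML survives.
    $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // Step 2: turn matched [b]...[/b] pairs into <strong> elements.
    // Unmatched tags are left as literal text, so the output stays balanced.
    $html = preg_replace('/\[b\](.*?)\[\/b\]/s', '<strong>$1</strong>', $escaped);

    // preg_replace() returns null on error; fall back to the escaped text.
    return $html ?? $escaped;
}

echo render_bold_bbcode('[b]R&D[/b] <script>'), "\n";
// prints: <strong>R&amp;D</strong> &lt;script&gt;
```

A maintained Markdown or bbCode library would handle nesting and many more tags; this only shows why such markup is easier to keep safe than raw HTML.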
It's a bad idea to allow users to enter HTML markup.
This enables all kinds of nasty things, most notably cross-site scripting (XSS) exploits and injection of hidden spam (hidden from you, not from search engine bots).
You should do one of the following:
Neutralize all HTML tags by escaping them with htmlspecialchars(), and preserve only newlines with nl2br(). You might allow some formatting by implementing your own safe markup that permits only very specific tags (things like phpBB or Wiki-like markup).
Use HTML Purifier to reliably eliminate all potentially dangerous markup. PHP's strip_tags() function is fundamentally broken for this purpose: it does not filter attributes, so dangerous code in attributes gets through if you use the whitelist argument.
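A minimal sketch of the first approach follows, assuming UTF-8 content; the helper name render_plain_text is made up for illustration. The order matters: escape first, then let nl2br() add the line breaks, so the inserted break tags are not escaped themselves.

```php
<?php
// Hypothetical helper: show user text as plain text while keeping line breaks.
function render_plain_text(string $text): string
{
    // Escape every HTML-special character, including both quote types.
    $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // nl2br() inserts an XHTML-style <br /> before each newline by default.
    return nl2br($escaped);
}

echo render_plain_text("Fish & chips\n<script>alert(1)</script>"), "\n";
```

Because nl2br() emits the self-closing form by default, the result stays compatible with an XHTML doctype.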