Handling htmlescape/htmlspecialchars

Posted on 2025-01-04 12:06:48, 753 words, 1 view, 0 comments


To prevent XSS, whenever you output user input back to the page (as you do when displaying what was entered incorrectly, or when re-painting the form with the previously submitted values), you need to escape the HTML. That's a sure thing...

So, doing something like

echo "the name which was supplied as {$_GET['company_name']} is not accepted";

would not be right.

Instead, we would do this.

echo "the name which was supplied as " . htmlspecialchars($_GET['company_name']) . " is not accepted";

With that in mind, here comes my question: what do you do when $_GET['company_name'] needs to be displayed back in the textbox it came from? Maybe you want your user to correct that company_name just because it's too long.

If you were to use htmlspecialchars, and the company_name was, say, AT&T, the & there would be escaped and appear as &amp;, wouldn't it?

So how do we deal with this situation? Of course, one might say: then don't run htmlspecialchars on it, just return it as is.

But then somebody may send us a company_name that is carefully crafted to break out of the textbox's value attribute, add a JavaScript onclick handler, and do the XSS from there.

How do you deal with HTML escaping in these situations? Just use history.go(-1)?


陌路黄昏 2025-01-11 12:06:48


I strongly encourage you to check out the OWASP XSS prevention cheat sheet if you're interested in learning more about preventing XSS.

When a browser renders HTML (and associated content, like CSS), it identifies different rendering contexts for different types of input. Each context has distinct semantics for how and when it can execute script code. So your browser's rules for handling HTML are different from the rules it uses for JavaScript, which are different again from the rules for CSS, and so on. This means that if you're trying to prevent XSS, you have to be very sensitive to the context the untrusted data is being put in.

If you are using server-side code like PHP to echo unsafe values into HTML attributes (including the value of a form input), you need to escape the text for HTML attributes. Assuming the page is using UTF-8 encoding, you would do something like:

<input type="text" value="<?php echo htmlspecialchars($_GET['company_name'], ENT_QUOTES, 'UTF-8'); ?>" >

The "ENT_QUOTES" option is important, because it tells PHP to HTML escape both double and single quotation marks. Unescaped quotation marks can be used to "break out" of an attribute and add JavaScript event handlers like "onclick", "onfocus", etc.

In your "AT&T" example, you would not see &amp; in the input box. This is because in the context of an HTML attribute, your browser decodes HTML entities (like &amp;) into their associated characters (like &) before displaying the value.
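
To make the two points above concrete, here is a minimal sketch. The helper name escape_attr and the sample payload are my own assumptions for illustration, and html_entity_decode() is only a rough stand-in for the decoding a browser does when it parses an attribute value, not a model of a real HTML parser.

```php
<?php
// Hypothetical helper: wraps the htmlspecialchars() call shown above.
function escape_attr(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// A value crafted to close the attribute and add an event handler.
$payload = '" onclick="alert(1)';
$escapedPayload = escape_attr($payload);

// The double quotes are now &quot;, so the value cannot end the attribute.
echo '<input type="text" value="' . $escapedPayload . '">', "\n";

// "AT&T" travels in the markup as AT&amp;T ...
$escapedName = escape_attr('AT&T');
echo $escapedName, "\n";

// ... and decoding once (as the browser would for an attribute) gives AT&T back.
$shownToUser = html_entity_decode($escapedName, ENT_QUOTES, 'UTF-8');
echo $shownToUser, "\n";
```

So the escaped form only exists in the page source; what the user sees and edits in the textbox should be the original text.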

When might you see &amp; in the text box?

If you modify the value of the input using JavaScript, your browser uses a different set of rules for determining how the new value will be handled. If you were to HTML escape 'AT&T' and then insert that new value using something like yourInput.setAttribute("value", HtmlEscapingFunction('AT&T')), the user would see AT&amp;T. This is because you're now working in a DOM execution context: the value is not run through the HTML parser, so nothing decodes the entities, and HTML escaping the attribute value there effectively double-encodes it.
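
The DOM case itself needs a browser, but the difference between the two paths can be sketched in PHP under some loud assumptions: html_entity_decode() stands in for the HTML parser decoding an attribute value, and "no decoding at all" stands in for a DOM API such as setAttribute() taking the string literally. The variable names are hypothetical.

```php
<?php
// The value is HTML escaped once, as in the example above.
$escaped = htmlspecialchars('AT&T', ENT_QUOTES, 'UTF-8'); // AT&amp;T

// Path 1 (assumption): the value is written into HTML markup, so the
// parser decodes the entities once before the user sees it.
$shownViaMarkup = html_entity_decode($escaped, ENT_QUOTES, 'UTF-8');

// Path 2 (assumption): the value is handed to a DOM API, which does no
// HTML parsing, so the escaped string is displayed exactly as given.
$shownViaDomApi = $escaped;

echo $shownViaMarkup, "\n"; // AT&T
echo $shownViaDomApi, "\n"; // AT&amp;T
```

In other words, the escaping has to match the context that will consume the value: escape for HTML when writing markup, and pass the raw string when setting a value through the DOM.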
