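Why does HTML encoding prevent certain XSS attacks?
I keep reading that HTML-encoding data on its way back from the server to the client (I think?) will prevent many types of XSS attacks. However, I don't understand this at all. The HTML will still be consumed and rendered by the browser, right?
How does this stop anything?
I've read about this in multiple places, websites, and books, and nowhere does anyone actually explain why it works.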
Think about it: what does encoded HTML look like? For example, it could look like this:
&lt;a href=&quot;www.stackoverflow.com&quot;&gt;
The client will therefore render it as literal text (as <a href="www.stackoverflow.com">), not as HTML. That means you won't see an actual link, but the code itself.
XSS attacks work on the basis that someone can make a client's browser parse HTML that the site provider never intended to be there. If the string above weren't encoded, the supplied link would be embedded in the site, even though the site provider didn't want that.
XSS is of course a little more elaborate than that, and usually involves JavaScript as well (hence the name Cross-Site Scripting), but for demonstration purposes this simple example should suffice. JavaScript code is handled the same way as simple HTML tags, since XSS is a special case of the more general HTML injection.
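As a rough illustration of the point above, here is a minimal sketch in Python using the standard library's html.escape. The render_comment helper and the surrounding <p> wrapper are hypothetical, invented only for this example; the link string is the one from the answer.

```python
import html

# Untrusted input: the link markup used as the example in the answer.
user_input = '<a href="www.stackoverflow.com">'


def render_comment(comment: str) -> str:
    """Hypothetical helper: embed a comment in a page fragment, encoding it first."""
    return "<p>" + html.escape(comment, quote=True) + "</p>"


# Without encoding, the browser receives a real <a> tag and parses it as markup.
unsafe_page = "<p>" + user_input + "</p>"

# With encoding, the browser receives entities and displays them as literal text.
safe_page = render_comment(user_input)

print(unsafe_page)
print(safe_page)
```

The encoded fragment contains no tag for the browser to parse, so the user sees the characters of the link rather than a working link.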
HTML encoding turns <div> into &lt;div&gt;, which means that any HTML markup will display on the page as text, rather than being executed as HTML markup.
The basic entities that are converted are:
& to &amp;
< to &lt;
> to &gt;
" to &quot;
OWASP recommends encoding some additional characters:
' to &#x27;
/ to &#x2F;
These encodings are how you textually represent characters that would otherwise be consumed as markup. If you wanted to write a<b, you'd have to be careful that <b isn't treated like an HTML element. If you use a&lt;b, the text that will be displayed to the user will be a<b.
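The character mappings listed above can be sketched as a small lookup table. This is only an illustrative sketch in Python, not a replacement for a vetted encoding library; the names ENTITY_TABLE and encode_html are hypothetical, and the entity values are the standard ones for these six characters.

```python
import html

# Character-to-entity table: the four basic characters plus the two
# additional ones (' and /) mentioned in the answer.
ENTITY_TABLE = str.maketrans({
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#x27;",
    "/": "&#x2F;",
})


def encode_html(text: str) -> str:
    """Replace markup-significant characters with their entities.

    str.translate visits each input character exactly once, so the '&'
    inside an entity that was just produced is never encoded a second time.
    """
    return text.translate(ENTITY_TABLE)


print(encode_html("a<b"))
print(encode_html("<div>"))
# Decoding the entities, as a browser does when displaying text,
# gives back the original characters.
print(html.unescape(encode_html("a<b")))
```

Using a single-pass table avoids the ordering pitfall of chained replacements, where encoding & after < would turn &lt; into &amp;lt;.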