On the one hand, if I have
<script>
var s = 'Hello </script>';
console.log(s);
</script>
the browser will terminate the <script> block early and basically the page gets screwed up.
On the other hand, the value of the string may come from a user (say, via a previously submitted form, with the string now being inserted into a <script> block as a literal), so you can expect anything in that string, including maliciously formed tags. Now, if I escape the string literal with htmlentities() when generating the page, the value of s will contain the escaped entities literally, i.e. s will output
Hello &lt;/script&gt;
which is not the desired behavior in this case.
One way of properly escaping JS strings within a <script> block is escaping the slash if it follows the left angle bracket, or just always escaping the slash, i.e.
var s = 'Hello <\/script>';
This seems to be working fine.
Then comes the question of JS code within HTML event handlers, which can be easily broken too, e.g.
<div onClick="alert('Hello ">')"></div>
looks valid at first but breaks in most (or all?) browsers. This obviously requires full HTML entity encoding.
My question is: what is the best/standard practice for properly covering all the situations above - i.e. JS within a script block, JS within event handlers - if your JS code can partly be generated on the server side and can potentially contain malicious data?
The following characters could interfere with an HTML or JavaScript parser and should be escaped in string literals: <, >, ", ', \, and &.
In a script block, using the escape character works, as you found out. The concatenation method ('</scr' + 'ipt>') can be hard to read.
For inline JavaScript in HTML, you can use HTML entities, such as &quot; for a double quote.
Demo: http://jsfiddle.net/ThinkingStiff/67RZH/
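As a rough illustration of the entity approach, the sketch below HTML-encodes a piece of JavaScript before placing it in an event-handler attribute. The helper names `escapeHtmlAttribute` and `buildOnClickDiv` are made up for this example and are not taken from the demo. It only covers the HTML layer: the browser decodes the entities before the JS runs, so untrusted data inside the JS string literal would still need JS-level escaping.

```javascript
// Hypothetical helper: encode characters that are special in HTML attributes.
function escapeHtmlAttribute(text) {
  return String(text)
    .replace(/&/g, '&amp;')   // runs first so later entities are not double-encoded
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical helper: wrap already-valid JS source in an onclick attribute.
function buildOnClickDiv(jsSource) {
  return '<div onclick="' + escapeHtmlAttribute(jsSource) + '">click me</div>';
}

// The handler from the question: its string literal contains a double quote.
const handler = "alert('Hello \">')";
console.log(buildOnClickDiv(handler));
```

Encoding the whole attribute value, rather than only the double quote, keeps the markup valid no matter which quote style the attribute itself uses.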
The method that works in both <script> blocks and inline JavaScript is \uxxxx, where xxxx is the hexadecimal character code.
< - \u003c
> - \u003e
" - \u0022
' - \u0027
\ - \u005c
& - \u0026
Demo: http://jsfiddle.net/ThinkingStiff/Vz8n7/
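The mapping above could be applied mechanically with something like the following sketch. The function name `escapeForJsString` is an assumption for illustration, not part of the demo. It deliberately covers only the six listed characters; line breaks and other control characters would need the same treatment in real use.

```javascript
// Hypothetical helper: replace the six characters listed above with \uXXXX escapes.
function escapeForJsString(text) {
  return String(text).replace(/[<>"'\\&]/g, function (ch) {
    // Four lowercase hex digits, e.g. "<" becomes \u003c.
    const hex = ch.charCodeAt(0).toString(16).padStart(4, '0');
    return '\\u' + hex;
  });
}

const userInput = 'Hello </script> "world" & \'friends\'';
// The escaped text contains no quotes, so either quote style can wrap it.
const quotedLiteral = "'" + escapeForJsString(userInput) + "'";
console.log('var s = ' + quotedLiteral + ';');
```

Because the result contains none of the characters an HTML parser cares about, the same literal can go into a <script> block or an event-handler attribute unchanged.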
I'd say the best practice would be avoiding inline JS in the first place.
Put the JS code in a separate file, include it with the src attribute, and use it to set event handlers from the inside instead of putting those in the HTML.
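A minimal sketch of what that could look like, assuming a hypothetical file app.js and a hypothetical function `attachGreeting`; neither comes from the answer. The message is passed around as ordinary data, so it is never parsed as HTML or JS source.

```javascript
// app.js (hypothetical file name), loaded with <script src="app.js"></script>.

// Attach a click handler that shows a greeting built from plain data.
function attachGreeting(element, message, show) {
  element.addEventListener('click', function () {
    show('Hello ' + message);
  });
}

// In a browser this might be called once the page has loaded, for example:
//   attachGreeting(document.getElementById('greeter'), userName, alert);
// A minimal stand-in element is used here so the sketch also runs outside a browser.
const fakeElement = {
  handlers: {},
  addEventListener(type, fn) { this.handlers[type] = fn; }
};
const shown = [];
attachGreeting(fakeElement, '">', function (text) { shown.push(text); });
fakeElement.handlers.click();  // simulate a click
console.log(shown);
```

The string that broke the inline onClick attribute in the question is harmless here, because it never appears inside markup.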
(edit - somehow didn't notice you mentioned slash-escape in your question already...)
OK, so you know how to escape a slash.
In inline event handlers, you can't use the bounding character inside a literal, so use the other one (for example, single quotes for JS strings inside a double-quoted attribute).
But this is all in aid of making your life difficult. Just don't use inline event handlers! Or if you absolutely must, then have them call a function defined elsewhere.
Generally speaking, there are few reasons for your server-side code to be writing javascript. Don't generate scripts from the server - pass data to pre-written scripts instead.
(original)
In a JS string literal, you can put a backslash before any character that is not otherwise a special escape character, and it simply yields that character; for example, '\/' is the same as '/'.
This also has the positive effect of causing it to not be interpreted as html. So you could do a blanket replace of "/" with "\/" to no ill effect.
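The blanket replace might be sketched like this. Using `JSON.stringify` to produce the quoted literal is an addition of this example, not something the answer prescribes, and `toScriptSafeLiteral` is a hypothetical name. Treat it as an illustration of the slash trick rather than a complete defence.

```javascript
// Hypothetical helper: build a JS string literal for use inside a <script> block.
function toScriptSafeLiteral(text) {
  // JSON.stringify adds the surrounding quotes and escapes ", \ and control characters.
  const quoted = JSON.stringify(String(text));
  // Blanket replace of "/" with "\/" so "</script>" never appears verbatim.
  return quoted.replace(/\//g, '\\/');
}

const safeLiteral = toScriptSafeLiteral('Hello </script>');
console.log('var s = ' + safeLiteral + ';');
```

Since "\/" is a valid escape in both JavaScript and JSON, the JS engine reads the value back as the original string.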
Generally, though, I am concerned that you would have user-submitted data embedded as a string literal in javascript. Are you generating javascript code on the server? Why not just pass data as JSON or an HTML "data" attribute or something instead?
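One way the data-attribute idea could look, written in JavaScript for both sides (the server in the question appears to be PHP, so the server half is only an analogy). The names `renderGreeter`, `readPayload`, the id `greeter` and the attribute `data-payload` are all assumptions made for this sketch.

```javascript
// Server side (hypothetical): HTML-encode text for use in an attribute value.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Server side (hypothetical): serialize data as JSON, then encode it for HTML.
function renderGreeter(data) {
  return '<div id="greeter" data-payload="' + escapeHtml(JSON.stringify(data)) + '"></div>';
}

// Client side (pre-written script): read the attribute and parse it as data.
// The browser has already decoded the entities by the time getAttribute returns.
function readPayload(element) {
  return JSON.parse(element.getAttribute('data-payload'));
}

const markup = renderGreeter({ name: 'Hello </script> ">' });
console.log(markup);
```

With this split the server only ever emits data, and the script that consumes it is static and can live in a separate file.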
Here's how I do it:
Most people use this trick: