Escaping HTML entities in JavaScript string literals within a <script> block

On the one hand if I have

<script>
var s = 'Hello </script>';
console.log(s);
</script>

the browser will terminate the <script> block early and basically I get the page screwed up.

On the other hand, the value of the string may come from a user (say, via a previously submitted form, and now the string ends up being inserted into a <script> block as a literal), so you can expect anything in that string, including maliciously formed tags. Now, if I escape the string literal with htmlentities() when generating the page, the value of s will contain the escaped entities literally, i.e. s will output

Hello &lt;/script&gt;

which is not desired behavior in this case.

One way of properly escaping JS strings within a <script> block is escaping the slash if it follows the left angle bracket, or just always escaping the slash, i.e.

var s = 'Hello <\/script>';

This seems to be working fine.

Then comes the question of JS code within HTML event handlers, which can be easily broken too, e.g.

<div onClick="alert('Hello ">')"></div>

looks valid at first but breaks in most (or all?) browsers. This obviously requires full HTML entity encoding.

My question is: what is the best/standard practice for properly covering all the situations above - i.e. JS within a script block, JS within event handlers - if your JS code can partly be generated on the server side and can potentially contain malicious data?

寻梦旅人 2025-01-01 06:39:53

The following characters could interfere with an HTML or Javascript parser and should be escaped in string literals: <, >, ", ', \, and &.

In a script block, using the escape character works, as you found out. The concatenation method ('</scr' + 'ipt>') can be hard to read.

var s = 'Hello <\/script>';

For inline Javascript in HTML, you can use entities:

<div onClick="alert('Hello &quot;>')">click me</div>

Demo: http://jsfiddle.net/ThinkingStiff/67RZH/
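When the handler is generated on the server, two layers of encoding apply: the value first becomes a JavaScript string literal, and the whole handler source is then HTML-encoded for the attribute. A minimal sketch of that idea, assuming a Node.js-style server; `toJsStringLiteral`, `escapeHtmlAttr`, and `buildOnClick` are hypothetical helper names, not part of any library:

```javascript
// Hypothetical helpers; the names are illustrative only.

// Layer 1: turn arbitrary text into a JavaScript string literal.
// JSON.stringify yields a double-quoted literal with " and \ escaped.
function toJsStringLiteral(text) {
    return JSON.stringify(String(text))
        // U+2028/U+2029 are line terminators in older JavaScript engines.
        .replace(/\u2028/g, '\\u2028')
        .replace(/\u2029/g, '\\u2029');
}

// Layer 2: HTML-encode the handler source for a quoted attribute value.
function escapeHtmlAttr(text) {
    return String(text)
        .replace(/&/g, '&amp;')   // runs first so later entities stay intact
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

function buildOnClick(userText) {
    var js = 'alert(' + toJsStringLiteral(userText) + ')';
    return '<div onclick="' + escapeHtmlAttr(js) + '">click me</div>';
}

console.log(buildOnClick('Hello ">'));
// <div onclick="alert(&quot;Hello \&quot;&gt;&quot;)">click me</div>
```

The browser decodes the entities while parsing the attribute, so the JavaScript engine sees `alert("Hello \">")`.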

The method that works in both <script> blocks and inline Javascript is \uxxxx, where xxxx is the hexadecimal character code.

< - \u003c
> - \u003e
" - \u0022
' - \u0027
\ - \u005c
& - \u0026

Demo: http://jsfiddle.net/ThinkingStiff/Vz8n7/

HTML:

<div onClick="alert('Hello \u0022>')">click me</div>

<script>
    var s = 'Hello \u003c/script\u003e';
    alert( s );
</script>
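The \uxxxx mapping above can be applied mechanically when the page is generated. A minimal sketch, assuming the value ends up inside a quoted JavaScript string literal; `escapeJsString` is a hypothetical name, and the line-terminator characters in the pattern are an addition that goes beyond the six characters listed above:

```javascript
// Hypothetical helper: replaces <, >, ", ', \ and & (plus line terminators,
// which would otherwise end the literal) with \uXXXX escapes.
function escapeJsString(text) {
    return String(text).replace(/[<>"'\\&\n\r\u2028\u2029]/g, function (ch) {
        var hex = ch.charCodeAt(0).toString(16);
        // Left-pad to four hex digits, as \u escapes require.
        return '\\u' + ('0000' + hex).slice(-4);
    });
}

console.log(escapeJsString('Hello </script>'));
// Hello \u003c/script\u003e
```

Because the output contains no quote, angle bracket, or ampersand, the same escaped text can sit in a `<script>` block or in a quoted `onclick` attribute.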

终弃我 2025-01-01 06:39:53

I'd say the best practice would be avoiding inline JS in the first place.

Put the JS code in a separate file and include it with the src attribute

<script src="path/to/file.js"></script>

and use it to set event handlers from the inside instead of putting those in the HTML.

// jQuery example
$('div.something').on('click', function(){
    alert('Hello>');
});

尤怨 2025-01-01 06:39:53

(edit - somehow didn't notice you mentioned slash-escape in your question already...)

OK so you know how to escape a slash.

In inline event handlers, you can't use the bounding character inside a literal, so use the other one:

<div onClick='alert("Hello \"")'>test</div>

But this is all in aid of making your life difficult. Just don't use inline event handlers! Or if you absolutely must, then have them call a function defined elsewhere.

Generally speaking, there are few reasons for your server-side code to be writing javascript. Don't generate scripts from the server - pass data to pre-written scripts instead.

(original)

You can escape anything in a JS string literal with a backslash (that is not otherwise a special escape character):

var s = 'Hello <\/script>';

This also has the positive effect of causing it to not be interpreted as html. So you could do a blanket replace of "/" with "\/" to no ill effect.

Generally, though, I am concerned that you would have user-submitted data embedded as a string literal in javascript. Are you generating javascript code on the server? Why not just pass data as JSON or an HTML "data" attribute or something instead?
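One way to pass data instead of generating code is to emit it as JSON in a non-executing `<script>` element and parse it from a pre-written script. A minimal sketch, assuming a Node.js-style server; `jsonForScript`, `renderDataScript`, and the `page-data` id are hypothetical names chosen for illustration:

```javascript
// Hypothetical server-side helper: serialise data as JSON that cannot close
// the surrounding <script> element or be read as markup.
function jsonForScript(data) {
    return JSON.stringify(data)
        .replace(/</g, '\\u003c')
        .replace(/>/g, '\\u003e')
        .replace(/&/g, '\\u0026')
        .replace(/\u2028/g, '\\u2028')
        .replace(/\u2029/g, '\\u2029');
}

function renderDataScript(id, data) {
    // `id` is assumed to be a trusted, developer-chosen identifier.
    return '<script type="application/json" id="' + id + '">' +
        jsonForScript(data) + '<\/script>';
}

var html = renderDataScript('page-data', { greeting: 'Hello </script>' });
console.log(html);
// <script type="application/json" id="page-data">{"greeting":"Hello \u003c/script\u003e"}</script>
```

On the client, a static script could then read the value with `JSON.parse(document.getElementById('page-data').textContent)`, so no user data is ever parsed as JavaScript source.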

-黛色若梦 2025-01-01 06:39:53

Here's how I do it:

function encode(r){
return r.replace(/[\x26\x0A\<>'"]/g,function(r){return"&#"+r.charCodeAt(0)+";"})
}

var myString='Encode HTML entities!\n"Safe" escape <script></'+'script> & other tags!';

test.value=encode(myString);

testing.innerHTML=encode(myString);

/*************
* \x26 is ampersand (it has to be first),
* \x0A is newline,
*************/

<textarea id=test rows="9" cols="55"></textarea>

<div id="testing">www.WHAK.com</div>

笛声青案梦长安 2025-01-01 06:39:53

Most people use this trick:

var s = 'Hello </scr' + 'ipt>';
