Setting the nodeValue of a text node in Javascript when the string contains HTML entities
When I set the value of a text node with
node.nodeValue="string with &#xxxx; sort of characters"
the ampersand gets escaped. Is there an easy way to do this?
You need to use Javascript escapes for the Unicode characters:
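A minimal sketch of what that looks like. The curly-quote code points (U+201C and U+201D) are illustrative choices, not taken from the question, and the DOM assignment is shown only as a comment so the snippet also runs outside a browser.

```javascript
// HTML entity references are only decoded by the HTML parser.
// Inside a JavaScript string literal, use a \uXXXX escape instead.
const viaEscape = "string with \u201Ccurly quotes\u201D";

// The escape is resolved when the script is parsed, so the string
// already holds the real characters, not backslash sequences.
const viaCodePoint =
  "string with " +
  String.fromCodePoint(0x201c) +
  "curly quotes" +
  String.fromCodePoint(0x201d);

console.log(viaEscape === viaCodePoint); // true

// In a browser, with `node` being a Text node (hypothetical variable):
// node.nodeValue = viaEscape;
```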
From http://code.google.com/p/jslibs/wiki/JavascriptTips:
(converts both entity references and numeric entities)
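The code from that page is not reproduced here. As a rough sketch of the same idea, the hypothetical decodeEntities function below handles decimal and hexadecimal numeric entities plus a deliberately tiny table of named entities. The table contents are an assumption for illustration; the full HTML list is far longer.

```javascript
// Hypothetical, deliberately small table: entity name -> code point.
const NAMED_ENTITIES = {
  amp: 0x26, lt: 0x3c, gt: 0x3e, quot: 0x22, apos: 0x27,
  nbsp: 0xa0, ldquo: 0x201c, rdquo: 0x201d,
};

// String.fromCodePoint throws above U+10FFFF, so keep such references as-is.
function fromCodePointOrKeep(codePoint, original) {
  return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : original;
}

function decodeEntities(str) {
  // One pass only, so "&amp;lt;" decodes to "&lt;" and is not decoded twice.
  return str.replace(
    /&(?:#x([0-9a-fA-F]+)|#(\d+)|([A-Za-z][A-Za-z0-9]*));/g,
    (match, hex, dec, name) => {
      if (hex !== undefined) return fromCodePointOrKeep(parseInt(hex, 16), match);
      if (dec !== undefined) return fromCodePointOrKeep(parseInt(dec, 10), match);
      // Unknown names are left untouched rather than guessed at.
      return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, name)
        ? String.fromCodePoint(NAMED_ENTITIES[name])
        : match;
    }
  );
}

console.log(decodeEntities("&ldquo;Fish &amp; chips&rdquo; caf&#233; caf&#xE9;"));
```

The decoded string can then be assigned to nodeValue directly, since it contains the characters themselves rather than references to them.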
This happens because the & in your string is itself escaped to the ampersand entity (&amp;) by the browser. To get around this, you'll need to convert the entities yourself.
As noted in other answers, I need to replace html encoded entities with javascript encoded ones. Starting from BaileyP's answer, I've made this:
The OP has entities / entity references, and wants them to appear in the DOM in a text node.
That's why the accepted answer and many other answers are great; those answers convert entities to their unicode equivalents using Javascript unicode escape sequences.
But I had a different need: I had Unicode characters and wanted to put them into the text node as entity references. I wanted entity references specifically so that the XML string representing my document could be encoded in ASCII (i.e. encoding="ascii"). Otherwise, as @Bjorn said, the Unicode characters would be "decoded as junk".
This is what I want, note the ASCII encoding:
The ASCII encoded XML/HTML above looks good in a browser:
So I can't use the other answers because they insert unicode characters (but I want ASCII).
And I can't use the DOM text node API to insert unescaped entity references. As the OP points out: if you use the DOM text node API to set the node's textContent or nodeValue, the DOM will always escape any entities you try to inject...
& becomes &amp;
an entity reference for “ (for example &ldquo;) becomes &amp;ldquo;
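To make the escaping concrete, here is a small sketch that imitates, outside the browser, what serializing a text node to HTML does to its data. The function name escapeTextForHtml is hypothetical, and the replacement set (&, <, >, and the no-break space) reflects my understanding of the HTML serialization rules for text; a real browser remains the authority. The browser-only equivalent is shown in comments.

```javascript
// Imitates the escaping applied to text node data when the DOM is
// serialized to HTML (for example via outerHTML).
function escapeTextForHtml(data) {
  return data
    .replace(/&/g, "&amp;") // must run first, or later output would be re-escaped
    .replace(/\u00a0/g, "&nbsp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// An entity reference typed into the text is treated as literal text.
console.log(escapeTextForHtml("&ldquo;quoted&rdquo;")); // &amp;ldquo;quoted&amp;rdquo;

// Real non-ASCII characters pass through unchanged; they are not
// turned back into entity references.
console.log(escapeTextForHtml("\u201Cquoted\u201D"));

// Browser-only equivalent (assumes a DOM is available):
// const span = document.createElement("span");
// span.textContent = "&ldquo;quoted&rdquo;";
// span.outerHTML; // "<span>&amp;ldquo;quoted&amp;rdquo;</span>"
```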
As a deleted answer suggested, you could try to manipulate the HTML directly using innerHTML or outerHTML, but the Text API does not have those properties.
Even if you are working on a non-Text node (like a <span>), the DOM API in my browser won't leave the entities intact: the entities are "parsed"/dereferenced to the Unicode characters they stand for, so an entity reference such as &ldquo; becomes the literal character “.
But I want my document to be ASCII encoded, so I can't use the non-ASCII characters as set by the DOM; “ and ” will be "decoded as junk" as shown below:
Yes, I could simply use UTF-8 encoding, and then I wouldn't need entity references (example shown below), but I prefer to respect the original encoding (which happened to be ASCII).
So if you are only using the DOM, there's no good way to put unescaped entity references into text nodes: they are either escaped or dereferenced to the Unicode characters they represent. I think this is as-designed/expected behavior, and I appreciate that. If you're only manipulating the DOM to change what renders in your browser, this might be no problem.
But in my case I was using the DOM to create and download an XML document, so I had an opportunity to get the outerHTML string and manipulate it independently of the DOM API before downloading it.
I get the outerHTML and run the function below to convert non-ASCII characters to their entity equivalents (similar approach in C#). By replacing the non-ASCII characters with entity references, my document could be encoded as ASCII and read without problems.
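The function itself did not survive in this copy of the answer. A minimal sketch of the approach, assuming numeric character references are acceptable; the name escapeNonAscii and the hexadecimal &#x...; form are my assumptions, not necessarily what the original used:

```javascript
// Replace every non-ASCII character in a serialized markup string with a
// hexadecimal numeric character reference, leaving pure ASCII behind.
function escapeNonAscii(markup) {
  // The `u` flag makes the regex match whole code points, so characters
  // outside the BMP (e.g. emoji) become one reference, not two surrogates.
  return markup.replace(
    /[^\x00-\x7F]/gu,
    (ch) => "&#x" + ch.codePointAt(0).toString(16).toUpperCase() + ";"
  );
}

// Stand-in for a string obtained from element.outerHTML in a browser.
const serialized = "<p>\u201CHello\u201D \u{1F600}</p>";
console.log(escapeNonAscii(serialized));
// <p>&#x201C;Hello&#x201D; &#x1F600;</p>
```

Numeric references are used here rather than named ones such as &ldquo; because XML predefines only amp, lt, gt, apos and quot; other names resolve only if a DTD declares them, whereas numeric references work in both HTML and XML.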