Unicode characters from charcode in JavaScript for charcodes > 0xFFFF
I need to get a string / char from a unicode charcode and finally put it into a DOM TextNode to add into an HTML page using client side JavaScript.
Currently, I am doing:
String.fromCharCode(parseInt(charcode, 16));
where charcode is a hex string containing the charcode, e.g. "1D400". The Unicode character which should be returned is 𝐀, but a 퐀 is returned! Characters in the 16-bit range (0000 ... FFFF) are returned as expected.
Any explanation and / or proposals for correction?
Thanks in advance!
Comments (4)
String.fromCharCode can only handle code points in the BMP (i.e. up to U+FFFF). To handle higher code points, this function from Mozilla Developer Network may be used to return the surrogate pair representation:
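The function this answer refers to did not survive in the text, so the following is only a minimal sketch of the surrogate-pair approach it describes, not the Mozilla Developer Network code itself. The name `codePointToString` is hypothetical.

```javascript
// Sketch: build a string from one code point, using a surrogate pair
// for anything above U+FFFF. `codePointToString` is a hypothetical name.
function codePointToString(codePoint) {
  if (codePoint > 0xFFFF) {
    const offset = codePoint - 0x10000;      // 20-bit value to split
    const high = 0xD800 + (offset >> 10);    // top 10 bits -> high surrogate
    const low = 0xDC00 + (offset & 0x3FF);   // bottom 10 bits -> low surrogate
    return String.fromCharCode(high, low);
  }
  return String.fromCharCode(codePoint);
}

// String.fromCharCode keeps only the low 16 bits, which explains the 퐀:
const truncated = String.fromCharCode(0x1D400); // same as "\uD400"

// With the surrogate pair, the hex string from the question yields 𝐀.
const boldA = codePointToString(parseInt("1D400", 16));
```

Here `boldA` has length 2, because it consists of the two code units `\uD835` and `\uDC00`.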
The problem is that characters in JavaScript are (mostly) UCS-2 encoded, but a character outside the Basic Multilingual Plane can be represented in JavaScript as a UTF-16 surrogate pair.
The following function is adapted from Converting punycode with dash character to Unicode:
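The adapted function itself is missing from the text. As a rough sketch of what such an encoder might look like, assuming it takes an array of numeric code points, the following rejects lone surrogates and out-of-range values and emits surrogate pairs where needed. The name `encodeCodePoints` is hypothetical and this is not the original code.

```javascript
// Sketch: encode an array of Unicode code points as a JavaScript string.
// `encodeCodePoints` is a hypothetical name, not the original function.
function encodeCodePoints(codePoints) {
  const units = [];
  for (const value of codePoints) {
    // Surrogate values and anything past U+10FFFF are not valid code points.
    if ((value >= 0xD800 && value <= 0xDFFF) || value < 0 || value > 0x10FFFF) {
      throw new RangeError("Illegal code point: " + value);
    }
    if (value > 0xFFFF) {
      const offset = value - 0x10000;
      units.push(0xD800 | (offset >>> 10));  // high surrogate
      units.push(0xDC00 | (offset & 0x3FF)); // low surrogate
    } else {
      units.push(value);
    }
  }
  return String.fromCharCode(...units);
}

const encoded = encodeCodePoints([0x48, 0x1D400]); // "H" followed by 𝐀
```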
Section 8.4 of the ECMAScript language spec says that, when a String contains textual data, each element is considered to be a single UTF-16 code unit.
So you need to encode supplementary code points as pairs of UTF-16 code units.
The article "Supplementary Characters in the Java Platform" gives a good description of how to do this.
Once you know the UTF-16 code units, you can create a string using the JavaScript function String.fromCharCode.
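The answer's own code is not preserved here. As an illustration under that caveat, the arithmetic for the code point from the question, U+1D400, can be worked through step by step. All variable names are chosen for this example.

```javascript
// Worked example for U+1D400 (variable names are illustrative only).
const supplementary = 0x1D400;
const twentyBits = supplementary - 0x10000;       // 0xD400
const highUnit = 0xD800 + (twentyBits >> 10);     // 0xD835
const lowUnit = 0xDC00 + (twentyBits & 0x3FF);    // 0xDC00

// Pass both code units to String.fromCharCode in one call.
const fromUnits = String.fromCharCode(highUnit, lowUnit);

console.log(highUnit.toString(16), lowUnit.toString(16)); // prints "d835 dc00"
console.log(fromUnits.codePointAt(0).toString(16));       // prints "1d400"
```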
String.fromCodePoint() seems to do the trick as well. See here. Output:
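The linked example and its output are not reproduced here. A minimal sketch of the idea, assuming an environment that supports `String.fromCodePoint` (ES2015 and later), applied to the hex string from the question:

```javascript
// Sketch: String.fromCodePoint accepts code points above U+FFFF directly.
const charcode = "1D400"; // hex string, as in the question
const ch = String.fromCodePoint(parseInt(charcode, 16));

console.log(ch.length);             // prints 2 (two UTF-16 code units)
console.log(ch === "\uD835\uDC00"); // prints true

// In a browser, the result can go straight into a DOM TextNode.
if (typeof document !== "undefined") {
  document.body.appendChild(document.createTextNode(ch));
}
```

The `typeof document` guard is only there so the snippet also runs outside a browser.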