How to get the Unicode representation of the glyph displayed for a Unicode character
Windows uses the Uniscribe library to substitute typed Arabic and Indic characters based on their position. The new glyph still carries the original Unicode value of the typed character, although it has its own dedicated representation in Unicode.
How can I get the Unicode value of what is actually displayed, not what was typed?
Comments (2)
Your interpretation of what is happening in Uniscribe is not correct.
Once you have glyphs, the original information is gone; there is no reliable way to go back to Unicode.
Even without going to Arabic, there is no way to tell whether the glyph for the fi ligature (for example) comes from 'f' and 'i' (U+0066 U+0069) or from 'ﬁ' (U+FB01).
(http://www.fileformat.info/info/unicode/char/fb01/index.htm)
Also, some of the resulting glyphs do not have a Unicode value associated with them, so there is no "Unicode of what is actually displayed".
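To make the ambiguity concrete, here is a minimal sketch using Python's standard `unicodedata` module (the helper name `describe` is made up for illustration and is not part of Uniscribe or any Windows API). It only inspects code points; it does not touch glyphs or fonts. It shows that U+FB01 and the Arabic presentation forms are separate code points whose compatibility decomposition points back to the ordinary letters, which is a mapping in the character data, not something recovered from a rendered glyph.

```python
import unicodedata

def describe(ch):
    """Return (name, raw decomposition field) for one character from the UCD."""
    return unicodedata.name(ch), unicodedata.decomposition(ch)

# Two different inputs that a font may draw with the same ligature glyph.
typed_pair = "\u0066\u0069"   # 'f' followed by 'i'
ligature = "\ufb01"           # the single compatibility character

print(describe(ligature))
# ('LATIN SMALL LIGATURE FI', '<compat> 0066 0069')

# Compatibility normalization folds the ligature back into the two letters,
# so after this step the two inputs can no longer be told apart.
print(unicodedata.normalize("NFKC", ligature) == typed_pair)  # True

# An Arabic presentation form also decomposes to the letter that was typed.
print(describe("\ufe91"))
# ('ARABIC LETTER BEH INITIAL FORM', '<initial> 0628')
```

The decomposition only runs from the presentation form to the base letter. Going the other way, from a shaped glyph to a code point, would need the font's own glyph tables, and as noted above many glyphs have no code point at all.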
There are lots of tools for this, such as ICU, Charmap, and others. I myself recommend http://unicode.codeplex.com, which uses the Unicode Character Database to represent characters.
Note that Unicode is just information about characters and says nothing about their representation. The standard only suggests rendering each character like its example. So to view each code point you need a standard Unicode font such as Arial Unicode MS, which is the largest and best choice on the Windows platform. Most characters are implemented in this font, but for newer characters you need an update for it (if such an update exists), or you can use a font that you know implements the characters you want.
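As a rough illustration of the kind of lookup a Unicode Character Database tool performs, the sketch below uses Python's built-in `unicodedata` module instead of the tools named above (the function `ucd_info` is a hypothetical helper, not part of ICU, Charmap, or the linked project). It returns character properties only; whether a glyph can actually be displayed still depends on the installed font.

```python
import unicodedata

def ucd_info(ch):
    """Look up basic Unicode Character Database properties for one character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        # The second argument is a fallback for code points without a name.
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),
    }

print(ucd_info("\u0628"))
# {'code_point': 'U+0628', 'name': 'ARABIC LETTER BEH', 'category': 'Lo'}
```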