How are Unicode characters mapped to glyphs in a font?
Each character in Unicode has a code point; what is the analogous term for a character in a font?
I have never understood the part of the process where decoded text needs to be mapped to a font (or to several fonts, via some modern font substitution technology).
For example, suppose a text editor has decoded a file from its character encoding, and the text contains the Greek letter alpha α (U+03B1). What exactly is the process by which the app chooses a particular glyph in a font? Most apps have a preferred font; let's say it's Courier. (And what happens with a rarer Unicode character like the heart ♥ (U+2665) that is not in the default font? How does the app know the font doesn't contain that character?)
Does a font contain meta info about what symbols it has?
If two fonts both have the symbol alpha, do they necessarily share the same “code point”? Or does it depend on the type of font, such as Type1, Type3, TrueType, OpenType? ...
Thanks for any pointers or references.
Comments (1)
TrueType fonts consist of a number of tables. The most important ones for this question are the table of glyphs ("glyf") and the table ("cmap") that maps characters to those glyphs.
Long story short, the operating system uses the "cmap" table to convert characters into glyph indexes, substituting a default glyph for any character that has no matching entry. Unfortunately, there are multiple versions of the font file specification (not to mention different types of fonts), and those tables can store the same mappings under several different character encodings. As a result, the actual process of doing the mapping, and doing it efficiently enough that text drawing stays fast, ends up being extremely complex.
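To make the lookup concrete, here is a minimal sketch in Python. It is a toy model, not how any real OS does it: each "font" is just a dict standing in for its cmap table, the font names and glyph indexes are invented, and the fallback loop over a list of fonts is my assumption about how font substitution could be layered on top of the per-font lookup. The one convention borrowed from real fonts is that glyph index 0 is the "missing glyph".

```python
# Toy model: each "font" is a dict standing in for its cmap table,
# mapping Unicode code points to glyph indexes. All values are invented.
NOTDEF = 0  # glyph index 0 is conventionally the "missing glyph"

TOY_MONO = {0x0041: 36, 0x0061: 68, 0x03B1: 210}  # has alpha, no heart
TOY_SYMBOLS = {0x2665: 17, 0x03B1: 5}             # has heart and alpha

# Ordered list of (name, cmap) pairs, preferred font first (hypothetical).
FONTS = [("ToyMono", TOY_MONO), ("ToySymbols", TOY_SYMBOLS)]


def glyph_index(cmap, char):
    """Return the glyph index for one character, or NOTDEF if unmapped."""
    return cmap.get(ord(char), NOTDEF)


def pick_font(fonts, char):
    """Return (font_name, glyph_index) from the first font that maps char.

    If no font in the list maps the character, fall back to the
    preferred font's missing-glyph index.
    """
    for name, cmap in fonts:
        gid = glyph_index(cmap, char)
        if gid != NOTDEF:
            return name, gid
    return fonts[0][0], NOTDEF


# alpha, heart, euro sign
for ch in "\u03b1\u2665\u20ac":
    print(f"U+{ord(ch):04X} -> {pick_font(FONTS, ch)}")
```

In this model the app "knows" a font lacks a character simply because the lookup comes back as the missing glyph, and alpha gets a different glyph index in each toy font.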
A "code point" is completely independent of encodings and fonts. A particular code point is universal, but it can be encoded in many ways (UTF-8, UTF-16, etc.), and it will map to different glyph indexes in different fonts.
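A quick way to see this for the alpha from the question, using only Python's built-in codecs: one code point, three different byte sequences. The two font names and glyph indexes at the end are made up purely for illustration.

```python
ALPHA = "\u03b1"  # Greek small letter alpha


def encodings(char):
    """Return the code point and its byte sequence in three encodings."""
    return {
        "code_point": ord(char),
        "utf-8": char.encode("utf-8"),
        "utf-16-be": char.encode("utf-16-be"),
        "utf-32-be": char.encode("utf-32-be"),
    }


info = encodings(ALPHA)
print(f"code point U+{info['code_point']:04X}")
for name in ("utf-8", "utf-16-be", "utf-32-be"):
    raw = info[name]
    # Decoding any of the byte sequences recovers the same code point.
    assert ord(raw.decode(name)) == info["code_point"]
    print(name, " ".join(f"{b:02x}" for b in raw))

# Hypothetical glyph indexes: the same code point can land on a
# different index in each font, because the index is private to the font.
GLYPH_INDEX_BY_FONT = {"ToySerif": 210, "ToySans": 143}
for font, gid in GLYPH_INDEX_BY_FONT.items():
    print(f"U+{info['code_point']:04X} in {font}: glyph {gid}")
```

So the encoding only matters while decoding the file, and the glyph index only matters inside one particular font; the code point is the common currency between the two.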
Apple's developer documentation has a pretty good section on the details of TrueType fonts:
http://developer.apple.com/fonts/ttrefman/
Specifically:
Glyph table: https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6glyf.html
Character map: https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6cmap.html
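For a flavour of what the character map reference describes, here is a rough sketch of a segment-based lookup in the spirit of cmap subtable format 4. It is heavily simplified: the segment values are invented, the arrays are plain Python lists rather than parsed binary data, and I assume every segment uses only a delta (the format's idRangeOffset mechanism is left out entirely).

```python
from bisect import bisect_left

# Hypothetical segments, loosely modelled on a cmap format 4 subtable.
# Parallel arrays sorted by end code; each segment maps a contiguous
# range of code points to glyph indexes by adding a delta (mod 65536).
# The last segment is the 0xFFFF terminator, which maps to glyph 0.
END_CODES = [0x007E, 0x03C9, 0xFFFF]
START_CODES = [0x0020, 0x0391, 0xFFFF]
ID_DELTAS = [-29, -816, 1]


def lookup(code_point):
    """Return the glyph index for a code point, or 0 if it is unmapped."""
    if code_point > 0xFFFF:
        return 0  # this kind of table only covers 16-bit code points
    # Binary search for the first segment whose end code >= code point.
    i = bisect_left(END_CODES, code_point)
    if START_CODES[i] > code_point:
        return 0  # the code point falls in a gap between segments
    return (code_point + ID_DELTAS[i]) & 0xFFFF


for cp in (0x0041, 0x03B1, 0x2665):
    print(f"U+{cp:04X} -> glyph {lookup(cp)}")
```

The binary search over a handful of ranges is what keeps the lookup cheap even for fonts with thousands of glyphs, compared with storing one entry per character.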
I also recommend an application called BabelMap, which gives you a lot of interesting information about fonts. Specifically, look at Tools/Unicode Summary, Fonts/Font Analysis Utility, and Fonts/Font Information, where you can extract the entire glyph mapping table to the clipboard.