TeX hyphenation patterns: what do they represent?
If you scroll down this page a bit, you'll see UK English hyphenation patterns like:
\patterns{ % just type <return> if you're not using INITEX
.ab4i
.ab3ol
.ace4
.acet3
.ach4
.ac5tiva
What do these patterns like .ab4i mean?
There are three kinds of characters in a TeX hyphenation pattern. The dot (.) is an anchor for a word boundary. A letter stands for itself, that is, a letter in the word to be hyphenated. A digit stands for a potential hyphenation point, and its value is the hyphenation level. The original US English patterns use five levels (1 through 5); TeX itself accepts digits up to 9.
The basic idea of the algorithm is that a word is matched against all the patterns, and every pattern that matches contributes its levels to the positions between the letters it covers. If two different patterns assign levels to the same position, the higher one wins. Of the final values, only odd levels indicate allowed hyphenation points. This makes it possible to specify both possible hyphenation points and places where a hyphen should not be inserted. So, for example, if a position in a word matches two patterns, one with a 1 and one with a 2 at that position, hyphenation there is not allowed: the 2 overrides the 1, and only an odd value indicates a permitted hyphenation point.
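To make the matching rule concrete, here is a minimal Python sketch of it, using two patterns from the question. It is only an illustration under some assumptions: the function names are made up, the pattern b1i is hypothetical (added just to show an even level overriding an odd one), and it uses a plain substring search where real implementations typically use a trie. It also ignores TeX's \lefthyphenmin and \righthyphenmin limits and the hyphenation exception list.

```python
import re


def parse_pattern(pattern):
    """Split a pattern such as '.ab4i' into its letters ('.abi') and a
    list of levels, where levels[k] belongs to the gap before letters[k]."""
    letters = re.sub(r"\d", "", pattern)
    levels = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            levels[pos] = int(ch)
        else:
            pos += 1
    return letters, levels


def hyphenation_levels(word, patterns):
    """Return the final level for the gap before each letter of `word`."""
    dotted = "." + word.lower() + "."  # dots mark the word boundaries
    final = [0] * (len(dotted) + 1)
    for pattern in patterns:
        letters, levels = parse_pattern(pattern)
        start = dotted.find(letters)
        while start != -1:
            # Where several patterns cover the same gap, keep the highest level.
            for offset, level in enumerate(levels):
                final[start + offset] = max(final[start + offset], level)
            start = dotted.find(letters, start + 1)
    # Drop the slots that belong to the boundary dots.
    return final[1:len(word) + 1]


def hyphenate(word, patterns):
    """Insert '-' at every inner gap whose final level is odd."""
    levels = hyphenation_levels(word, patterns)
    pieces, last = [], 0
    for j in range(1, len(word)):
        if levels[j] % 2 == 1:
            pieces.append(word[last:j])
            last = j
    pieces.append(word[last:])
    return "-".join(pieces)


print(hyphenate("abide", ["b1i"]))              # hypothetical pattern alone
print(hyphenate("abide", ["b1i", ".ab4i"]))     # the 4 overrides the 1
print(hyphenate("activate", [".ac5tiva"]))      # the 5 permits a break
```

With only the hypothetical b1i, the word abide gets a break between b and i; adding .ab4i raises that gap to level 4, which is even, so the break disappears. The pattern .ac5tiva puts an odd level between c and t in activate.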
Looking at your examples, .ab4i indicates that abi at the start of a word will rarely receive a hyphen between the b and the i, because a level of 4, being even, inhibits hyphenation unless it is overridden by a higher odd level such as 5. On the other hand, a word beginning with activa can be hyphenated between the c and the t, because the 5 overrides any lower value and, being odd, permits hyphenation.
These patterns are created using a tool called patgen2. There's TeX source for a tutorial about this tool at patgen2.tutorial, and the Ph.D. thesis on this topic is available through tug.org.