I'm working on a project that generates PDFs that can contain fairly complex math and science formulas. The text is rendered in Times New Roman (TNR), which has pretty good Unicode coverage, but it isn't complete. We have a system in place to swap in a more Unicode-complete font for code points that don't have a glyph in TNR (like most of the "stranger" math symbols), but I can't seem to find a way to query the *.ttf file to see if a given glyph is present. So far, I've just hard-coded a lookup table of which code points are present, but I'd much prefer an automatic solution.
I'm using VB.Net in a web system under ASP.Net, but solutions in any programming language/environment would be appreciated.
Edit: The Win32 solution looks excellent, but the specific case I'm trying to solve is in an ASP.Net web system. Is there a way to do this without including the Windows API DLLs in my web site?
Here's a pass at it using C# and the Windows API.
Then, given a char toCheck that you want to check and a Font theFont to test it against...
Same code using VB.Net
I have done this with just a VB.Net unit test and no Win32 API calls. It includes code to check the specific characters U+2026 (ellipsis) and U+2409 (HTab), and it also returns the number of characters that have glyphs, along with the lowest and highest values. I was only interested in monospace fonts, but that is easy enough to change ...
The output was
The code posted by Scott Nichols is great, except for one bug: if the glyph id is greater than Int16.MaxValue, it throws an OverflowException. To fix it, I added the following function:
And then changed the main for loop in the function GetUnicodeRangesForFont to look like this:
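The helper function and the modified loop are not reproduced here. As a rough illustration of the underlying issue only (not the poster's code), the Python sketch below shows how a 16-bit field read as signed goes negative once its value exceeds Int16.MaxValue (32767), and how reinterpreting the same bits as unsigned recovers the intended value. The function names and the little-endian byte order are assumptions made for the example.

```python
import struct

def read_int16(buf, offset):
    """Read 2 bytes as a SIGNED 16-bit value (values above 32767 come back negative)."""
    return struct.unpack_from("<h", buf, offset)[0]

def read_uint16(buf, offset):
    """Read the same 2 bytes as an UNSIGNED 16-bit value (0..65535)."""
    return struct.unpack_from("<H", buf, offset)[0]

def to_unsigned16(value):
    """Reinterpret an already-read signed 16-bit value as unsigned by masking."""
    return value & 0xFFFF

if __name__ == "__main__":
    raw = struct.pack("<H", 0x8001)          # a value just above Int16.MaxValue
    print(read_int16(raw, 0))                # negative when read as signed
    print(to_unsigned16(read_int16(raw, 0))) # masked back into 0..65535
    print(read_uint16(raw, 0))               # read as unsigned directly
```

In .NET the equivalent idea is to convert through an unchecked or masked step instead of a checked cast from a negative Int16.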
This Microsoft KB article may help:
http://support.microsoft.com/kb/241020
It's a bit dated (it was originally written for Windows 95), but the general principle may still apply. The sample code is C++, but since it's just calling standard Windows APIs, it will more than likely work from .NET languages as well with a little elbow grease.
-Edit-
It seems that the old 95-era APIs have been superseded by a newer API Microsoft calls "Uniscribe", which should be able to do what you need.
FreeType is a library that can read TrueType font files (among others) and can be used to query the font for a specific glyph. However, FreeType is designed for rendering, so using it might cause you to pull in more code than you need for this solution.
Unfortunately, there's not really a clear solution even within the world of OpenType / TrueType fonts; the character-to-glyph mapping has about a dozen different definitions depending on the type of font and what platform it was originally designed for. You might try to look at the cmap table definition in Microsoft's copy of the OpenType spec, but it's not exactly easy reading.
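For a sense of what reading the cmap table directly involves, here is a minimal Python sketch that needs no Windows API. It is an illustration under stated assumptions, not a complete reader: it handles only single-font .ttf/.otf files (no .ttc collections), only Unicode subtables in formats 4 and 12, and it does no validation beyond a bounds check. The function names are hypothetical.

```python
import struct

def _table(data, tag):
    """Return the file offset of a table in a single-font sfnt file."""
    num_tables = struct.unpack_from(">H", data, 4)[0]
    for i in range(num_tables):
        t, _sum, offset, _length = struct.unpack_from(">4sLLL", data, 12 + 16 * i)
        if t == tag:
            return offset
    raise ValueError("no %r table" % tag)

def _format4(data, sub):
    """Yield code points that map to a non-zero glyph in a format 4 subtable."""
    seg_count = struct.unpack_from(">H", data, sub + 6)[0] // 2
    end_at = sub + 14
    start_at = end_at + 2 * seg_count + 2      # +2 skips reservedPad
    delta_at = start_at + 2 * seg_count
    range_at = delta_at + 2 * seg_count
    for i in range(seg_count):
        end = struct.unpack_from(">H", data, end_at + 2 * i)[0]
        start = struct.unpack_from(">H", data, start_at + 2 * i)[0]
        delta = struct.unpack_from(">H", data, delta_at + 2 * i)[0]
        ro_pos = range_at + 2 * i
        range_offset = struct.unpack_from(">H", data, ro_pos)[0]
        for cp in range(start, end + 1):
            if range_offset == 0:
                glyph = (cp + delta) & 0xFFFF  # idDelta arithmetic is modulo 65536
            else:
                pos = ro_pos + range_offset + 2 * (cp - start)
                if pos + 2 > len(data):        # tolerate out-of-range offsets
                    continue
                glyph = struct.unpack_from(">H", data, pos)[0]
                if glyph:
                    glyph = (glyph + delta) & 0xFFFF
            if glyph:                          # glyph 0 means "missing"
                yield cp

def _format12(data, sub):
    """Yield code points that map to a non-zero glyph in a format 12 subtable."""
    num_groups = struct.unpack_from(">L", data, sub + 12)[0]
    for g in range(num_groups):
        start, end, glyph = struct.unpack_from(">LLL", data, sub + 16 + 12 * g)
        for cp in range(start, end + 1):
            if glyph + (cp - start):
                yield cp

def covered_code_points(data):
    """Return the set of Unicode code points the font maps to a real glyph."""
    cmap = _table(data, b"cmap")
    num_sub = struct.unpack_from(">H", data, cmap + 2)[0]
    covered = set()
    for i in range(num_sub):
        platform, encoding, offset = struct.unpack_from(">HHL", data, cmap + 4 + 8 * i)
        # Only Unicode subtables: platform 0, or Windows (3) with encoding 1 or 10.
        if not (platform == 0 or (platform == 3 and encoding in (1, 10))):
            continue
        sub = cmap + offset
        fmt = struct.unpack_from(">H", data, sub)[0]
        if fmt == 4:
            covered.update(_format4(data, sub))
        elif fmt == 12:
            covered.update(_format12(data, sub))
    return covered
```

Usage would look something like `coverage = covered_code_points(open("times.ttf", "rb").read())` followed by `0x2026 in coverage`, where the file path is a placeholder. Building the set once per font and caching it could replace a hard-coded lookup table; fonts that only ship other subtable formats would need additional cases.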
Scott's answer is good. Here is another approach that is probably faster if you are checking just a couple of strings per font (in our case, one string per font), but probably slower if you are using one font to check a ton of text.