iTextSharp international text

Posted on 2024-08-10 16:40:05

I have a table in an ASP.NET page and am trying to export it as a PDF file. A couple of international characters are not shown in the generated PDF file. Any suggestions?

Thanks in advance

Comments (4)

笑看君怀她人 2024-08-17 16:40:05

The key to displaying alternate character sets (Russian, Chinese, Japanese, etc.) properly is to use IDENTITY_H encoding when creating the BaseFont.

Dim bfR As iTextSharp.text.pdf.BaseFont
bfR = iTextSharp.text.pdf.BaseFont.CreateFont("MyFavoriteFont.ttf", iTextSharp.text.pdf.BaseFont.IDENTITY_H, iTextSharp.text.pdf.BaseFont.EMBEDDED)

IDENTITY_H provides Unicode support for your chosen font, so you should be able to display pretty much any character the font contains. I've used it for Russian, Greek, and all the different European language letters.

EDIT - 2013-May-28

This also works for v5.0.2 of iTextSharp.

EDIT - 2015-June-23

Given below is a complete code sample (in C#):

private void CreatePdf()
{
  // Sample text with characters outside the basic Latin range.
  string testText = "đĔĐěÇøç";
  string tmpFile = @"C:\test.pdf";
  string myFont = @"C:\<<valid path to the font you want>>\verdana.ttf";

  // Page size in points, with 10-point margins on every side.
  iTextSharp.text.Rectangle pgeSize = new iTextSharp.text.Rectangle(595, 792);
  iTextSharp.text.Document doc = new iTextSharp.text.Document(pgeSize, 10, 10, 10, 10);
  iTextSharp.text.pdf.PdfWriter wrtr;
  wrtr = iTextSharp.text.pdf.PdfWriter.GetInstance(doc,
      new System.IO.FileStream(tmpFile, System.IO.FileMode.Create));
  doc.Open();
  doc.NewPage();

  // IDENTITY_H plus EMBEDDED is what makes the non-ASCII characters show up.
  iTextSharp.text.pdf.BaseFont bfR;
  bfR = iTextSharp.text.pdf.BaseFont.CreateFont(myFont,
    iTextSharp.text.pdf.BaseFont.IDENTITY_H,
    iTextSharp.text.pdf.BaseFont.EMBEDDED);

  iTextSharp.text.BaseColor clrBlack =
      new iTextSharp.text.BaseColor(0, 0, 0);
  iTextSharp.text.Font fntHead =
      new iTextSharp.text.Font(bfR, 12, iTextSharp.text.Font.NORMAL, clrBlack);

  // Write one paragraph in the embedded font, then close the document.
  iTextSharp.text.Paragraph pgr =
      new iTextSharp.text.Paragraph(testText, fntHead);
  doc.Add(pgr);
  doc.Close();
}

This is a screenshot of the PDF file that is created:

[image: sample pdf]

An important point to remember is that if the font you have chosen does not support the characters you are trying to send to the PDF file, nothing you do in iTextSharp is going to change that. Verdana nicely displays the characters from all the European languages I know of. Other fonts may not be able to display as many characters.

月寒剑心 2024-08-17 16:40:05

There are two potential reasons characters aren't rendered:

  1. The encoding. As Stewbob pointed out, Identity-H is a great way to avoid the issue entirely, though it does require you to embed a subset of the font. This has two consequences.
    1. It increases the file size a bit over unembedded fonts.
    2. The font has to be licensed for embedded subsets. Most are, some are not.
  2. The font has to contain that character. If you ask for some Arabic ligatures out of a Cyrillic (Russian) font, chances aren't good that it'll be there. There are very few fonts that cover a variety of languages, and they tend to be HUGE. The biggest/most comprehensive font I've run into was "Arial Unicode MS". Over 23 megabytes.

That's another good reason to require embedding SUBSETS. Tacking on a few megabytes because you wanted to add a couple of Chinese glyphs is a bit steep.

If you're feeling paranoid, you can check your strings against a given BaseFont instance (which I believe takes the encoding into account as well) with myBaseFont.charExists(someChar). If you have a font you're confident in, I wouldn't bother.
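
A minimal sketch of that check, in Java to match the charExists spelling above. The class name GlyphCoverage and the helper missing are made up for illustration, and the font is abstracted as an IntPredicate so the snippet runs without iText on the classpath. With a real font you would presumably pass myBaseFont::charExists (in iText 5 for Java, BaseFont.charExists takes an int); in iTextSharp the method should be spelled CharExists, following .NET casing.

```java
import java.util.function.IntPredicate;

final class GlyphCoverage {
    private GlyphCoverage() {}

    /**
     * Returns the distinct code points of text for which charExists
     * reports false, in first-seen order.
     */
    static String missing(String text, IntPredicate charExists) {
        StringBuilder out = new StringBuilder();
        text.codePoints()
            .distinct()
            .filter(cp -> !charExists.test(cp))
            .forEach(out::appendCodePoint);
        return out.toString();
    }

    public static void main(String[] args) {
        // Stand-in for a real font (an assumption for this demo):
        // pretend only U+0000..U+00FF have glyphs.
        IntPredicate latin1Only = cp -> cp <= 0xFF;

        // The sample string from the first answer, written with Unicode
        // escapes so the source file stays plain ASCII.
        String text = "\u0111\u0114\u0110\u011B\u00C7\u00F8\u00E7";

        String gaps = missing(text, latin1Only);
        System.out.println(gaps.length() + " character(s) without a glyph");
    }
}
```

An empty result means every character passed the check. Anything else is worth logging before you generate the PDF, since missing glyphs otherwise just silently disappear from the output.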

PS: There's another good reason that Identity-H requires an embedded subset. Identity-H reads the bytes from the content stream as glyph indexes. The order of glyphs can vary wildly from one font to the next, or even between versions of the same font. Relying on a viewer's system to have the EXACT same font is a bad idea, so it's illegal, particularly when Acrobat/Reader starts substituting fonts because it couldn't find the exact font you asked for and you didn't embed it.
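
To make the glyph-index point concrete, here is a toy illustration in Java. It is not iText code and the glyph orders are invented, not taken from any real font. It only shows what happens when indexes written against one glyph order are read back against another, which is roughly the situation a viewer is in when it substitutes a font for Identity-H text. It assumes every character of the input exists in the glyph order.

```java
import java.util.Arrays;
import java.util.List;

final class GlyphIndexDemo {
    private GlyphIndexDemo() {}

    /** Encodes text as glyph indexes using the given (hypothetical) glyph order. */
    static int[] encode(String text, List<Character> glyphOrder) {
        return text.chars().map(ch -> glyphOrder.indexOf((char) ch)).toArray();
    }

    /** Turns glyph indexes back into characters using a glyph order. */
    static String decode(int[] glyphIds, List<Character> glyphOrder) {
        StringBuilder out = new StringBuilder();
        for (int id : glyphIds) {
            out.append(glyphOrder.get(id));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Two invented glyph orders standing in for two versions of a font.
        List<Character> fontV1 = Arrays.asList('A', 'B', 'C');
        List<Character> fontV2 = Arrays.asList('C', 'A', 'B');

        int[] written = encode("CAB", fontV1);

        // Same font on both sides: the text survives.
        System.out.println(decode(written, fontV1));
        // Different glyph order on the viewer's side: the text is scrambled.
        System.out.println(decode(written, fontV2));
    }
}
```

With the embedded subset, the writer and the viewer always share the same glyph order, so the second case cannot happen.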

海风掠过北极光 2024-08-17 16:40:05

You can try setting the encoding for the font you are using. In Java it would be something like this:

BaseFont bf = BaseFont.createFont(BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.EMBEDDED);

where BaseFont.CP1252 is the encoding. Search for the exact encoding that covers the characters you want to display.
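
One rough way to see whether a single-byte encoding such as CP1252 can represent your text at all is to ask Java's own charset tables. This is only a sketch: the class name EncodingCheck and the helper unencodable are made up, it uses the JDK's "windows-1252" charset rather than iText's internal encoding tables (so treat the result as an approximation), and it checks one UTF-16 char at a time, so it is only meaningful for characters in the Basic Multilingual Plane.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

final class EncodingCheck {
    private EncodingCheck() {}

    /** Returns the characters of text that the named charset cannot encode. */
    static String unencodable(String text, String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        StringBuilder out = new StringBuilder();
        for (char ch : text.toCharArray()) {
            if (!encoder.canEncode(ch)) {
                out.append(ch);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The sample string from the first answer, written with Unicode
        // escapes so the source file stays plain ASCII.
        String text = "\u0111\u0114\u0110\u011B\u00C7\u00F8\u00E7";

        String lost = unencodable(text, "windows-1252");
        System.out.println(lost.length() + " of " + text.length()
                + " characters cannot be encoded in windows-1252");
    }
}
```

If the result is not empty, no amount of font tweaking will help with that encoding, and a different code page or IDENTITY_H is the way to go.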

金橙橙 2024-08-17 16:40:05

This is caused by the default iTextSharp font, Helvetica, which does not support characters beyond the basic set (or at least not all of them).

There are actually 2 options:

  1. One is to rewrite the table content by hand in the code. This approach might look faster to you, but it requires any modification to the original table to be repeated in the code as well (breaking the DRY principle). In this case, you can easily set up the font as you wish.
  2. The other is to produce the PDF from HTML extracted from HtmlEngine. This might sound more complicated (and it is); however, a working solution is much more flexible and universal. I struggled with special characters myself a while ago and decided to post a somewhat complete solution under another similar answer here on Stack Overflow: https://stackoverflow.com/a/24587745/1138663