Problem using xhtml2pdf with Unicode

Posted 2024-09-29 19:43:50

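I have been trying to convert Hebrew HTML files without success: whichever encoding I try, the Hebrew characters appear as black rectangles in the output PDF.

I tried some of the Unicode test files included in the pisa distribution: pisa-3.0.33\test\test-unicode-all.html and \test-bidirectional-text.html. I ran xhtml2pdf from the command line both with and without --encoding utf-8. The result was the same: none of the non-Latin characters came through.

Is this a font issue*? If the Unicode test files work for you, did you do any setup?

*FWIW, at least some of these languages (including Hebrew) should be supported by Arial.

EDIT: Alternatively, if anyone has pisa set up and could try converting the Unicode test files above, I would be very grateful.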

笑脸一如从前 2024-10-06 19:43:50

Inserting the following code into the HTML helped me:

<style>
@page {
    size: a4;
    margin: 0.5cm;
}

@font-face {
    font-family: "Verdana";
    src: url("verdana.ttf");
}

html {
    font-family: Verdana;
    font-size: 11pt;
}
</style>

In the url, replace "verdana.ttf" with the absolute path to the font file on your system.
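If the HTML is generated from Python, the absolute font path can be filled in programmatically. The following is only a minimal sketch: the helper name `font_face_style` and its parameters are made up for illustration and are not part of xhtml2pdf, and it assumes the font file sits at the path you pass in.

```python
from pathlib import Path


def font_face_style(font_file: str, family: str = "Verdana") -> str:
    """Return a <style> block whose @font-face src is an absolute path.

    Hypothetical helper for illustration; not an xhtml2pdf API.
    """
    # resolve() turns a relative path into an absolute one;
    # as_posix() uses forward slashes, which are safe inside url("...").
    font_path = Path(font_file).resolve().as_posix()
    return f"""<style>
@font-face {{
    font-family: "{family}";
    src: url("{font_path}");
}}

html {{
    font-family: "{family}";
    font-size: 11pt;
}}
</style>"""


# Build the style block for a font file in the current directory.
style_block = font_face_style("verdana.ttf")
print(style_block)
```

The returned string can be placed in the head of the HTML before it is passed to xhtml2pdf.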


不必在意 2024-10-06 19:43:50

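In case anyone else tries, as I did, to work out how to properly create a PDF containing Hebrew with xhtml2pdf, here is what worked for me:

First, include the font settings in the HTML as described by @eviltrue. Any font will do as long as it supports Hebrew characters; otherwise every Hebrew character in the input HTML simply appears as a black rectangle in the PDF.
At the time of writing, xhtml2pdf can output Hebrew characters to PDF, but it outputs them in reverse order, i.e. שלום כיתה א
would be א התיכ םולש.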
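The reversal described above can be reproduced in a few lines of Python. This is only an illustration of the symptom, under the assumption that the renderer lays characters out left to right in stored order. It is not how xhtml2pdf is implemented, and plain reversal is only valid for a run of purely right-to-left text.

```python
# Hebrew text in logical order: the order in which it is typed and stored.
logical = "שלום כיתה א"

# A renderer that ignores text direction draws the stored characters from
# left to right, so a right-to-left reader sees them back to front.
as_rendered = logical[::-1]

# Reversing a second time recovers the original text, which is why
# reordering the input before rendering can cancel the effect.
restored = as_rendered[::-1]

print(as_rendered)
print(restored == logical)
```

Text that mixes Hebrew with English or digits cannot be handled by simple reversal, which is where a full bidi implementation comes in.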

At this point I was stuck, until I stumbled upon this SO answer:
https://stackoverflow.com/a/15449145/1918837

After installing the python-bidi package, here is an example of a complete solution (used in a Python app):

from bidi import algorithm as bidialg
from xhtml2pdf import pisa

HTMLINPUT = """
            <!DOCTYPE html>
            <html>
            <head>
               <meta http-equiv="content-type" content="text/html; charset=utf-8">
               <style>
                  @page {
                      size: a4;
                      margin: 1cm;
                  }

                  @font-face {
                      font-family: DejaVu;
                      src: url(my_fonts_dir/DejaVuSans.ttf);
                  }

                  html {
                      font-family: DejaVu;
                      font-size: 11pt;
                  }
               </style>
            </head>
            <body>
               <div>Something in English - משהו בעברית</div>
            </body>
            </html>
            """

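# base_dir="L" keeps the "<" and ">" signs in the HTML tags from being
# flipped by the bidi algorithm.
display_html = bidialg.get_display(HTMLINPUT, base_dir="L")

# "output.pdf" is a placeholder file name; the original snippet passed an
# output file object without showing how it was opened.
with open("output.pdf", "wb") as output_file:
    pdf = pisa.CreatePDF(display_html, dest=output_file)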

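The nice thing about the bidi algorithm is that you can mix RTL and LTR languages on the same line (as in the HTML example above) and still get a correctly formatted result.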
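An alternative to running the whole document through the bidi algorithm is to reorder only the text between tags, so the markup is never touched. The sketch below is an assumption-laden illustration and not part of the original answer: `reorder_text_nodes` and `naive_reverse` are hypothetical names, the tag pattern is deliberately simple (it would mis-handle a ">" inside an attribute value or a comment), and `naive_reverse` is only a stand-in so the example runs without python-bidi. In real use you would presumably pass `bidialg.get_display` as the `reorder` argument.

```python
import re
from typing import Callable

# Matches one HTML tag. A deliberately simple pattern for illustration only.
TAG_PATTERN = re.compile(r"(<[^>]*>)")


def reorder_text_nodes(html: str, reorder: Callable[[str], str]) -> str:
    """Apply `reorder` to the text between tags and leave the tags untouched."""
    # Splitting on a capturing group keeps the tags in the resulting list.
    parts = TAG_PATTERN.split(html)
    return "".join(
        part if TAG_PATTERN.fullmatch(part) else reorder(part)
        for part in parts
    )


def naive_reverse(text: str) -> str:
    """Stand-in for a bidi function: reverses the characters of a text run."""
    return text[::-1]


# The tags survive unchanged; only the text run between them is reordered.
print(reorder_text_nodes("<div>abc</div>", naive_reverse))
```

Because the tags are skipped, this approach does not depend on the base direction to keep the angle brackets in place.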

EDIT:
The best way to go now is definitely wkhtmltopdf.
