A Jinja-like tool for PDF in Python

Posted 2024-08-17 11:32:34, 75 words, 8 views, 0 comments

I am looking for the most accurate tool for generating PDFs in Python that works the way Jinja does for HTML.

What are your suggestions?

Comments (8)

桃扇骨 2024-08-24 11:32:34

As jbochi answered, ReportLab is the foundation for almost all Python projects that generate PDFs.

But for your needs you might want to check out Pisa / xhtml2pdf. You would generate your HTML with a Jinja template and then use Pisa to convert the HTML to PDF. Pisa is built on top of ReportLab.
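
A minimal sketch of that two-step pipeline, assuming the `jinja2` and `xhtml2pdf` packages are installed. The template text and the helper names `render_invoice_html` and `html_to_pdf` are illustrative and do not come from the answer itself.

```python
from jinja2 import Template

# Illustrative template; any Jinja template that produces HTML works here.
INVOICE_TEMPLATE = Template(
    "<html><body>"
    "<h1>Invoice for {{ customer }}</h1>"
    "<ul>{% for item in items %}<li>{{ item }}</li>{% endfor %}</ul>"
    "</body></html>"
)


def render_invoice_html(customer: str, items: list[str]) -> str:
    """Step 1: render the HTML with Jinja."""
    return INVOICE_TEMPLATE.render(customer=customer, items=items)


def html_to_pdf(html: str, pdf_path: str) -> bool:
    """Step 2: convert the HTML to PDF with Pisa (xhtml2pdf)."""
    # Imported here so the rendering step can be used without xhtml2pdf.
    from xhtml2pdf import pisa

    with open(pdf_path, "wb") as pdf_file:
        status = pisa.CreatePDF(html, dest=pdf_file)
    return not status.err  # True when the conversion reported no errors
```

Calling `html_to_pdf(render_invoice_html("ACME", ["Widget", "Gadget"]), "invoice.pdf")` would then write the PDF. Keeping the two steps separate means the HTML can be inspected in a browser before it is converted.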

Edit: another option I'd forgotten about is wkhtmltopdf

太阳男子 2024-08-24 11:32:34

Have a look at ReportLab Toolkit.

You can use templates only with the commercial version, though.

用心笑 2024-08-24 11:32:34

There's now a new kid on the block called WeasyPrint.

熊抱啵儿 2024-08-24 11:32:34

I had exactly the same requirement as the OP. Unfortunately WeasyPrint wasn't a viable solution, because I needed very exact positioning and barcode support. After a few days of work I finished a reportlab XML wrapper with Jinja2 support.

The code can be found on GitHub, including an example XML file that generates a sample PDF.

木槿暧夏七纪年 2024-08-24 11:32:34

What about going from Python/Jinja to reST or HTML, and then from reST or HTML to PDF using either rst2pdf or pandoc?

Both have worked well for me but, like plaes, I may try WeasyPrint in the future.

各自安好 2024-08-24 11:32:34

What could be a more accurate PDF tool in Python that works like Jinja than Jinja itself?

You just have to make sure that the Jinja block, variable, and comment identification strings do not conflict with the LaTeX commands. Once you change the Jinja environment to mimic the LaTeX environment you're ready to go!

Here's a snippet that works out of the box:

Python Source: ./create_pdf.py

import os

import jinja2

# The delimiters are raw strings so that the backslashes are taken literally
# ('\B', '\V' and '\#' are not valid Python escape sequences).
latex_jinja_env = jinja2.Environment(
    block_start_string    = r'\BLOCK{',
    block_end_string      = '}',
    variable_start_string = r'\VAR{',
    variable_end_string   = '}',
    comment_start_string  = r'\#{',
    comment_end_string    = '}',
    line_statement_prefix = '%%',
    line_comment_prefix   = '%#',
    trim_blocks           = True,
    autoescape            = False,
    loader                = jinja2.FileSystemLoader(os.path.abspath('./latex/'))
)
template = latex_jinja_env.get_template('latex_template.tex')

# populate a dictionary with the variables of interest
template_vars = {
    'section_1': 'The Section 1 Title',
    'section_2': 'The Section 2 Title',
}

# render the template with those variables and save the generated LaTeX
with open('./generated_latex.tex', 'w') as output_file:
    output_file.write(template.render(template_vars))

LaTeX Template: ./latex/latex_template.tex

\documentclass{article}
\begin{document}
\section{Example}
An example document using \LaTeX, Python, and Jinja.

% This is a regular LaTeX comment
\section{\VAR{section_1}}
\begin{itemize}
\BLOCK{ for x in range(0,3) }
  \item Counting: \VAR{x}
\BLOCK{ endfor }
\end{itemize}

\#{This is a long-form Jinja comment}
\BLOCK{ if subsection_1_1 }
\subsection{ The subsection }
This appears only if subsection_1_1 variable is passed to renderer.
\BLOCK{ endif }

%# This is a short-form Jinja comment
\section{\VAR{section_2}}
\begin{itemize}
%% for x in range(0,3)
  \item Counting: \VAR{x}
%% endfor
\end{itemize}

\end{document}
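
Values substituted through `\VAR{...}` end up in the LaTeX source verbatim, so a value containing characters such as `&`, `%` or `_` would break or alter the document. Below is a minimal sketch of an escaping helper. The name `latex_escape` and the replacement table are assumptions for illustration and are not part of the original snippet.

```python
import re

# Replacement for each character that LaTeX treats specially in normal text.
_LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}
_LATEX_SPECIALS_RE = re.compile("|".join(re.escape(ch) for ch in _LATEX_SPECIALS))


def latex_escape(value: object) -> str:
    """Escape LaTeX special characters in a single pass over the text."""
    # One pass avoids re-escaping the braces of an inserted \textbackslash{}.
    return _LATEX_SPECIALS_RE.sub(lambda m: _LATEX_SPECIALS[m.group()], str(value))
```

With the environment defined in create_pdf.py, the helper could be registered as a filter with `latex_jinja_env.filters['latex_escape'] = latex_escape` and applied in the template as `\VAR{section_1 | latex_escape}`.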

Now simply call: $> python ./create_pdf.py

Resulting LaTeX Source: ./generated_latex.tex

\documentclass{article}
\begin{document}
\section{Example}
An example document using \LaTeX, Python, and Jinja.

% This is a regular LaTeX comment
\section{The Section 1 Title}
\begin{itemize}
  \item Counting: 0
  \item Counting: 1
  \item Counting: 2
\end{itemize}

\section{The Section 2 Title}
\begin{itemize}
  \item Counting: 0
  \item Counting: 1
  \item Counting: 2
\end{itemize}

\end{document}
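
The script only writes a `.tex` file, so producing the PDF still requires a LaTeX compiler. Below is a minimal sketch of that last step, assuming `pdflatex` is installed and on the PATH. The helper names `build_pdflatex_command` and `compile_latex` are hypothetical.

```python
import subprocess
from pathlib import Path


def build_pdflatex_command(tex_path: str, out_dir: str = ".") -> list[str]:
    """Build the pdflatex invocation for a generated .tex file."""
    return [
        "pdflatex",
        "-interaction=nonstopmode",  # do not stop and prompt on errors
        f"-output-directory={out_dir}",
        tex_path,
    ]


def compile_latex(tex_path: str, out_dir: str = ".") -> Path:
    """Run pdflatex and return the path of the PDF it should produce."""
    # check=True raises CalledProcessError if pdflatex exits with an error.
    subprocess.run(build_pdflatex_command(tex_path, out_dir), check=True)
    return Path(out_dir) / (Path(tex_path).stem + ".pdf")
```

For the example above, `compile_latex("./generated_latex.tex")` would be expected to leave `generated_latex.pdf` in the current directory.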

柏拉图鍀咏恒 2024-08-24 11:32:34

If you want to use an existing PDF as a template without altering the original document, you can use the Dhek template editor, which lets you define areas (bounds, name, type) in a separate template file.

The template is saved in JSON format, so it can be parsed in Python to fill the areas over the PDF and generate the final document (e.g. with values from a web form).

See documentation at https://github.com/applicius/dhek .

[EDIT]

The initial answer was from the author of Dhek.
I have used this tool and it works well when your form was not generated in the usual way (it even works on PDFs made from images).

After you have downloaded, unzipped, and run DHEK (no install needed, it is portable), you can select areas and give them names.

You can then save the "mapping" to JSON so you can get the positions and dimensions of the areas:

{
    "pages": [
        {
            "areas": [
                {
                    "name": "FirstName",
                    "x": 198.48648648648648,
                    "type": "text",
                    "y": 151.22779922779924,
                    "height": 15.75289575289574,
                    "width": 181.15830115830119
                },
                {
                    "name": "LastName",
                    "x": 195.33590733590734,
                    "type": "text",
                    "y": 176.43243243243245,
                    "height": 18.115830115830107,
                    "width": 185.0965250965251
                }
            ]
        }
    ],
    "format": "dhek-1.0.13"
}

You can then use these positions with reportlab to create a PDF that contains the text:

from reportlab.pdfgen.canvas import Canvas


def write_text(
    canvas: Canvas, txt: str, x: float, y: float, height: float, in_middle: bool = True
) -> None:
    """Write text in a form area (at mid-height when in_middle is true)"""
    # Distance from the top edge of the area down to the text baseline.
    offset = height / 2 if in_middle else height
    if canvas.bottomup:
        # Origin is at the bottom-left: flip the top-left based coordinate.
        y = canvas._pagesize[1] - y - offset
    else:
        y = y + offset
    canvas.drawString(x, y, txt)


def create_overlay(overlay_path: str) -> None:
    """
    Create the data that will be overlaid on top
    of the form that we want to fill
    """
    c = Canvas(overlay_path, bottomup=0)  # DHEK has (0,0) at top-left
    write_text(c, "Mike", 198.48648648648648, 151.22779922779924, 15.75289575289574)
    write_text(c, "Jagger", 195.33590733590734, 176.43243243243245, 18.115830115830107)

    c.save()


create_overlay("form_overlay.pdf")
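
The coordinates above are typed in by hand. Below is a sketch of reading them from the saved mapping instead, using only the standard library. The helper name `load_areas` is hypothetical, the JSON layout is assumed from the single example shown earlier, and the values are rounded for brevity.

```python
import json


def load_areas(mapping_json: str) -> dict[str, dict]:
    """Index every area of a DHEK mapping by its name."""
    mapping = json.loads(mapping_json)
    areas = {}
    for page_index, page in enumerate(mapping["pages"]):
        for area in page.get("areas", []):
            # Remember the page so multi-page forms can be handled too.
            areas[area["name"]] = {**area, "page": page_index}
    return areas


# Trimmed copy of the mapping shown earlier (values rounded).
EXAMPLE_MAPPING = """
{"pages": [{"areas": [
    {"name": "FirstName", "x": 198.48, "type": "text",
     "y": 151.22, "height": 15.75, "width": 181.15},
    {"name": "LastName", "x": 195.33, "type": "text",
     "y": 176.43, "height": 18.11, "width": 185.09}
]}], "format": "dhek-1.0.13"}
"""

areas = load_areas(EXAMPLE_MAPPING)
first = areas["FirstName"]
print(first["x"], first["y"], first["height"])
```

Each entry can then be passed on to `write_text`, for example `write_text(c, "Mike", first["x"], first["y"], first["height"])`, so the field values can come from a dictionary keyed by area name.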

You can then use any tool / pdf library (e.g. pdfrw) to merge the two in a single page:

import pdfrw

def merge_pdfs(form_pdf, overlay_pdf, output):
    """
    Merge the specified fillable form PDF with the
    overlay PDF and save the output
    """
    form = pdfrw.PdfReader(form_pdf)
    olay = pdfrw.PdfReader(overlay_pdf)

    for form_page, overlay_page in zip(form.pages, olay.pages):
        merge_obj = pdfrw.PageMerge()
        overlay = merge_obj.add(overlay_page)[0]
        pdfrw.PageMerge(form_page).add(overlay).render()

    writer = pdfrw.PdfWriter()
    writer.write(output, form)


merge_pdfs("form.pdf", "form_overlay.pdf", "form_filled.pdf")

(The last part of the code, which creates the overlay and merges it, comes from the excellent blog "Mouse vs Python": https://www.blog.pythonlibrary.org/2018/05/22/filling-pdf-forms-with-python/)

樱桃奶球 2024-08-24 11:32:34

... There is also the library pdfjinja that is for this purpose: https://github.com/rammie/pdfjinja

It uses annotations in the PDF to define the template values.

In my use case, I didn't have a PDF with proper form fields so the solution suggested by cchantep was more suitable.
