
A Jinja-like tool for PDF in Python

Posted on 2024-08-17 11:32:34

I am looking for the most accurate PDF tool in Python that works the way Jinja does for HTML.

What are your suggestions?


桃扇骨 2024-08-24 11:32:34

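As jbochi answered, ReportLab is the foundation for almost all Python projects that generate PDFs.

For your needs, though, you may want to look at Pisa / xhtml2pdf. You generate the HTML with a Jinja template and then use Pisa to convert the HTML to PDF. Pisa is built on top of ReportLab.

Edit: another option I had forgotten about is wkhtmltopdf.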
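
The Jinja-to-HTML-to-PDF route described above could look roughly like the sketch below. It assumes the `jinja2` and `xhtml2pdf` packages are installed; the template text and the helper names `render_html` and `html_to_pdf` are made up for illustration and are not part of either library.

```python
from jinja2 import Template

# A tiny inline template; a real project would more likely load it from a file.
HTML_TEMPLATE = Template(
    "<html><body>"
    "<h1>{{ title }}</h1>"
    "<ul>{% for item in items %}<li>{{ item }}</li>{% endfor %}</ul>"
    "</body></html>"
)


def render_html(title, items):
    """Step 1: fill the Jinja template and return the HTML as a string."""
    return HTML_TEMPLATE.render(title=title, items=items)


def html_to_pdf(html, pdf_path):
    """Step 2: convert the HTML string to a PDF file with Pisa / xhtml2pdf."""
    # Imported lazily so that step 1 can be used without xhtml2pdf installed.
    from xhtml2pdf import pisa

    with open(pdf_path, "wb") as pdf_file:
        status = pisa.CreatePDF(html, dest=pdf_file)
    # status.err is non-zero when the conversion reported errors.
    return not status.err
```

Calling `html_to_pdf(render_html("Report", ["first", "second"]), "report.pdf")` would then write the PDF. How closely the result matches a browser rendering depends on which HTML and CSS features the converter supports.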


太阳男子 2024-08-24 11:32:34

Have a look at the ReportLab Toolkit.

Note, however, that templates are only available in the commercial version.


用心笑 2024-08-24 11:32:34

There is now a new kid on the block called WeasyPrint.


熊抱啵儿 2024-08-24 11:32:34

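I had exactly the same requirement as the OP. Unfortunately, WeasyPrint was not a viable solution for me, because I needed very exact positioning and barcode support. After a few days of work I finished a ReportLab XML wrapper with Jinja2 support.

The code can be found on GitHub, including an example XML file and the PDF it generates.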


木槿暧夏七纪年 2024-08-24 11:32:34

What about going from Python/Jinja to rst/html, and then from html/rst to PDF using either rst2pdf or pandoc (http://johnmacfarlane.net/pandoc/)?

Both of these have worked well for me, but, like plaes, I may try WeasyPrint in the future.


各自安好 2024-08-24 11:32:34


What more accurate PDF tool for Python that works like Jinja is there than Jinja itself?

You just have to make sure that the Jinja block, variable, and comment identification strings do not conflict with the LaTeX commands. Once you change the Jinja environment to mimic the LaTeX environment you're ready to go!

Here's a snippet that works out of the box:

Python Source: ./create_pdf.py

import os

import jinja2

# Raw strings keep the backslashes literal ('\B', '\V' and '\#' are not valid
# Python escape sequences).
latex_jinja_env = jinja2.Environment(
    block_start_string    = r'\BLOCK{',
    block_end_string      = '}',
    variable_start_string = r'\VAR{',
    variable_end_string   = '}',
    comment_start_string  = r'\#{',
    comment_end_string    = '}',
    line_statement_prefix = '%%',
    line_comment_prefix   = '%#',
    trim_blocks           = True,
    autoescape            = False,
    loader                = jinja2.FileSystemLoader(os.path.abspath('./latex/'))
)
template = latex_jinja_env.get_template('latex_template.tex')

# populate a dictionary with the variables of interest
template_vars = {
    'section_1': 'The Section 1 Title',
    'section_2': 'The Section 2 Title',
}

# render the template with the variables and save the resulting LaTeX
with open('./generated_latex.tex', 'w') as output_file:
    output_file.write(template.render(template_vars))
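
One thing to watch with this setup: because `autoescape` is off, values are inserted into the LaTeX source verbatim, so a title containing `&`, `%` or `_` would be interpreted by LaTeX. A possible way around this is a small escaping filter. The sketch below is an assumption, not part of Jinja; `latex_escape` is a hypothetical helper name and the replacement table covers only the common special characters.

```python
import re

# Characters with special meaning in LaTeX and a common replacement for each.
_LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}
_LATEX_SPECIALS_RE = re.compile("|".join(re.escape(c) for c in _LATEX_SPECIALS))


def latex_escape(value):
    """Escape LaTeX special characters in one pass over the text."""
    # A single regex pass avoids re-escaping the braces and backslashes
    # that the replacements themselves introduce.
    return _LATEX_SPECIALS_RE.sub(lambda m: _LATEX_SPECIALS[m.group()], str(value))
```

It could be registered with `latex_jinja_env.filters['latex_escape'] = latex_escape` and then used in the template as `\VAR{section_1 | latex_escape}`.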

Latex Template: ./latex/latex_template.tex

\documentclass{article}
\begin{document}
\section{Example}
An example document using \LaTeX, Python, and Jinja.

% This is a regular LaTeX comment
\section{\VAR{section_1}}
\begin{itemize}
\BLOCK{ for x in range(0,3) }
  \item Counting: \VAR{x}
\BLOCK{ endfor }
\end{itemize}

\#{This is a long-form Jinja comment}
\BLOCK{ if subsection_1_1 }
\subsection{ The subsection }
This appears only if subsection_1_1 variable is passed to renderer.
\BLOCK{ endif }

%# This is a short-form Jinja comment
\section{\VAR{section_2}}
\begin{itemize}
%% for x in range(0,3)
  \item Counting: \VAR{x}
%% endfor
\end{itemize}

\end{document}

Now simply call: $> python ./create_pdf.py

Resulting Latex Source: ./generated_latex.tex

\documentclass{article}
\begin{document}
\section{Example}
An example document using \LaTeX, Python, and Jinja.

% This is a regular LaTeX comment
\section{The Section 1 Title}
\begin{itemize}
  \item Counting: 0
  \item Counting: 1
  \item Counting: 2
\end{itemize}

\section{The Section 2 Title}
\begin{itemize}
  \item Counting: 0
  \item Counting: 1
  \item Counting: 2
\end{itemize}

\end{document}
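
The generated .tex file still has to be compiled to get the actual PDF. A minimal sketch of that last step, assuming a LaTeX distribution that provides `pdflatex` on the PATH; the helper names `build_pdflatex_command` and `compile_latex` are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdflatex_command(tex_path, out_dir="."):
    """Return the pdflatex command line that compiles tex_path into out_dir."""
    return [
        "pdflatex",
        "-interaction=nonstopmode",  # do not stop and prompt on errors
        "-halt-on-error",            # give up at the first error
        f"-output-directory={out_dir}",
        str(tex_path),
    ]


def compile_latex(tex_path, out_dir="."):
    """Run pdflatex on tex_path and return the path of the resulting PDF."""
    if shutil.which("pdflatex") is None:
        raise RuntimeError("pdflatex was not found on PATH")
    # check=True raises CalledProcessError if pdflatex exits with an error.
    subprocess.run(build_pdflatex_command(tex_path, out_dir), check=True)
    return Path(out_dir) / (Path(tex_path).stem + ".pdf")
```

For the example above, `compile_latex("./generated_latex.tex")` would be expected to produce `generated_latex.pdf` in the current directory.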


柏拉图鍀咏恒 2024-08-24 11:32:34


If you want to use an existing PDF as a template, without altering the original document, you can use the Dhek template editor, which lets you define areas (bounds, name, type) in a separate template file.

The template is saved in JSON format so that it can be parsed in Python to fill the areas over the PDF and generate the final document (e.g. with values from a web form).

See the documentation at https://github.com/applicius/dhek.

[EDIT]

The initial answer was from the author of Dhek.
I have used this tool, and it is great if your form was not generated in the usual way (it even works on PDFs made from images).

After you have downloaded, unzipped, and run Dhek (no install needed, it is portable), you can select areas and give them a name:

You can then save the "mapping" to JSON so you can get the positions and dimensions of the areas:

{
    "pages": [
        {
            "areas": [
                {
                    "name": "FirstName",
                    "x": 198.48648648648648,
                    "type": "text",
                    "y": 151.22779922779924,
                    "height": 15.75289575289574,
                    "width": 181.15830115830119
                },
                {
                    "name": "LastName",
                    "x": 195.33590733590734,
                    "type": "text",
                    "y": 176.43243243243245,
                    "height": 18.115830115830107,
                    "width": 185.0965250965251
                }
            ]
        }
    ],
    "format": "dhek-1.0.13"
}
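
Rather than copying the numbers by hand, the mapping can be read with the standard `json` module. The sketch below assumes the layout shown above (a `pages` list whose entries hold an `areas` list); `load_areas` is a hypothetical helper, not part of Dhek.

```python
import json


def load_areas(mapping_json):
    """Return a dict mapping each area name to its page index and geometry."""
    mapping = json.loads(mapping_json)
    areas = {}
    for page_index, page in enumerate(mapping["pages"]):
        # Assumption: a page without defined areas may lack the "areas" key.
        for area in page.get("areas", []):
            areas[area["name"]] = {
                "page": page_index,
                "x": area["x"],
                "y": area["y"],
                "width": area["width"],
                "height": area["height"],
            }
    return areas


# A shortened mapping in the same shape as the one above.
example = """
{"pages": [{"areas": [
    {"name": "FirstName", "x": 198.5, "type": "text",
     "y": 151.2, "height": 15.8, "width": 181.2}
]}], "format": "dhek-1.0.13"}
"""
areas = load_areas(example)
print(areas["FirstName"]["x"], areas["FirstName"]["page"])
```

Each entry then carries the `x`, `y` and `height` values that are needed to place text in the corresponding area.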

You can then use these positions with reportlab to create a PDF that contains the text:

from reportlab.pdfgen.canvas import Canvas


def write_text(
    canvas: Canvas, txt: str, x: float, y: float, height: float, in_middle: bool = True
) -> None:
    """Write text in a form area (by default at mid-height of the area)"""
    # DHEK measures y from the top of the page; offset is how far below the
    # top edge of the area the text baseline is placed.
    offset = height / 2 if in_middle else height
    if canvas.bottomup:
        # convert from DHEK's top-left origin to a bottom-left origin
        y = canvas._pagesize[1] - (y + offset)
    else:
        y = y + offset
    canvas.drawString(x, y, txt)


def create_overlay(overlay_path: str) -> None:
    """
    Create the data that will be overlaid on top
    of the form that we want to fill
    """
    c = Canvas(overlay_path, bottomup=0)  # DHEK has (0,0) at top-left
    write_text(c, "Mike", 198.48648648648648, 151.22779922779924, 15.75289575289574)
    write_text(c, "Jagger", 195.33590733590734, 176.43243243243245, 18.115830115830107)

    c.save()


create_overlay("form_overlay.pdf")

You can then use any tool / pdf library (e.g. pdfrw) to merge the two in a single page:

import pdfrw

def merge_pdfs(form_pdf, overlay_pdf, output):
    """
    Merge the specified fillable form PDF with the
    overlay PDF and save the output
    """
    form = pdfrw.PdfReader(form_pdf)
    olay = pdfrw.PdfReader(overlay_pdf)

    for form_page, overlay_page in zip(form.pages, olay.pages):
        merge_obj = pdfrw.PageMerge()
        overlay = merge_obj.add(overlay_page)[0]
        pdfrw.PageMerge(form_page).add(overlay).render()

    writer = pdfrw.PdfWriter()
    writer.write(output, form)


merge_pdfs("form.pdf", "form_overlay.pdf", "form_filled.pdf")

(The last part of the code, which creates the overlay and merges it, comes from the excellent blog "Mouse vs Python": https://www.blog.pythonlibrary.org/2018/05/22/filling-pdf-forms-with-python/)