将多页PDF文件拆分为使用Python的多个PDF文件?
我想使用一个多页PDF文件,并每页创建单独的PDF文件。
我已经下载了 reportlab 并浏览了文档,但似乎旨在PDF生成。我还没有看到有关处理PDF文件本身的任何信息。
在Python中有一种简单的方法吗?
I would like to take a multi-page pdf file and create separate pdf files per page.
I have downloaded reportlab and have browsed the documentation, but it seems aimed at pdf generation. I haven't yet seen anything about processing PDF files themselves.
Is there an easy way to do this in python?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(9)
ETC。
etc.
更新了PYPDF(3.0.0)最新版本的解决方案,并分配一系列页面。
Updated solution for the latest release of PyPDF (3.0.0) and to split a range of pages.
我在这里错过了一个解决方案,您将PDF分为由所有页面组成的两个部分,因此如果有人在寻找相同的问题,我将附加解决方案:
I missed here a solution where you split the PDF to two parts consisting of all pages so I append my solution if somebody was looking for the same:
PYPDF2软件包使您能够将单个PDF拆分为多个PDF。
PYPDF2 3.0.0的更改
来源: https://www.blog.pythonlibrary.org/2018/04/11/splitting-and-merging-p.-merging-pdfs-with-python/
The PyPDF2 package gives you the ability to split up a single PDF into multiple ones.
Changes for PyPDF2 3.0.0
Source: https://www.blog.pythonlibrary.org/2018/04/11/splitting-and-merging-pdfs-with-python/
pypdf2
用于拆分PDF的较早答案不再使用最新版本更新。作者建议改用pypdf
,此版本的pypdf2 == 3.0.1
将是pypdf2
的最后版本。该功能需要修改如下:注意:相同的功能也将与
pypdf
一起使用。导入pdfreader
和pdfwriter
来自pypdf
而不是pypdf2
。The earlier answers with
PyPDF2
for splitting pdfs are not working anymore with the latest version update. The authors recommend usingpypdf
instead and this version ofPyPDF2==3.0.1
will be the last version ofPyPDF2
. The function needs to be modified as follows:Note: The same function will work with
pypdf
as well. ImportPdfReader
andPdfWriter
frompypdf
rather thanPyPDF2
.我知道该代码与Python无关,但是我觉得自己要张贴这件R代码,该R代码简单,灵活并且可以工作。 R中的PDFTOOLS包在将PDF拆分时令人惊讶。
I know that the code is not related to python, however i felt like posting this piece of R code which is simple, flexible and works amazingly. The PDFtools package in R is amazing in splitting merging PDFs at ease.
fitz
库用于使用PDF文件和图像。“ 2021.pdf”
的PDF文件。如果该文件在指定的路径上不存在,则将捕获filenotfounderror
并打印“未找到文件”。pdf [6:]
是指第七页开始。pixmap
,这是页面的图像表示。alpha = false
参数用于指定图像不应包括透明度。page_ {i+1} .jpg
)。枚举从6开始,但我们将其增加1,以反映从7开始的实际页码。fitz
library is used for working with PDF files and images."2021.pdf"
. If the file doesn't exist at the specified path, it catches aFileNotFoundError
and prints "File not found".pdf[6:]
refers to the 7th page onwards.pixmap
, which is an image representation of the page. Thealpha=False
parameter is used to specify that the image should not include transparency.page_{i+1}.jpg
). The enumeration starts at 6, but we increment it by 1 to reflect the actual page numbers starting from 7.a>
Link to article