
Change the metadata of a PDF file with pypdf

Posted 2024-08-27 19:43:38

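I'd like to create or modify the title of a PDF document using pypdf. It seems that the title is read-only. Is there a way to access this metadata in read/write mode?

If the answer is yes, a bit of code would be appreciated.

Thanks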


想你只要分分秒秒 2024-09-03 19:43:38

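You can manipulate the title with pyPDF (sort of). I came across this post on the reportlab-users list:

http://two.pairlist.net/pipermail/reportlab-users/2009-November/009033.html

You can also use pypdf.
https://pypi.org/project/pypdf/
This won't let you edit the metadata in place, but it will let you read one or more PDF files and write them back out, possibly with new metadata.

Here's the relevant code: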

from pyPdf import PdfFileWriter, PdfFileReader
from pyPdf.generic import NameObject, createStringObject

OUTPUT = 'output.pdf'
INPUTS = ['test1.pdf', 'test2.pdf', 'test3.pdf']

# The writer that collects the pages and carries the new metadata.
output = PdfFileWriter()

# There is no interface through pyPDF with which to set this other than getting
# your hands dirty like so:
infoDict = output._info.getObject()
infoDict.update({
    NameObject('/Title'): createStringObject(u'title'),
    NameObject('/Author'): createStringObject(u'author'),
    NameObject('/Subject'): createStringObject(u'subject'),
    NameObject('/Creator'): createStringObject(u'a script')
})

# Copy every page of every input file into the output document.
# The input files must stay open until the output has been written.
readers = [PdfFileReader(open(path, 'rb')) for path in INPUTS]
for reader in readers:
    for page in range(reader.getNumPages()):
        output.addPage(reader.getPage(page))

# Write the merged document; the with-block closes the output file.
with open(OUTPUT, 'wb') as outputStream:
    output.write(outputStream)
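
The code above targets the legacy pyPdf package and reaches into the private `_info` attribute. With the current pypdf package linked earlier, the same read-and-write-back approach can probably be expressed through public methods. The following is only a minimal sketch, assuming a recent pypdf release that provides `PdfReader`, `PdfWriter`, `add_page` and `add_metadata`; the helper names `info_dict` and `merge_with_metadata` are hypothetical and not part of the library:

```python
def info_dict(title=None, author=None, subject=None, creator=None):
    """Map keyword arguments to PDF Info dictionary keys, skipping unset ones.

    Hypothetical helper for illustration; not part of pypdf.
    """
    fields = {
        "/Title": title,
        "/Author": author,
        "/Subject": subject,
        "/Creator": creator,
    }
    return {key: value for key, value in fields.items() if value is not None}


def merge_with_metadata(input_paths, output_path, info):
    """Copy all pages of the input PDFs into one file and attach `info`.

    Hypothetical helper for illustration; assumes a recent pypdf release.
    """
    # Imported here so that info_dict() stays usable without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    writer = PdfWriter()
    for path in input_paths:
        reader = PdfReader(path)
        for page in reader.pages:
            writer.add_page(page)

    # Keys are PDF Info dictionary names such as "/Title".
    writer.add_metadata(info)

    with open(output_path, "wb") as stream:
        writer.write(stream)
```

A call such as `merge_with_metadata(['test1.pdf', 'test2.pdf', 'test3.pdf'], 'output.pdf', info_dict(title='title', author='author'))` would then write the merged pages together with the new title and author. As in the original code, the input files are not edited in place; a new file is written.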

