Changing the metadata of a PDF file with pypdf

Posted on 2024-08-27 19:43:38

I'd like to create or modify the title of a PDF document using pypdf. The title appears to be read-only. Is there a way to access this metadata read/write?

If so, a piece of code would be appreciated.

Thanks

Comments (1)

想你只要分分秒秒 2024-09-03 19:43:38

You can manipulate the title with pyPDF (sort of). I came across this post on the reportlab-users mailing list:

http://two.pairlist.net/pipermail/reportlab-users/2009-November/009033.html

You can also use pypdf.
https://pypi.org/project/pypdf/

This won't let you edit the metadata per se, but it will let you read one or more PDF files and write them back out, possibly with new metadata.

Here's the relevant code, written against the legacy pyPdf API:

from pyPdf import PdfFileWriter, PdfFileReader
from pyPdf.generic import NameObject, createStringObject

OUTPUT = 'output.pdf'
INPUTS = ['test1.pdf', 'test2.pdf', 'test3.pdf']

# The writer that collects the pages and carries the document info dictionary.
output = PdfFileWriter()

# There is no interface through pyPDF with which to set this other than getting
# your hands dirty like so:
infoDict = output._info.getObject()
infoDict.update({
    NameObject('/Title'): createStringObject(u'title'),
    NameObject('/Author'): createStringObject(u'author'),
    NameObject('/Subject'): createStringObject(u'subject'),
    NameObject('/Creator'): createStringObject(u'a script')
})

# Open each input in binary mode; the streams must stay open until write().
inputs = [PdfFileReader(open(i, 'rb')) for i in INPUTS]
for reader in inputs:
    for page in range(reader.getNumPages()):
        output.addPage(reader.getPage(page))

# file() only exists in Python 2; open() works everywhere.
outputStream = open(OUTPUT, 'wb')
output.write(outputStream)
outputStream.close()
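
For what it's worth, the current pypdf package (the one linked above) appears to make the `_info` workaround unnecessary, since `PdfWriter` has a public `add_metadata()` method. Below is a minimal sketch of the same read-and-write-back approach, assuming a recent pypdf release. The helper names `build_info_dict` and `merge_with_metadata` are made up for illustration and are not part of pypdf.

```python
def build_info_dict(**fields):
    """Turn keyword arguments such as title='x' into PDF Info keys ('/Title').

    Hypothetical helper: only meant for single-word standard keys
    (title, author, subject, creator, producer, keywords).
    Fields set to None are skipped.
    """
    return {
        "/" + name.capitalize(): str(value)
        for name, value in fields.items()
        if value is not None
    }


def merge_with_metadata(input_paths, output_path, info):
    """Copy all pages of the input PDFs into one new file and attach `info`."""
    # Imported lazily so build_info_dict() can be used without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    writer = PdfWriter()
    for path in input_paths:
        for page in PdfReader(path).pages:
            writer.add_page(page)

    # Public API for setting the document information dictionary.
    writer.add_metadata(info)

    with open(output_path, "wb") as stream:
        writer.write(stream)
```

Calling `merge_with_metadata(['test1.pdf', 'test2.pdf', 'test3.pdf'], 'output.pdf', build_info_dict(title='title', author='author'))` would mirror the script above. As there, the metadata is not edited in place: a new file is written that carries the new values.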