Python pdf2image "May not be a PDF file" error

Posted 2025-01-20 07:15:00

On CentOS 8, I get an error when converting PDF pages to JPG files with Python.

from pdf2image import convert_from_path

# Render every page of the PDF at 500 DPI
images = convert_from_path("test.pdf", dpi=500)

# Save each page as its own JPEG file
for i, image in enumerate(images):
    image.save('page' + str(i) + '.jpg', 'JPEG')

Running it produces the error below. I can open the PDF file locally, but converting it to JPG fails.

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/pdf2image/pdf2image.py", line 479, in pdfinfo_from_path
    raise ValueError
ValueError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "pdf_conv.py", line 7, in <module>
    images = convert_from_path(pdf_path,500)
  File "/usr/local/lib/python3.6/site-packages/pdf2image/pdf2image.py", line 98, in convert_from_path
    page_count = pdfinfo_from_path(pdf_path, userpw, poppler_path=poppler_path)["Pages"]
  File "/usr/local/lib/python3.6/site-packages/pdf2image/pdf2image.py", line 489, in pdfinfo_from_path
    "Unable to get page count.\n%s" % err.decode("utf8", "ignore")
pdf2image.exceptions.PDFPageCountError: Unable to get page count.
Syntax Warning: May not be a PDF file (continuing anyway)
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't read xref table

Comments (1)

给我一枪 2025-01-27 07:15:00

PDF != PDF: there are different versions of the format. Perhaps your pdf2image does not like or recognize the kind of PDF you are feeding it. Use Acrobat Reader or a similar tool to check what you are trying to convert, and see whether pdf2image supports it.

See "Which ISO standards does pdf2image support?" (in short: pdf2image supports every PDF standard that poppler supports.)
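
Before blaming the PDF version, it may be worth checking that the file on the server is a complete PDF at all. The poppler messages in the traceback ("May not be a PDF file", "Couldn't find trailer dictionary") are the kind of output you would expect from a truncated or non-PDF file. Below is a minimal sketch using only the standard library. The function name `inspect_pdf` and the 1024-byte search window are my own assumptions for illustration, not part of pdf2image or poppler.

```python
import re
from pathlib import Path


def inspect_pdf(path):
    """Return basic structural facts about a file that is supposed to be a PDF.

    Hypothetical helper for illustration; it does not validate the PDF,
    it only looks for the header and end-of-file markers.
    """
    data = Path(path).read_bytes()

    # A PDF normally starts with a "%PDF-x.y" header near the top of the file.
    match = re.search(rb"%PDF-(\d\.\d)", data[:1024])
    version = match.group(1).decode("ascii") if match else None

    # A complete PDF normally ends with an "%%EOF" marker near the end.
    has_eof = b"%%EOF" in data[-1024:]

    return {"size": len(data), "version": version, "has_eof": has_eof}


if __name__ == "__main__":
    import sys

    # Usage: python check_pdf.py test.pdf
    if len(sys.argv) > 1:
        print(inspect_pdf(sys.argv[1]))
```

If `version` comes back as `None` or `has_eof` is `False`, the file is probably damaged or is not a PDF (for example an HTML error page saved under a `.pdf` name, or an incomplete upload), and re-copying it to the CentOS machine would be the first thing to try. If both checks pass, the reported version tells you which flavor of PDF you are dealing with.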
