Python pdf2image "May not be a PDF file" error
On CentOS 8, I get an error when converting PDF pages to JPG files with Python.
from pdf2image import convert_from_path

# Render every page of the PDF at 500 DPI
images = convert_from_path("test.pdf", 500)
# Save each rendered page as its own JPEG file
for i, image in enumerate(images):
    image.save('page' + str(i) + '.jpg', 'JPEG')
The script fails with the error below. I can open the PDF file locally, but converting it to JPG does not work.
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/pdf2image/pdf2image.py", line 479, in pdfinfo_from_path
    raise ValueError
ValueError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "pdf_conv.py", line 7, in <module>
    images = convert_from_path(pdf_path,500)
  File "/usr/local/lib/python3.6/site-packages/pdf2image/pdf2image.py", line 98, in convert_from_path
    page_count = pdfinfo_from_path(pdf_path, userpw, poppler_path=poppler_path)["Pages"]
  File "/usr/local/lib/python3.6/site-packages/pdf2image/pdf2image.py", line 489, in pdfinfo_from_path
    "Unable to get page count.\n%s" % err.decode("utf8", "ignore")
pdf2image.exceptions.PDFPageCountError: Unable to get page count.
Syntax Warning: May not be a PDF file (continuing anyway)
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't read xref table
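The "May not be a PDF file" warning and the missing trailer suggest that the file on the server may not be an intact PDF at all, for example an HTML error page saved under a .pdf name or a truncated upload. A minimal sketch of a sanity check, using only the standard library: the function name looks_like_pdf and the 1024-byte window are assumptions made for illustration, and the check is a heuristic, not a full validation.

```python
import os


def looks_like_pdf(path, window=1024):
    """Heuristic: look for the PDF header near the start and %%EOF near the end."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(window)
        # Jump to the last `window` bytes (or the start, for small files)
        f.seek(max(0, size - window))
        tail = f.read()
    return {
        "has_header": b"%PDF-" in head,       # a PDF starts with "%PDF-<version>"
        "has_eof_marker": b"%%EOF" in tail,   # an untruncated PDF ends with "%%EOF"
        "first_bytes": head[:16],             # handy for spotting HTML or other content
    }
```

Calling looks_like_pdf("test.pdf") on the CentOS machine and comparing the result with the local copy would show whether both machines actually hold the same file.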
Comments (1)
PDF != PDF: there are different versions of the format. Perhaps pdf2image does not know or support the kind of PDF you are feeding it. Use Acrobat Reader or a similar tool to check what you are trying to convert, and see whether pdf2image supports it. See "Which ISO standards does pdf2image support" (in short: pdf2image supports all PDF standards that poppler supports).
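If no PDF viewer is available on the server, the declared version can also be read straight from the file header. A rough sketch, assuming the file starts with the usual "%PDF-x.y" line: the helper name pdf_header_version is hypothetical, and the header is only the declared version (a document catalog may override it), so treat the result as a hint.

```python
import re


def pdf_header_version(path):
    """Return the version declared in the PDF header (e.g. "1.7"), or None."""
    with open(path, "rb") as f:
        head = f.read(1024)  # the header is expected at the very start of the file
    match = re.search(rb"%PDF-(\d+\.\d+)", head)
    return match.group(1).decode("ascii") if match else None
```

A return value of None means no PDF header was found at all, which points to a damaged or mislabeled file instead of an unsupported PDF version.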