Universal PDF converter

Posted on 2025-01-29 20:42:38

I am looking for help with an "any document converter", where any document file [doc, docx, ppt, pptx] is converted to PDF. DOCX and PPTX are easy to handle with Python libraries, but DOC and PPT are a bit tricky.
The answers I got 7 months ago were quite hard to work with, especially the one using unoconv (which is now deprecated and has been replaced by unoserver).

Initial code example:

import os
import shutil

src = ".../srcpaths"
dst = ".../dstpaths"
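# os.path.splitext() returns the extension with its leading dot
ext = ['.ppt', '.pptx', '.doc', '.docx']

# copy every matching file from the source tree into dst
for root, subfolders, filenames in os.walk(src):
    for filename in filenames:
        if os.path.splitext(filename)[1].lower() in ext:
            shutil.copy2(os.path.join(root, filename), os.path.join(dst, filename))

def ConvertToPDF(ext):
    pass  # some code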

ConvertToPDF('.ppt')
ConvertToPDF('.pptx')
ConvertToPDF('.doc')
ConvertToPDF('.docx')

Comments (1)

如痴如狂 2025-02-05 20:42:38

Below is my review of the available solutions, followed by my answer:

1. Pandoc:

  • requires a LaTeX engine (e.g. pdflatex) to produce PDF
  • does not preserve the layout of the files well
  • loss of formatting
  • problems with graphics
  • problems with charts
  • problems with fonts
  • limited choice of formats

2. Unoconv/Unoserver

  • hard to install and deal with
  • requires LibreOffice as the engine
  • good conversion results (not perfect)

3. Cloud-based solutions:

  • not free
  • not open-source friendly
  • privacy concerns

4. Google Drive API converter:

  • requires using someone's Google account
  • workflow: upload the document, convert it, save it as PDF
  • privacy concerns

5. LibreLambda

  • uses Amazon Web Services (AWS)
  • privacy concerns

Simple solution:

Use LibreOffice directly by running it as a command-line subprocess.

Requirement: LibreOffice must be installed.
Biggest advantage: it can run on both Windows and Linux (the code needs to be adapted for Linux).

Here is my Python code for Windows:

import os
import subprocess

# path to the engine
path_to_office = r"C:\Program Files\LibreOffice\program\soffice.exe"

# path with files to convert
source_folder = r"C:\ConvertToPDF\input_files"

# path with pdf files
output_folder = r"C:\ConvertToPDF\output_files"

# changing directory to source
os.chdir(source_folder)

# build the conversion command and run it through LibreOffice
command = f"\"{path_to_office}\" --convert-to pdf --outdir \"{output_folder}\" *.*"
subprocess.run(command, check=True)

print('Converted')

If you can adapt it for Linux, please feel free to share your solution.
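
One possible way to make the same idea portable is sketched below. It is not the answerer's code: the helper names (find_soffice, collect_documents, build_command, convert_folder) are hypothetical, and it assumes the LibreOffice executable is reachable on PATH as "soffice" or "libreoffice", which is typical on Linux but often not the case on Windows. The files are listed in Python and passed as an argument list, so nothing depends on the shell expanding *.* or on manual quoting.

```python
import shutil
import subprocess
from pathlib import Path

# Extensions from the question; compared case-insensitively.
EXTENSIONS = {".doc", ".docx", ".ppt", ".pptx"}


def find_soffice():
    """Return the LibreOffice executable found on PATH, or None."""
    # Assumption: the binary is named "soffice" or "libreoffice".
    return shutil.which("soffice") or shutil.which("libreoffice")


def collect_documents(source_folder, extensions=EXTENSIONS):
    """List files directly inside source_folder whose suffix is in extensions."""
    folder = Path(source_folder)
    return sorted(p for p in folder.iterdir()
                  if p.is_file() and p.suffix.lower() in extensions)


def build_command(soffice, files, output_folder):
    """Build the argument list for a headless PDF conversion."""
    return [str(soffice), "--headless", "--convert-to", "pdf",
            "--outdir", str(output_folder), *map(str, files)]


def convert_folder(source_folder, output_folder):
    """Convert every matching document in source_folder to PDF."""
    soffice = find_soffice()
    if soffice is None:
        raise FileNotFoundError("LibreOffice executable not found on PATH")
    files = collect_documents(source_folder)
    if files:
        Path(output_folder).mkdir(parents=True, exist_ok=True)
        subprocess.run(build_command(soffice, files, output_folder), check=True)
    return files
```

A call would look like convert_folder("input_files", "output_files"), with both folder names being placeholders. On Windows, where soffice.exe is usually not on PATH, the full path from the code above can be passed to build_command in place of the find_soffice() result.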
