How can I post-format the HTML produced by a DOC/DOCX to HTML conversion?
I am currently using OpenOffice (command-line) and JODConverter to convert Word documents (both .doc and .docx) to HTML for a web application I'm hosting. It works great except for one problem: the margins in the HTML files are not formatted properly. Even worse, the margins are inconsistent across operating systems (macOS and Windows) and browsers.
Is there another tool out there that does the post-formatting (I think it involves re-writing the CSS of the converted HTML document), much like Google Docs?
I'm not trying to be another Google Docs; I just want to imitate their post-formatting process (more specifically, the margin width formatting) so that users can upload and store HTML docs on my own service. I need it to be an automated process independent of any third-party sites. (I'm aware that Google has an API, called googlecl, but it requires authentication, and you become dependent on their servers and services, not to mention the quota.)
If anyone knows of any method other than the OpenOffice route, I'm open to suggestions.
Comments (1)
It seems your best bet would be to add a feature to JODConverter that lets you insert your own CSS during the export, so that the same stylesheet is applied to all pages.
Either persuade the maintainer of JODConverter, or grab the code and hack it together yourself. Best of luck.
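If patching JODConverter is not practical, roughly the same result might be had by post-processing each converted file instead. The following is only a minimal Python sketch of that idea, not anything JODConverter provides: the function name `inject_css` and the `PAGE_CSS` rules are hypothetical, and the margin values are placeholders you would need to tune against your own documents. It inserts a `<style>` block just before the closing head tag (matched case-insensitively, since the converter may emit upper-case tags), so the added rules come after any styles the converter wrote.

```python
import re

# Hypothetical stylesheet: the values below are illustrative assumptions,
# not margins taken from Google Docs or from any converter.
PAGE_CSS = (
    "body { max-width: 8.5in; margin: 0 auto; "
    "padding: 1in; box-sizing: border-box; }"
)

_HEAD_CLOSE = re.compile(r"</head\s*>", re.IGNORECASE)
_BODY_OPEN = re.compile(r"<body\b", re.IGNORECASE)


def inject_css(html: str, css: str = PAGE_CSS) -> str:
    """Return html with a <style> block containing css added to it."""
    style = f'<style type="text/css">\n{css}\n</style>\n'

    # Preferred spot: right before the closing head tag, after existing styles.
    match = _HEAD_CLOSE.search(html)
    if match is None:
        # No head element: fall back to just before the opening body tag.
        match = _BODY_OPEN.search(html)
    if match is None:
        # Bare fragment: prepend the style block.
        return style + html
    return html[:match.start()] + style + html[match.start():]


if __name__ == "__main__":
    sample = "<HTML><HEAD><TITLE>Demo</TITLE></HEAD><BODY><P>Hi</P></BODY></HTML>"
    print(inject_css(sample))
```

Run over every file once the conversion finishes, this gives each page the same margin rules regardless of what the converter emitted. Inline styles or attributes in the converted markup can still override a rule like this, so the output would need checking in each target browser.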