Universal document format converter

Posted on 2024-07-11 06:35:48 | Words: 1539 | Views: 5 | Comments: 0

Comments (4)

浅沫记忆 2024-07-18 06:35:48

OpenOffice.org

From this link:

One of the less well-known features of OpenOffice.org is its ability to run as a service. You can put that ability to some clever use. For example, you can turn OpenOffice.org into a conversion engine and use it to convert documents from one format to another via a Web-based interface or a command-line tool. JODConverter can help you to unleash OpenOffice.org's file conversion capabilities.

This sounds like what you're looking for. It's all in Java too.

This link tells you a little more about JODConverter mentioned above.

浴红衣 2024-07-18 06:35:48

I don't believe such utility/converter exists already since it's rather hard to do certain conversions reasonably. For example, how would YOU handle HTML-to-TXT-to-HTML conversion? What would you strip away? How would you represent different HTML elements in plain text? Furthermore, how would you handle content within content like XML inside TXT transformed to DOCX and then to XHTML?
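To make the round-trip problem concrete, here is a minimal Java sketch. The two helper methods are hypothetical and deliberately naive (regex tag stripping, one `<p>` per line, no entity or escaping support); they only show that whatever the text step throws away cannot be rebuilt on the way back.

```java
final class LossyRoundTrip {
    // Naive HTML -> text: block ends become line breaks, all other tags are dropped.
    // Assumption: the input uses only simple tags and contains no entities.
    static String htmlToText(String html) {
        return html.replaceAll("(?i)</(h1|p)>", "\n")
                   .replaceAll("<[^>]+>", "")
                   .trim();
    }

    // Naive text -> HTML: every line becomes a paragraph.
    // Assumption: the text contains no characters that need HTML escaping.
    static String textToHtml(String text) {
        StringBuilder sb = new StringBuilder();
        for (String line : text.split("\n")) {
            sb.append("<p>").append(line).append("</p>");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "<h1>Title</h1><p>Some <b>bold</b> text.</p>";
        String roundTrip = textToHtml(htmlToText(original));
        System.out.println(original);
        System.out.println(roundTrip);
    }
}
```

The heading level and the bold markup are gone after the round trip, because plain text had nowhere to keep them.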

That said, if I were to make a converter for this kind of purpose, I'd start with Apache POI, which is a library for handling Office documents. Then I'd use iText for PDF connectivity, make sure [Office formats] <-> PDF conversion works as robustly as I want it to, and then add JDOM for XML handling, test that [Office formats] <-> XML and PDF <-> XML work the way I want, and so on and so forth; you get the picture. I would specifically avoid implementing file type handlers myself, since it's very likely that I'd be reinventing the wheel at that point.
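One way to organize such a converter is a small registry of pairwise conversions that are chained when no direct one exists. The sketch below is an assumption about how that could look, not an existing API: `ConverterRegistry`, its methods, and the format names are hypothetical, and each handler is a placeholder where a library-backed conversion (POI, iText, JDOM) would be plugged in.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

final class ConverterRegistry {
    // from-format -> (to-format -> handler); handlers are supplied by the caller.
    private final Map<String, Map<String, UnaryOperator<byte[]>>> edges = new HashMap<>();

    void register(String from, String to, UnaryOperator<byte[]> handler) {
        edges.computeIfAbsent(from, k -> new LinkedHashMap<>()).put(to, handler);
    }

    // Breadth-first search for the shortest chain of registered conversions.
    List<String> route(String from, String to) {
        Map<String, String> cameFrom = new HashMap<>();
        ArrayDeque<String> queue = new ArrayDeque<>();
        cameFrom.put(from, null);
        queue.add(from);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(to)) {
                List<String> path = new ArrayList<>();
                for (String f = to; f != null; f = cameFrom.get(f)) {
                    path.add(f);
                }
                Collections.reverse(path);
                return path;
            }
            Map<String, UnaryOperator<byte[]>> next = edges.getOrDefault(current, Map.of());
            for (String format : next.keySet()) {
                if (!cameFrom.containsKey(format)) {
                    cameFrom.put(format, current);
                    queue.add(format);
                }
            }
        }
        throw new IllegalArgumentException("no conversion route from " + from + " to " + to);
    }

    // Applies each handler along the route in order.
    byte[] convert(byte[] input, String from, String to) {
        List<String> path = route(from, to);
        byte[] data = input;
        for (int i = 0; i + 1 < path.size(); i++) {
            data = edges.get(path.get(i)).get(path.get(i + 1)).apply(data);
        }
        return data;
    }

    public static void main(String[] args) {
        ConverterRegistry registry = new ConverterRegistry();
        // Placeholder handlers: a real setup would delegate to a library here.
        registry.register("docx", "xml", bytes -> bytes);
        registry.register("xml", "pdf", bytes -> bytes);
        System.out.println(registry.route("docx", "pdf"));
    }
}
```

Each chained step can lose information, so every pair still has to be tested on its own, as described above.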

蓝颜夕 2024-07-18 06:35:48

This is a non-trivial problem. For example, I've been looking for a robust HTML+CSS to PDF converter in PHP for the last month and have only managed to get one working reliably, albeit incredibly slowly (html2pdf). I have also discovered (from that question) Prince XML, which my initial testing has shown to be a superb product. It is, however, expensive.

久光 2024-07-18 06:35:48

Have a look at FreeMarker.

I would suggest XML as the "hub" format, then separate out your styling information into an XSLT.
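A minimal sketch of the hub idea, using only the XSLT support that ships with the JDK (`javax.xml.transform`). The document structure (`doc`, `title`, `para`) and both stylesheets are made up for illustration; the point is that one content-only XML source feeds several output formats, with the presentation living in the stylesheets.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XmlHub {
    static final String XSL_NS = "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'";

    // Hypothetical hub document: content only, no styling.
    static final String DOC =
        "<doc><title>Report</title><para>First.</para><para>Second.</para></doc>";

    // Hypothetical stylesheet for an HTML target.
    static final String TO_HTML =
        "<xsl:stylesheet version='1.0' " + XSL_NS + ">"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='/doc'><html><body><xsl:apply-templates/></body></html></xsl:template>"
      + "<xsl:template match='title'><h1><xsl:value-of select='.'/></h1></xsl:template>"
      + "<xsl:template match='para'><p><xsl:value-of select='.'/></p></xsl:template>"
      + "</xsl:stylesheet>";

    // Hypothetical stylesheet for a plain-text target: one line per element.
    static final String TO_TEXT =
        "<xsl:stylesheet version='1.0' " + XSL_NS + ">"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='title|para'>"
      + "<xsl:value-of select='.'/><xsl:text>&#10;</xsl:text>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies a stylesheet to the hub XML and returns the result as a string.
    static String render(String xml, String xslt) {
        try {
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)))
                .transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(DOC, TO_HTML));
        System.out.print(render(DOC, TO_TEXT));
    }
}
```

Adding another output format then means adding another stylesheet; the hub document and the `render` method stay unchanged.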
