Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 10 years ago.
OpenOffice.org
From this link:
This sounds like what you're looking for. It's all in Java too.
This link tells you a little more about JODConverter mentioned above.
I don't believe such a utility/converter exists already, since it's rather hard to do certain conversions reasonably. For example, how would YOU handle HTML-to-TXT-to-HTML conversion? What would you strip away? How would you represent different HTML elements in plain text? Furthermore, how would you handle content within content, like XML inside TXT transformed to DOCX and then to XHTML?
That said, if I were to make a converter for this kind of purpose, I'd start with Apache POI, which is a library for handling Office documents. Then I'd use iText for PDF support, make sure [Office formats] <-> PDF conversion works as robustly as I want, and then add JDOM for XML handling, test that [Office formats] <-> XML and PDF <-> XML work the way I want, and so on; you get the picture. I would specifically avoid implementing file type handlers myself, since I'd very likely be reinventing the wheel at that point.
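To make the "don't write the handlers yourself" idea a bit more concrete, here is a minimal sketch of the glue code only. Everything in it is hypothetical: `ConverterRegistry`, its method names, and the `"from->to"` key scheme are made up for illustration, and the functions you register would be thin wrappers around whatever library (POI, iText, JDOM, ...) actually does the work. It uses nothing outside the JDK.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical registry (not part of any library): maps a source/target
 * format pair to a conversion function. Each registered function is
 * expected to delegate to an existing library rather than parse files itself.
 */
class ConverterRegistry {
    private final Map<String, Function<byte[], byte[]>> converters = new HashMap<>();

    // Normalise the pair so that "DOCX" and "docx" refer to the same entry.
    private static String key(String from, String to) {
        return from.toLowerCase(Locale.ROOT) + "->" + to.toLowerCase(Locale.ROOT);
    }

    /** Registers a converter for one direction only, e.g. docx -> pdf. */
    void register(String from, String to, Function<byte[], byte[]> converter) {
        converters.put(key(from, to), converter);
    }

    /** Tells whether a direct conversion between the two formats is known. */
    boolean supports(String from, String to) {
        return converters.containsKey(key(from, to));
    }

    /** Runs the registered converter, or fails loudly if there is none. */
    byte[] convert(String from, String to, byte[] input) {
        Function<byte[], byte[]> converter = converters.get(key(from, to));
        if (converter == null) {
            throw new UnsupportedOperationException(
                    "No converter registered for " + key(from, to));
        }
        return converter.apply(input);
    }
}
```

Registering each direction separately is deliberate: as said above, A -> B working says nothing about B -> A, so each pair has to be added and tested on its own.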
This is a non-trivial problem. For example, I've been looking for a robust HTML+CSS to PDF converter in PHP for the last month and have only managed to get one working reliably, albeit incredibly slowly (html2pdf). I've also discovered (from that question) Prince XML, which my initial testing has shown to be a superb product. It is, however, expensive.
Have a look at Freemarker
I would suggest XML as the "hub" format, then separate out your styling information into an XSLT.
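As a rough illustration of the hub idea, the sketch below keeps the content in a small XML document and puts all presentation decisions in an XSLT stylesheet, using the `javax.xml.transform` API that ships with the JDK. The `<doc>`, `<title>` and `<para>` element names, the `HubToHtml` class and the stylesheet itself are assumptions invented for this example, not a recommended schema.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical example class: renders a "hub" XML document to HTML via XSLT. */
class HubToHtml {

    // Made-up hub document: content only, no styling information.
    static final String HUB_XML =
            "<doc><title>Report</title><para>Hello</para></doc>";

    // Made-up stylesheet: all HTML-specific decisions live here.
    static final String TO_HTML_XSLT =
            "<xsl:stylesheet version='1.0'"
          + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='html' indent='no'/>"
          + "<xsl:template match='/doc'>"
          + "<html><body><xsl:apply-templates/></body></html>"
          + "</xsl:template>"
          + "<xsl:template match='title'>"
          + "<h1><xsl:value-of select='.'/></h1>"
          + "</xsl:template>"
          + "<xsl:template match='para'>"
          + "<p><xsl:value-of select='.'/></p>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    /** Applies the given stylesheet to the given XML and returns the result as text. */
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                    new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so callers of this sketch need no checked-exception handling.
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }
}
```

Calling `HubToHtml.transform(HubToHtml.HUB_XML, HubToHtml.TO_HTML_XSLT)` gives you the HTML rendering; targeting another output format would then mean writing another stylesheet rather than touching the content.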