Using the WordToHtmlConverter in Apache POI

Posted on 2024-12-17 21:09:55

I am trying to use the WordToHtmlConverter class to convert a Word document to HTML, but the documentation is not clear.

WordToHtmlConverter has a constructor that takes an org.w3c.dom.Document, but I don't think that is the Word document.

Does anyone have a sample program showing how to load a Word document and convert it to HTML?

Comments (1)

魔法唧唧 2024-12-24 21:09:55

Your best bet for now is probably to look at the unit tests, e.g. TestWordToHtmlConverter. That will show you how to do it.

In general, though, you pass in the XML document to be populated, have WordToHtmlConverter generate the HTML into it from the Word document, then transform the XML document into appropriate output (indenting, new lines, etc.).

Your code would look something like:

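    import java.io.StringWriter;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.transform.OutputKeys;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.transform.stream.StreamResult;
    import org.apache.poi.hwpf.converter.WordToHtmlConverter;
    import org.w3c.dom.Document;

    // hwpfDocument is an org.apache.poi.hwpf.HWPFDocument loaded beforehand from a
    // binary .doc file, e.g. new HWPFDocument( new FileInputStream( "input.doc" ) )
    // Create an empty DOM document for the converter to populate
    Document newDocument = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().newDocument();
    WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(
            newDocument );

    // Build the HTML tree from the Word document
    wordToHtmlConverter.processDocument( hwpfDocument );

    // Serialize the populated DOM to an indented HTML string
    StringWriter stringWriter = new StringWriter();
    Transformer transformer = TransformerFactory.newInstance()
            .newTransformer();
    transformer.setOutputProperty( OutputKeys.INDENT, "yes" );
    transformer.setOutputProperty( OutputKeys.ENCODING, "utf-8" );
    transformer.setOutputProperty( OutputKeys.METHOD, "html" );
    transformer.transform(
            new DOMSource( wordToHtmlConverter.getDocument() ),
            new StreamResult( stringWriter ) );

    String html = stringWriter.toString();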
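
If it helps to see the serialization half on its own, here is a minimal sketch that needs only the JDK, no POI. It builds a tiny DOM by hand as a stand-in for the tree that WordToHtmlConverter would populate, then runs it through the same Transformer settings as above. The class name DomToHtmlSketch, the helper methods, and the sample text are all made up for illustration.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; not part of Apache POI.
public class DomToHtmlSketch {

    // Serializes a DOM tree to an HTML string with the same output
    // properties as the snippet above.
    static String toHtml(Document document) throws Exception {
        StringWriter stringWriter = new StringWriter();
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
        transformer.setOutputProperty(OutputKeys.METHOD, "html");
        transformer.transform(new DOMSource(document), new StreamResult(stringWriter));
        return stringWriter.toString();
    }

    // Hand-built stand-in for the document the converter would fill in.
    static Document sampleDocument() throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element html = document.createElement("html");
        Element body = document.createElement("body");
        Element paragraph = document.createElement("p");
        paragraph.setTextContent("Hello from a hand-built DOM");
        body.appendChild(paragraph);
        html.appendChild(body);
        document.appendChild(html);
        return document;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toHtml(sampleDocument()));
    }
}
```

In the real conversion you would pass wordToHtmlConverter.getDocument() to toHtml instead of the hand-built sample.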