Using Java: Replacing strings in MS Word files

Posted on 2024-08-13 17:14:10

Comments (5)

‘画卷フ 2024-08-20 17:14:10

Apache POI does support MS Word, but not very well. Loading and then saving any file with more than the most basic formatting will likely garble the layout. You should try it out though; maybe it works for you.

There are a number of commercial libraries as well, but I don't know if any of them are any better.

The crappy "solution" I had to settle for when working on a similar requirement recently was to use the DOCX format: open the ZIP container, read the document XML, and replace my markers with the right texts. This does work for replacing simple bits of text that contain no paragraphs etc.

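// Assumes Apache Commons IO (IOUtils) and Spring (Resource, ClassPathResource).
import static java.nio.charset.StandardCharsets.UTF_8;

import java.io.StringReader;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

import org.apache.commons.io.IOUtils;
import org.springframework.core.io.ClassPathResource;
import org.springframework.core.io.Resource;

private static final String WORD_TEMPLATE_PATH = "word/word_template.docx";
private static final String DOCUMENT_XML = "word/document.xml";

/*....*/

final Resource templateFile = new ClassPathResource(WORD_TEMPLATE_PATH);

// "output" is the OutputStream that receives the finished DOCX.
final ZipOutputStream zipOut = new ZipOutputStream(output);

// try-with-resources closes the template stream even if copying fails.
try (ZipInputStream zipIn = new ZipInputStream(templateFile.getInputStream())) {
    ZipEntry inEntry;
    while ((inEntry = zipIn.getNextEntry()) != null) {
        // Create a fresh entry by name so size and CRC are recomputed on write.
        zipOut.putNextEntry(new ZipEntry(inEntry.getName()));

        if (inEntry.getName().equals(DOCUMENT_XML)) {
            // Only the main document part has its markers replaced;
            // processContent does the replacement and returns the new XML.
            final String contentIn = IOUtils.toString(zipIn, UTF_8);
            final String outContent = this.processContent(new StringReader(contentIn));
            IOUtils.write(outContent, zipOut, UTF_8);
        } else {
            // Every other part (styles, images, relationships) is copied unchanged.
            IOUtils.copy(zipIn, zipOut);
        }

        zipOut.closeEntry();
    }
}

// finish() writes the ZIP central directory without closing "output".
zipOut.finish();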

I'm not proud of it, but it works.
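
The snippet above leaves processContent and output undefined, so here is a minimal, JDK-only sketch of the same ZIP approach (Java 9+ for readAllBytes and transferTo). The class name DocxMarkerReplacer, the method names, and the idea of passing the markers as a Map are all hypothetical, not the original author's code. One caveat: a marker is only found if Word stored it as one contiguous piece of text; Word can split typed text across several runs in the XML, in which case a plain string replace will miss it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper: replaces plain-text markers in a DOCX using only the JDK. */
final class DocxMarkerReplacer {

    private static final String DOCUMENT_XML = "word/document.xml";

    /**
     * Copies the DOCX read from "in" to "out", replacing every map key found in
     * word/document.xml with its (XML-escaped) value. Closes "in" but not "out".
     */
    static void replaceMarkers(InputStream in, OutputStream out, Map<String, String> replacements)
            throws IOException {
        ZipOutputStream zipOut = new ZipOutputStream(out);
        try (ZipInputStream zipIn = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zipIn.getNextEntry()) != null) {
                zipOut.putNextEntry(new ZipEntry(entry.getName()));
                if (entry.getName().equals(DOCUMENT_XML)) {
                    // readAllBytes stops at the end of the current ZIP entry.
                    String xml = new String(zipIn.readAllBytes(), StandardCharsets.UTF_8);
                    String replaced = processContent(xml, replacements);
                    zipOut.write(replaced.getBytes(StandardCharsets.UTF_8));
                } else {
                    // All other parts are copied byte for byte.
                    zipIn.transferTo(zipOut);
                }
                zipOut.closeEntry();
            }
        }
        // Write the ZIP central directory; the caller still owns "out".
        zipOut.finish();
    }

    /** Plain string replacement of each marker; values are escaped for XML text. */
    static String processContent(String xml, Map<String, String> replacements) {
        String result = xml;
        for (Map.Entry<String, String> e : replacements.entrySet()) {
            result = result.replace(e.getKey(), escapeXml(e.getValue()));
        }
        return result;
    }

    /** Escapes the characters that are not allowed in XML element text. */
    static String escapeXml(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

With a template that contains a marker such as ${name} (the marker syntax is an assumption; any unique string works), a call like DocxMarkerReplacer.replaceMarkers(templateStream, output, Map.of("${name}", "Alice")) writes the filled-in document to output. Escaping the values matters because an unescaped & or < would leave document.xml malformed and Word would refuse to open the file.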

清泪尽 2024-08-20 17:14:10

I would suggest the Apache POI library:

http://poi.apache.org/

Looking more closely, it seems it hasn't been kept up to date. Boo! It may be complete enough by now to do what you need, however.

轻拂→两袖风尘 2024-08-20 17:14:10

Try this one: http://www.dancrintea.ro/doc-to-pdf/

Besides replacing strings in MS Word files, it can also:
- read/write Excel files using a simplified API such as getCell(x,y) and setCell(x,y,string)
- hide Excel sheets (secondary calculations, for example)
- replace images in DOC, ODT and SXW files
- convert:

doc --> pdf, html, txt, rtf
xls --> pdf, html, csv
ppt --> pdf, swf

与风相奔跑 2024-08-20 17:14:10

I would take a look at the Apache POI project. This is what I have used to interact with MS documents in the past.

http://poi.apache.org/

撩起发的微风 2024-08-20 17:14:10

Thanks, all. I'm going to try http://www.dancrintea.ro/doc-to-pdf/ because I need to convert classic DOC files (binary), not DOCX (ZIP format).
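
Since the binary DOC versus zipped DOCX distinction decides which approach applies, here is a small sketch that tells the two containers apart by their leading bytes, in case the file extension can't be trusted. The class and enum names are hypothetical. It only identifies the container: the OLE2 signature is shared by classic XLS and PPT files, and the ZIP signature by any ZIP archive, so a match is a hint rather than proof that the file is a Word document. readNBytes needs Java 11+.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper: guesses the container format from the first bytes of a file. */
final class WordFormatSniffer {

    enum Format { OLE2_CONTAINER, ZIP_CONTAINER, UNKNOWN }

    // Signature of OLE2 compound files (classic binary .doc, .xls, .ppt).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0, (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Local file header signature of a ZIP archive (.docx, .xlsx, .pptx, .zip).
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 3, 4 };

    /** Classifies a file from its leading bytes; 8 bytes are enough. */
    static Format detect(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Format.OLE2_CONTAINER;
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Format.ZIP_CONTAINER;
        }
        return Format.UNKNOWN;
    }

    /** Reads up to 8 bytes from the stream and classifies them. */
    static Format detect(InputStream in) throws IOException {
        return detect(in.readNBytes(8));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A file that comes back as ZIP_CONTAINER can go through the ZIP-based replacement described earlier in the thread; one that comes back as OLE2_CONTAINER needs a library or tool that understands the binary DOC format.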
