Programmatically search and replace in a .doc file

Posted on 2024-09-14 03:00:02

If I'm given a .doc file with special tags in it such as [first_name], how do I go about replacing all occurrences of it with something like "Clark"? A simple binary replacement only works if the replacement string is the exact same length.

Haskell, C, and C++ answers would be best, but any compiled language would do. I'd also prefer to do this without an external library since it has to be deployed on Windows and Linux and cross-platform dependency handling is a bitch.

To summarize...

.doc -> magic program -> .doc with strings replaced
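The length constraint mentioned in the question can be illustrated with a small sketch. It is written in Python for brevity (the question asks for a compiled language, and the same byte-level logic should port directly to C, C++, or Haskell). It assumes the tag is stored contiguously in the file, either as 8-bit cp1252 text or as UTF-16LE, and that tags and values are plain ASCII. The function names are hypothetical. Shorter replacements are padded with spaces so no byte offsets inside the file shift; longer ones are rejected.

```python
def replace_same_length(data: bytes, tag: str, value: str, encoding: str) -> bytes:
    """Overwrite each occurrence of tag with value, space-padded to the tag's byte length."""
    old = tag.encode(encoding)
    new = value.encode(encoding)
    if len(new) > len(old):
        # A longer value would shift every offset stored elsewhere in the file.
        raise ValueError("replacement is longer than the tag")
    pad = " ".encode(encoding)
    new += pad * ((len(old) - len(new)) // len(pad))
    return data.replace(old, new)


def patch_doc_bytes(data: bytes, tag: str, value: str) -> bytes:
    """Try both encodings the text might be stored in (an assumption, see above)."""
    for encoding in ("cp1252", "utf-16-le"):
        data = replace_same_length(data, tag, value, encoding)
    return data
```

With this approach "[first_name]" becomes "Clark" followed by seven spaces, which is visible in the document, and any value longer than its tag cannot be handled at all. That is the limitation the question describes.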

Comments (4)

云胡 2024-09-21 03:00:03

First read the Word Document Specification.

If that hasn't terrified you, then you should find it fairly straightforward to figure out how to read and write it. It must be possible; Word manages to do it most of the time.

遥远的绿洲 2024-09-21 03:00:03

You will probably have to use .NET (VB or C#) to create a Word.Application object and then use the MS Word object model to manipulate your document.

篱下浅笙歌 2024-09-21 03:00:03

Why do you want to be using C/C++/Haskell or another compiled language? I'm not too familiar with Haskell, but in general I would say that C is not a great language for performing text processing. A lot of interpreted languages (Perl, Python, etc.) also have powerful regular expression libraries that are suited for finding and replacing phrases.

With that said, as the other posters have noted, you will still have to deal with the eccentricities of the .doc format.

毁梦 2024-09-21 03:00:02

You could use the Word COM component ("Word.Application") on Windows to open the file, do the replacements, save the file, and close it. However, this is Windows-only and can be buggy.

Another thing you could do is use the OpenOffice.org command line interface to convert the file to the ODF format, unzip the file (ODF is mostly zipped XML), do the replacements with the files inside, re-zip the file, and re-convert it to .doc format. However, OpenOffice.org doesn't always read Word files correctly (especially if there is a lot of complex formatting) and it can make it harder to distribute (users must either have OpenOffice.org or you must distribute it with your program).
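The conversion steps of that round trip might look like the sketch below. It is an assumption-heavy illustration: the flags shown (`--headless`, `--convert-to`, `--outdir`) are the ones LibreOffice's `soffice` binary accepts, and whether a given OpenOffice.org build accepts the same flags should be verified. The helper names `convert_command` and `convert` are hypothetical.

```python
import subprocess
from pathlib import Path


def convert_command(source, target_format, outdir, soffice="soffice"):
    """Build the command line for a headless conversion (LibreOffice-style flags)."""
    return [
        soffice, "--headless",
        "--convert-to", target_format,
        "--outdir", str(outdir),
        str(source),
    ]


def convert(source, target_format, outdir):
    """Run the conversion and return the path the converted file is expected at."""
    subprocess.run(convert_command(source, target_format, outdir), check=True)
    # Assumes target_format is a bare extension such as "odt" or "doc".
    return Path(outdir) / (Path(source).stem + "." + target_format)
```

A round trip would call `convert("letter.doc", "odt", workdir)`, edit `content.xml` inside the resulting zip archive, and then call `convert(edited_odt, "doc", outdir)`. The formatting caveats described above still apply to both conversions.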

Also, if you have a file in the .docx format, you can unzip it, do the replacements, and re-zip it.
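A minimal sketch of the unzip, replace, re-zip idea, using only Python's standard library. It assumes the XML parts under `word/` are UTF-8 encoded and that each tag appears as one contiguous string in the XML; Word sometimes splits text across several runs, in which case a plain string replace will not find the tag. The function name `replace_in_docx` is hypothetical.

```python
import zipfile
from xml.sax.saxutils import escape


def replace_in_docx(src, dst, replacements):
    """Copy the .docx at src to dst, substituting tags in the XML parts under word/."""
    with zipfile.ZipFile(src) as zin, \
         zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            payload = zin.read(item.filename)
            if item.filename.startswith("word/") and item.filename.endswith(".xml"):
                text = payload.decode("utf-8")
                for tag, value in replacements.items():
                    # Escape &, < and > so the value cannot break the XML.
                    text = text.replace(tag, escape(value))
                payload = text.encode("utf-8")
            # Reusing the original entry keeps its name and compression type.
            zout.writestr(item, payload)
```

For example, `replace_in_docx("template.docx", "letter.docx", {"[first_name]": "Clark"})` would write a new file and leave the template untouched. Unlike the binary approach, the replacement may be any length, because the zip container is rebuilt rather than patched in place.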
