Creating Microsoft Word (.docx) documents in Ruby
Is there an easy way to create Word documents (.docx) in a Ruby application? Actually, in my case it's a Rails application served from a Linux server.
A gem similar to Prawn but for DOCX instead of PDF would be great!
Comments (13)
As has been noted, there don't appear to be any libraries to manipulate Open XML documents in Ruby, but OpenXML Developer has complete documentation on the format of Open XML documents.
If what you want is to send a copy of a standard document (like a form letter) customized for each user, it should be fairly simple, given that a DOCX is a ZIP file that contains various parts in a directory hierarchy. Have a DOCX "template" that contains all the parts and tree structure that you want to send to all users (with no real content), then simply create new (or modify existing) pieces that contain the user-specific content you want and inject them into the ZIP (DOCX file) before sending it to the user.
For example: you could have document-template.xml that contains Dear [USER-PLACEHOLDER]:. When a user requests the document, you replace [USER-PLACEHOLDER] with the user's name, then add the resulting document.xml to the your-template.docx ZIP file (which would contain all the images and other parts you want in the Word document) and send that resulting document to the user.
Note that if you rename a .docx file to .zip it is trivial to explore the structure and format of the parts inside. You can remove or replace images or other parts very easily with any ZIP manipulation tools or programmatically with code.
Generating a brand new Word document with completely custom content from raw XML would be very difficult without access to an API to make the job easier. If you really need to do that, you might consider installing Mono, then use VB.NET, C# or IronRuby to create your Open XML documents using the Open XML Format SDK 1.0. Since you would just be using the Microsoft.Office.DocumentFormat.OpenXml.Packaging Namespace to manipulate Open XML documents, it should work okay in Mono, which seems to support everything the SDK requires.
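The template approach described in this answer could be sketched in Ruby roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the rubyzip gem is installed, that the template's main part lives at word/document.xml, and that the placeholder text sits in a single run in that part (Word sometimes splits typed text across several runs, in which case a plain string replace will not find it). The method names are made up for illustration.

```ruby
PLACEHOLDER = "[USER-PLACEHOLDER]"

# Replace the placeholder with the user's name, escaping &, < and >
# so the result is still well-formed XML.
def fill_placeholder(xml, user_name)
  escaped = user_name.encode(xml: :text)
  # Block form avoids backslash interpretation in the replacement.
  xml.gsub(PLACEHOLDER) { escaped }
end

# Copy the template .docx and overwrite word/document.xml in the copy.
# Hypothetical helper; requires the rubyzip gem (an assumption).
def build_docx(template_path, output_path, user_name)
  require "fileutils"
  require "zip"

  FileUtils.cp(template_path, output_path)
  Zip::File.open(output_path) do |zip|
    # Entries are read as binary, so mark the XML as UTF-8 first.
    xml = zip.read("word/document.xml").force_encoding("UTF-8")
    zip.get_output_stream("word/document.xml") do |out|
      out.write(fill_placeholder(xml, user_name))
    end
  end
  output_path
end
```

A call such as build_docx("your-template.docx", "letter.docx", "Alice") would then leave the template untouched and write the personalized copy next to it.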
Maybe this gem is interesting for you.
https://github.com/trade-informatics/caracal/
It's like Prawn, but for docx.
You can use Apache POI. It is written in Java, but integrates with Ruby as an extension.
This is an old question but there's a new answer. If you'd like to turn an HTML doc into a Word (docx) doc, just use the 'htmltoword' gem:
https://github.com/karnov/htmltoword
I'm not sure why there was answer creep and everyone started posting templating solutions, but this answers the OP's question. Just like Prawn, except Word instead of PDF.
UPDATE:
There's also pandoc and an API wrapper for pandoc called docverter. Both have slightly complicated installs since pandoc is a Haskell library.
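If pandoc is installed on the server, calling it from Ruby can be as simple as shelling out. The sketch below assumes the pandoc executable is on the PATH and that the HTML has already been written to a file; the method names are hypothetical.

```ruby
# Build the argument list for converting an HTML file to DOCX with pandoc.
def pandoc_docx_command(html_path, docx_path)
  ["pandoc", "--from=html", "--to=docx", "--output=#{docx_path}", html_path]
end

# Run the conversion. Passing the arguments as separate strings avoids
# shell quoting problems. Returns true on success, false on a non-zero
# exit status, and nil if pandoc could not be started at all.
def html_file_to_docx(html_path, docx_path)
  system(*pandoc_docx_command(html_path, docx_path))
end
```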
I know that if you serve an HTML document as a Word document with the .doc extension, it will open in Word just fine. Just don't do anything fancy.
Edit: Here is an example using classic ASP. http://www.aspdev.org/asp/asp-export-word/
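In Ruby, the same trick amounts to sending the HTML with a Word content type and a .doc filename. Here is a minimal sketch as a Rack-style response triple; the method name and default filename are made up for illustration, and in a Rails controller the equivalent would presumably be a send_data call with the same type and filename.

```ruby
# Build a Rack-style [status, headers, body] response that serves an
# HTML string as a downloadable .doc file.
def word_response(html, filename: "document.doc")
  headers = {
    "content-type" => "application/msword",
    "content-disposition" => %(attachment; filename="#{filename}")
  }
  [200, headers, [html]]
end
```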
Using a technique very similar to that suggested by Grant Wagner, I have created a Ruby HTML-to-Word gem that should allow you to easily output Word docx files from your Ruby app. You can check it out at http://github.com/nickfrandsen/htmltoword. Simply pass it an HTML string and it will create a corresponding Word docx file.
Hope you find it useful. If you have any problems with it, feel free to open a GitHub issue.
Disclosure: I'm the leader of the docxtemplater project.
I know you're looking for a Ruby solution, but because all the other solutions only tell you how to do it in general terms, without giving you a library that does exactly what you want, here's a solution based on JS or NodeJS (works in both)
DocxTemplater Library
Demo of the library
You can also use it in the commandline:
Doccy (doccyapp.com) has an API that does just that, which you can use. It supports docx, odt and Pages, and converts to PDF as well if you like.
Further to Grant's answer, you can also send Word a "Flat OPC" file, which is essentially the docx unzipped and concatenated to create a single XML file. This way, you can replace [USER-PLACEHOLDER] in one file and be done with it (i.e. no zipping or unzipping).
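With a Flat OPC template, the whole job reduces to a read, a string replace and a write. The following is a rough sketch under a few assumptions: the template has been saved from Word as a single XML file, the placeholder appears in it as one unbroken string, and the method name is hypothetical.

```ruby
# Fill a Flat OPC (single-file XML) template and write the result.
def fill_flat_opc(template_path, output_path, user_name)
  xml = File.read(template_path, encoding: "UTF-8")
  # Escape &, < and > so the user's name cannot break the XML.
  escaped = user_name.encode(xml: :text)
  File.write(output_path, xml.gsub("[USER-PLACEHOLDER]") { escaped })
  output_path
end
```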
If anyone is still looking at this, this post explains how to use an XML data source. This works nicely for me.
http://seroter.wordpress.com/2009/12/23/populating-word-2007-templates-through-open-xml/
Check out this GitHub repo: https://github.com/jawspeak/ruby-docx-templater
It allows you to create a document from a Word template.
If you're running on Windows, of course, it's a matter of WIN32OLE and some pain with the Word COM objects.
Chances are that you're serving from a *nix environment, though. Word 2007 uses the "Microsoft Office Open XML" format (*.docx), which can be opened using the appropriate compatibility pack from Microsoft.
Some of the more recent Office apps (2002/XP and 2003 at least) had their own XML formats, which may also be usable.
I'm not aware of any Ruby tools to make the process easier, sadly.
If it can be made acceptable, I think I'd be inclined to go down the renamed-HTML-file route. I just saved a document as HTML from Word XP, renamed it to a .doc and opened it without problem.
I encountered the same problem. Unfortunately, I could not manipulate the XML because my clients need to fill in the templates themselves, and that is not always possible (for example, Office for Mac does not allow it).
As a solution to this problem, I made a simple gem, which can be used as an RTF document template with embedded Ruby: https://github.com/eicca/rtf-templater
I tested it, and it works fine for filling in reports and documents. However, the formatting displays badly for complex loops and conditions.