Inserting HTML into an OpenXML Word document (.NET)

Posted 2024-07-07 04:48:08

Using the OpenXML SDK, I want to insert basic HTML snippets into a Word document.

How would you do this:

  • Manipulating the XML directly?
  • Using an XSLT?
  • Using AltChunk?

C# or VB examples are more than welcome :)

Comments (3)

三生殊途 2024-07-14 04:48:08

Here is another (relatively new) alternative:

http://notesforhtml2openxml.codeplex.com/

岁月无声 2024-07-14 04:48:08

It's hard to give general advice, because the best approach depends strongly on your input.

Here's a simple example that uses OpenXML SDK v2.0 and an XPathDocument to insert one paragraph into a DOCX document for each paragraph in an (X)HTML document:

    using System.Xml;
    using System.Xml.XPath;
    using DocumentFormat.OpenXml;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Wordprocessing;

    void ConvertHTML(string htmlFileName, string docFileName)
    {
        // Create a new Wordprocessing document.
        using (WordprocessingDocument package = WordprocessingDocument.Create(docFileName, WordprocessingDocumentType.Document))
        {
            // Add the main document part and give it an empty body.
            MainDocumentPart mainPart = package.AddMainDocumentPart();
            mainPart.Document = new Document(new Body());
            Body body = mainPart.Document.Body;

            // Load the (X)HTML input; this throws if it is not well-formed XML.
            XPathDocument htmlDoc = new XPathDocument(htmlFileName);

            XPathNavigator navigator = htmlDoc.CreateNavigator();
            XmlNamespaceManager mngr = new XmlNamespaceManager(navigator.NameTable);
            mngr.AddNamespace("xhtml", "http://www.w3.org/1999/xhtml");

            // Append one Word paragraph per XHTML <p>, using its text content only.
            XPathNodeIterator ni = navigator.Select("//xhtml:p", mngr);
            while (ni.MoveNext())
            {
                body.AppendChild(new Paragraph(new Run(new Text(ni.Current.Value))));
            }

            // Save changes to the main document part.
            mainPart.Document.Save();
        }
    }

The example requires your input to be valid XML, otherwise you will get an exception when creating the XPathDocument.

Please note that this is a very basic example that does not take any formatting, headings, lists, etc. into account.
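As a rough illustration of what handling formatting could look like, here is a minimal sketch that carries bold and italic from an XHTML paragraph over to Word runs. It assumes well-formed XHTML in the `http://www.w3.org/1999/xhtml` namespace and uses LINQ to XML instead of XPathDocument. The class and method names (`InlineFormatting`, `ToParagraph`, `ToRuns`) are made up for this example, and anything other than `b`, `strong`, `i` and `em` is flattened to plain text.

```csharp
using System.Collections.Generic;
using System.Xml.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

static class InlineFormatting
{
    static readonly XNamespace Xhtml = "http://www.w3.org/1999/xhtml";

    // Builds one Word paragraph from an XHTML <p> element, keeping bold and italic.
    public static Paragraph ToParagraph(XElement p)
    {
        var paragraph = new Paragraph();
        foreach (Run run in ToRuns(p, false, false))
            paragraph.AppendChild(run);
        return paragraph;
    }

    // Walks the element's children and emits one run per text node,
    // remembering whether an enclosing element asked for bold or italic.
    static IEnumerable<Run> ToRuns(XElement element, bool bold, bool italic)
    {
        foreach (XNode node in element.Nodes())
        {
            if (node is XText text)
            {
                var run = new Run();
                if (bold || italic)
                {
                    var props = new RunProperties();
                    if (bold) props.AppendChild(new Bold());
                    if (italic) props.AppendChild(new Italic());
                    run.AppendChild(props);
                }
                // Preserve leading/trailing spaces between adjacent runs.
                run.AppendChild(new Text(text.Value) { Space = SpaceProcessingModeValues.Preserve });
                yield return run;
            }
            else if (node is XElement child)
            {
                bool isBold = child.Name == Xhtml + "b" || child.Name == Xhtml + "strong";
                bool isItalic = child.Name == Xhtml + "i" || child.Name == Xhtml + "em";
                foreach (Run nested in ToRuns(child, bold || isBold, italic || isItalic))
                    yield return nested;
            }
        }
    }
}
```

In the example above, this would replace the `new Paragraph(new Run(new Text(...)))` call, with each `<p>` loaded as an `XElement`. Headings and lists would additionally need styles and numbering definitions in the document, which this sketch does not attempt.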

起风了 2024-07-14 04:48:08

I'm not sure what you are actually trying to achieve. OpenXML documents have their own HTML-like notation (WordprocessingML) for formatting elements such as paragraphs and bold text. If you want to add some text with basic formatting to a document, I would suggest using the OpenXML syntax and formatting the inserted text with that.
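For illustration, a minimal sketch of adding text with basic formatting through the OpenXML SDK's typed classes. The helper name `AppendWithBold` and its parameters are assumptions for this example only; it takes the `Body` of an already opened document.

```csharp
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

static class FormattedText
{
    // Hypothetical helper: adds one paragraph made of a plain run followed by a bold run.
    public static Paragraph AppendWithBold(Body body, string plain, string emphasized)
    {
        var paragraph = new Paragraph(
            new Run(new Text(plain) { Space = SpaceProcessingModeValues.Preserve }),
            new Run(
                new RunProperties(new Bold()),
                new Text(emphasized) { Space = SpaceProcessingModeValues.Preserve }));

        // If the body ends with section properties (w:sectPr), keep them last.
        SectionProperties sectPr = body.GetFirstChild<SectionProperties>();
        if (sectPr != null)
            body.InsertBefore(paragraph, sectPr);
        else
            body.AppendChild(paragraph);

        return paragraph;
    }
}
```

The bold run corresponds to `<w:r><w:rPr><w:b/></w:rPr><w:t>…</w:t></w:r>` in the underlying WordprocessingML.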

If you have an HTML snippet that you must include in the document as is, you can use the "external content" feature of OpenXML. With external content, you add the HTML document to the package and create a reference (altChunk) at the position in the document where you want it to appear. The disadvantage of this solution is that not all tools support the generated document (or support it properly), so I don't recommend it unless you really cannot change the HTML source.
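A minimal sketch of the altChunk approach with the OpenXML SDK, assuming the document is already open for editing. The helper name `AppendHtml` is made up for this example. Wrapping the fragment in `<html><body>` and prefixing a UTF-8 byte order mark are assumptions about what the consuming application expects, not something the answer above states.

```csharp
using System.IO;
using System.Linq;
using System.Text;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class HtmlChunk
{
    // Hypothetical helper: stores the HTML as a separate part and references it from the body.
    public static void AppendHtml(WordprocessingDocument doc, string htmlFragment)
    {
        string html = "<html><body>" + htmlFragment + "</body></html>";
        MainDocumentPart mainPart = doc.MainDocumentPart;

        // Add the HTML to the package as an alternative format import part.
        AlternativeFormatImportPart chunkPart =
            mainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.Html);

        // Assumption: a UTF-8 BOM helps the consumer detect the encoding.
        byte[] bytes = Encoding.UTF8.GetPreamble().Concat(Encoding.UTF8.GetBytes(html)).ToArray();
        using (var stream = new MemoryStream(bytes))
        {
            chunkPart.FeedData(stream);
        }

        // The w:altChunk element points at the part through its relationship id.
        var altChunk = new AltChunk { Id = mainPart.GetIdOfPart(chunkPart) };

        Body body = mainPart.Document.Body;
        SectionProperties sectPr = body.GetFirstChild<SectionProperties>();
        if (sectPr != null)
            body.InsertBefore(altChunk, sectPr);
        else
            body.AppendChild(altChunk);

        mainPart.Document.Save();
    }
}
```

The HTML stays in the package untouched; converting it into real WordprocessingML is left to whichever application opens the file, which is why support varies between tools.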

How to include any content (the WordprocessingML) in an OpenXML Word document is a separate question IMHO, and the answer depends very much on how complex the modifications are and how big the document is. For a simple document, I would simply read the document part out of the package, obtain its stream, and load it into an XmlDocument. You can insert additional content into the XmlDocument quite easily and then save it back to the package. If the document is big, or you need complex modifications in multiple places, XSLT is a good option.
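The XmlDocument route described above could be sketched as follows. The class and method names are hypothetical, and the inserted content is just a single plain paragraph; the point is the read, modify, write-back cycle on the main document part's stream.

```csharp
using System.IO;
using System.Xml;
using DocumentFormat.OpenXml.Packaging;

static class RawXmlInsert
{
    const string W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Adds <w:p><w:r><w:t>text</w:t></w:r></w:p> to the body of a loaded document.xml.
    public static void AppendParagraphXml(XmlDocument xml, string text)
    {
        var ns = new XmlNamespaceManager(xml.NameTable);
        ns.AddNamespace("w", W);
        XmlNode body = xml.SelectSingleNode("/w:document/w:body", ns);

        XmlElement p = xml.CreateElement("w", "p", W);
        XmlElement r = xml.CreateElement("w", "r", W);
        XmlElement t = xml.CreateElement("w", "t", W);
        t.InnerText = text;
        r.AppendChild(t);
        p.AppendChild(r);

        // Keep a trailing w:sectPr (if any) as the last child of the body.
        XmlNode sectPr = body.SelectSingleNode("w:sectPr", ns);
        if (sectPr != null)
            body.InsertBefore(p, sectPr);
        else
            body.AppendChild(p);
    }

    // Reads the main document part from the package, modifies it, and writes it back.
    public static void AppendParagraphToFile(string docFileName, string text)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(docFileName, true))
        {
            MainDocumentPart part = doc.MainDocumentPart;
            var xml = new XmlDocument();
            using (Stream input = part.GetStream(FileMode.Open, FileAccess.Read))
            {
                xml.Load(input);
            }

            AppendParagraphXml(xml, text);

            // FileMode.Create truncates the part before the new XML is written.
            using (Stream output = part.GetStream(FileMode.Create, FileAccess.Write))
            {
                xml.Save(output);
            }
        }
    }
}
```

The same load and save steps would apply if the modification were done with an XSLT transform instead of direct DOM calls.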
