OpenXml Sdk - Copy sections of a docx into another docx

Posted on 2024-10-11 04:01:43

I am trying the following code. It takes a fileName (docx file with many sections) and I try to iterate through each section getting the section name. The problem is that I end up with unreadable docx files. It does not error, but I think I am doing something wrong with getting the elements in the section.

public void Split(string fileName) {
    using (WordprocessingDocument myDoc =
        WordprocessingDocument.Open(fileName, true)) {
        string curCliCode = "";
        MainDocumentPart mdp = myDoc.MainDocumentPart;

        foreach (var element in mdp.Document.Body.ChildElements) {
            if (element.Descendants().OfType<SectionProperties>().Count() == 1) {
                //get the name of the section from the footer
                var footer = (FooterPart) mdp.GetPartById(
                    element.Descendants().OfType<SectionProperties>().First()
                        .OfType<FooterReference>().First()
                        .Id.Value);
                foreach (Paragraph p in footer.Footer.ChildElements.OfType<Paragraph>()) {
                    if (p.InnerText != "") {
                        curCliCode = p.InnerText;
                    }
                }
                if (curCliCode != "") {
                    var forFile = new List<OpenXmlElement>();
                    var els = element.ElementsBefore();
                    if (els != null) {
                        foreach (var e in els) {
                            if (e != null) {
                                forFile.Add(e);
                            }
                        }
                        for (int i = 0; i < els.Count(); i++) {
                            els.ElementAt(i).Remove();
                        }
                    }
                    Create(curCliCode, forFile);
                }
            }
        }
    }
}

private void Create(string cliCode, IEnumerable<OpenXmlElement> docParts) {
    var parts = from e in docParts select e.Clone();
    const string template = @"\Test\toSplit\blank.docx";
    string destination = string.Format(@"\Test\{0}.docx", cliCode);
    File.Copy(template, destination, true);
    /* Open the copied template and get its main document part */
    using (WordprocessingDocument myDoc =
        WordprocessingDocument.Open(destination, true)) {
        MainDocumentPart mainPart = myDoc.MainDocumentPart;
        /* Create the contents */
        foreach (var part in parts) {
            mainPart.Document.Body.Append((OpenXmlElement)part);
        }

        /* Save the results and close */
        mainPart.Document.Save();
        myDoc.Close();
    }
}

Does anyone know what the problem could be (or how to properly copy a section from one document to another)?

Comments (2)

睡美人的小仙女 2024-10-18 04:01:43

I've done some work in this area, and what I have found invaluable is diffing a known good file with a prospective file; the error is usually fairly obvious.

What I would do is take a file that you know works, and copy all of its sections into the template. Theoretically, the two files should be identical. Run a diff on the document.xml inside each docx file, and you'll see the difference.

BTW, I'm assuming that you know that the docx is actually a zip; change the extension to "zip", and you'll be able to get at the actual xml files which compose the format.

As for diff tools, I use Beyond Compare from Scooter Software.
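
For anyone who would rather script the comparison than rename files by hand, here is a minimal sketch of pulling document.xml out of two packages for diffing. It assumes the main part sits at the conventional path word/document.xml, which is typical for Word-produced files but is not guaranteed. The class name, method names, and output file names are made up for illustration. On .NET Framework, ZipFile also needs a reference to System.IO.Compression.FileSystem.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Xml.Linq;

public static class DocxXmlDump
{
    // Reads word/document.xml from a .docx package and returns it indented,
    // so that a line-based diff tool shows meaningful differences.
    // Assumption: the main document part is stored at "word/document.xml".
    public static string ReadDocumentXml(string docxPath)
    {
        using (ZipArchive zip = ZipFile.OpenRead(docxPath))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("No word/document.xml in " + docxPath);

            using (Stream stream = entry.Open())
            {
                // XDocument.ToString() indents the markup by default.
                return XDocument.Load(stream).ToString();
            }
        }
    }

    // Writes the indented XML of a known-good and a suspect package
    // into outputDir so the two files can be opened in a diff tool.
    public static void DumpForDiff(string goodDocx, string badDocx, string outputDir)
    {
        Directory.CreateDirectory(outputDir);
        File.WriteAllText(Path.Combine(outputDir, "good.document.xml"), ReadDocumentXml(goodDocx));
        File.WriteAllText(Path.Combine(outputDir, "bad.document.xml"), ReadDocumentXml(badDocx));
    }
}
```

Indenting matters here because Word usually writes document.xml as a single long line, which makes a plain line-based diff close to useless.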

无风消散 2024-10-18 04:01:43

An approach along the lines of what you are doing will work only for simple documents (i.e. those not containing images, hyperlinks, comments, etc.). To handle these more complex documents, take a look at http://blogs.msdn.com/b/ericwhite/archive/2009/02/05/move-insert-delete-paragraphs-in-word-processing-documents-using-the-open-xml-sdk.aspx and the resulting DocumentBuilder API (part of the PowerTools for Open XML project on CodePlex).

In order to split a docx into sections using DocumentBuilder, you'll still need to first find the index of the paragraphs containing sectPr elements.
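
Finding those indexes could look roughly like the sketch below, which uses only the Open XML SDK. It is an illustration and not part of DocumentBuilder: the class and method names are hypothetical. It also treats the body-level sectPr (the one describing the final section) as a section end, which may or may not be what you want when building ranges.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class SectionIndexFinder
{
    // Returns the zero-based indexes of the body's child elements that end
    // a section: elements containing a w:sectPr somewhere inside them
    // (normally a paragraph's properties), plus the body-level w:sectPr.
    public static List<int> FindSectionEndIndexes(Body body)
    {
        var indexes = new List<int>();
        int i = 0;
        foreach (OpenXmlElement child in body.ChildElements)
        {
            if (child is SectionProperties || child.Descendants<SectionProperties>().Any())
            {
                indexes.Add(i);
            }
            i++;
        }
        return indexes;
    }

    // Convenience overload: opens the package read-only and inspects its body.
    public static List<int> FindSectionEndIndexes(string docxPath)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(docxPath, false))
        {
            return FindSectionEndIndexes(doc.MainDocumentPart.Document.Body);
        }
    }
}
```

Each pair of consecutive indexes then gives a start and a count for one section, which is the kind of range a splitting routine would need. Note that the section-ending paragraph itself belongs to the section it closes, so it should be included in that range and not left behind.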
