How do I change the content of a content control in Word 2007 with the OpenXml SDK 2.0?

Posted on 2024-08-17 16:06:44

About to go mad with this problem. I'm sure it's so simple I'm just missing it, but I cannot for the life of me find out how to change the content of a content control in Word 2007 with the OpenXml SDK v2.0 in C#.

I have created a Word document with a plain text content control. The tag for this control is "FirstName". In code, I'd like to open up the Word document, find this content control, and change the content without losing the formatting.

The solution I finally got to work involved finding the content control, inserting a run after it, then removing the content control as such:

using (WordprocessingDocument wordProcessingDocument = WordprocessingDocument.Open(filePath, true)) {
    MainDocumentPart mainDocumentPart = wordProcessingDocument.MainDocumentPart;

    // SingleOrDefault returns null when nothing matches; Single would throw instead.
    SdtRun sdtRun = mainDocumentPart.Document.Descendants<SdtRun>()
        .Where(run => {
            Tag tag = run.SdtProperties.GetFirstChild<Tag>();
            return tag != null && tag.Val == "FirstName";
        })
        .SingleOrDefault();

    if (sdtRun != null) {
        // The new Run has no RunProperties, which is why the formatting is lost.
        sdtRun.Parent.InsertAfter(new Run(new Text("John")), sdtRun);
        sdtRun.Remove();
    }
}

This does change the text, but I lose all formatting. Does anyone know how I can do this?

Comments (6)

指尖凝香 2024-08-24 16:06:44

I found a better way to do the above using http://wiki.threewill.com/display/enterprise/SharePoint+and+Open+XML#SharePointandOpenXML-UsingWord2007ContentControls as a reference. Your results may vary but I think this will get you off to a good start:

using (WordprocessingDocument wordprocessingDocument = WordprocessingDocument.Open(filePath, true)) {
    MainDocumentPart mainDocumentPart = wordprocessingDocument.MainDocumentPart;
    var sdtRuns = mainDocumentPart.Document.Descendants<SdtRun>()
        .Where(run => run.SdtProperties.GetFirstChild<Tag>().Val.Value == contentControlTagValue);

    foreach (SdtRun sdtRun in sdtRuns) {
        // Only the first Text element is changed, so its run keeps its formatting.
        sdtRun.Descendants<Text>().First().Text = replacementText;
    }

    mainDocumentPart.Document.Save();
}

I think the above will only work for Plain Text content controls. Unfortunately, it doesn't get rid of the content control in the final document. If I get that working I'll post it.
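
One possible way to drop the control while keeping its formatted runs is to move the children of its SdtContentRun out in front of the control and then remove the empty shell. The sketch below is only an illustration under that assumption: ContentControlUnwrapper and ReplaceAndUnwrap are hypothetical names, not part of the SDK, and the code handles inline (SdtRun) controls only.

```csharp
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public static class ContentControlUnwrapper
{
    // Hypothetical helper: writes replacementText into the control, then
    // replaces the control with its own child runs so their RunProperties survive.
    // The control must be attached to a parent element (e.g. a Paragraph).
    public static void ReplaceAndUnwrap(SdtRun sdtRun, string replacementText)
    {
        SdtContentRun content = sdtRun.SdtContentRun;
        if (content != null)
        {
            var texts = content.Descendants<Text>().ToList();
            if (texts.Count == 0)
            {
                // Empty control: there is no formatting to inherit, add a plain run.
                content.AppendChild(new Run(new Text(replacementText)));
            }
            else
            {
                // Keep the first Text (and its run's formatting), drop the others.
                texts[0].Text = replacementText;
                foreach (Text extra in texts.Skip(1)) extra.Remove();
            }

            // Move every child of the content element in front of the control,
            // preserving their original order.
            foreach (OpenXmlElement child in content.ChildElements.ToList())
            {
                child.Remove();
                sdtRun.InsertBeforeSelf(child);
            }
        }

        // Finally remove the now-empty control.
        sdtRun.Remove();
    }
}
```

A control that still shows its placeholder text may carry the placeholder's run style, so the moved run might need its RunStyle adjusted afterwards.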

http://msdn.microsoft.com/en-us/library/cc197932.aspx is also a good reference if you want to find a rich text content control. This one talks about adding rows to a table that was placed in a rich text content control.

独自唱情﹋歌 2024-08-24 16:06:44

Your first approach, removing the sdtRun and adding a new run, will obviously remove the formatting, because you are adding only a Run without a RunStyle. To preserve the formatting, you should create run elements like

new Run( new RunProperties(new RunStyle(){ Val = "MyStyle" }),
                            new Text("Replacement Text"));

Your second approach, replacing the first Descendants<Text> element, will work for Plain Text content controls only. A Rich Text content control placed at paragraph level is not an SdtRun; it is an SdtBlock with an SdtContentBlock, and it can hold multiple paragraphs, multiple Runs and multiple Texts. So your code, sdtRun.Descendants<Text>().First().Text = replacementText, will be flawed for a Rich Text content control. There is no one-line way to replace the entire text of a rich text content control and still preserve all the formatting.
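
For a block-level control, one simple compromise is to write the new text into the first Text element and blank out the rest, which keeps only the first run's formatting. This is a rough sketch under that assumption; BlockControlFiller and ReplaceText are hypothetical names, not SDK members.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Wordprocessing;

public static class BlockControlFiller
{
    // Hypothetical helper: puts replacementText into the first Text element of a
    // block-level control and empties every other Text element.
    // Returns false when the control has no Text element to write into.
    public static bool ReplaceText(SdtBlock sdtBlock, string replacementText)
    {
        SdtContentBlock content = sdtBlock.SdtContentBlock;
        if (content == null) return false;

        var texts = content.Descendants<Text>().ToList();
        if (texts.Count == 0) return false;

        // The first run keeps its RunProperties; later runs and paragraphs
        // stay in place but no longer contribute any text.
        texts[0].Text = replacementText;
        foreach (Text extra in texts.Skip(1)) extra.Text = string.Empty;
        return true;
    }
}
```

Any additional paragraphs remain in the control as empty paragraphs, so a real template may also want to remove them.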

I did not understand what you mean by "it doesn't get rid of the content control in the final document". I thought your requirement was to change only the text (content) while preserving the content control and the formatting.

清风无影 2024-08-24 16:06:44

One EXCELLENT way to work out how to achieve the desired result is to use the document reflector tool that comes with the Open XML SDK 2.0.

For example, you could:

  1. In the Properties dialog for each of the content controls in your document, check the "Remove content control when contents are edited" option.
  2. Fill them in and save it as a new doc.
  3. Use the reflector to compare the original and the saved version.
  4. Hit the show/hide code button and it will show you the code required to turn the original into the filled in version.

It's not perfect, but it's amazingly useful. You can also just look directly at the markup of either document and see the changes that filling in the controls caused.

This is a somewhat brittle way to do it because WordprocessingML can be complicated; it's easy to mess it up. For simple text controls, I just use this method:

private void FillSimpleTextCC(SdtRun simpleTextCC, string replacementText)
{
    // Remove the ShowingPlaceholder element so the content counts as real text.
    SdtProperties ccProperties = simpleTextCC.SdtProperties;
    ccProperties.RemoveAllChildren<ShowingPlaceholder>();

    // Fetch the Run element inside the content block.
    SdtContentRun contentRun = simpleTextCC.SdtContentRun;
    var ccRun = contentRun.GetFirstChild<Run>();

    // If there was no placeholder text in the content control, the SdtContentRun
    // block is empty and ccRun is null, so create a new Run.
    if (ccRun == null)
    {
        ccRun = new Run(new RunProperties(), new Text());
        contentRun.Append(ccRun);
    }

    // Remove the revision identifier and replace the text.
    ccRun.RsidRunProperties = null;
    var ccText = ccRun.GetFirstChild<Text>();
    if (ccText == null)
    {
        ccText = ccRun.AppendChild(new Text());
    }
    ccText.Text = replacementText;

    // Set the run style to the one stored in the SdtProperties block, if there
    // is one. Otherwise the existing style is kept.
    var props = ccProperties.GetFirstChild<RunProperties>();
    if (props != null && props.RunStyle != null)
    {
        // An existing run may have no RunProperties yet.
        if (ccRun.RunProperties == null)
        {
            ccRun.RunProperties = new RunProperties();
        }
        ccRun.RunProperties.RunStyle = new RunStyle() { Val = props.RunStyle.Val };
        ccRun.RunProperties.RunFonts = null;
    }
}

Hope that helps in some way. :D

青朷 2024-08-24 16:06:44

I also had to find and replace text in the footers. You can find them using the following code:

using (WordprocessingDocument wordprocessingDocument = WordprocessingDocument.Open(file.PhysicalFile.FullName, true)) {
    foreach (FooterPart footerPart in wordprocessingDocument.MainDocumentPart.FooterParts) {
        var footerPartSdtRuns = footerPart.Footer.Descendants<SdtRun>()
            .Where(run => run.SdtProperties.GetFirstChild<Tag>().Val.Value == contentControlTag);

        foreach (SdtRun sdtRun in footerPartSdtRuns) {
            sdtRun.Descendants<Text>().First().Text = replacementTerm;
        }

        // Each footer is a separate part, so save it explicitly.
        footerPart.Footer.Save();
    }

    wordprocessingDocument.MainDocumentPart.Document.Save();
}

完美的未来在梦里 2024-08-24 16:06:44

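Another solution would be to edit the control's XML as a string and swap the element (p is assumed to be the Paragraph that contains the control):

        SdtRun rOld = p.Elements<SdtRun>().First();

        // The replacement goes into raw XML, so it must already be XML-escaped.
        string OldNodeXML = rOld.OuterXml;
        string NewNodeXML = OldNodeXML.Replace("SearchString", "ReplacementString");

        SdtRun rNew = new SdtRun(NewNodeXML);

        p.ReplaceChild<SdtRun>(rNew, rOld);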

丶情人眼里出诗心の 2024-08-24 16:06:44

CONTENT-CONTROL TYPES

Depending on the insertion point in the Word document, there are two types of content-controls that are created:

  • Top-level (at the same level as paragraphs)

  • Nested (typically within an existing paragraph)

Confusingly, in the XML, both types are tagged as <sdt>...</sdt> but the underlying Open XML classes are different.
For top-level, the root is SdtBlock and the content is SdtContentBlock. For nested, it is SdtRun and SdtContentRun.

To get both types, i.e. all content-controls, it is better to iterate via the common base class, SdtElement, and then check the type:

List<SdtElement> sdtList = document.Descendants<SdtElement>().ToList();

foreach( SdtElement sdt in sdtList )
{
   if( sdt is SdtRun )
   {
      ; // process nested sdts
   }

   if( sdt is SdtBlock )
   {
      ; // process top-level sdts
   }
}

For a document template, all content-controls should be processed. It is common for more than one content-control to have the same tag-name, e.g. customer-name, and all of them typically need to be replaced with the actual customer name.

CONTENT-CONTROL TAG NAME

The content-control tag-name will never be split.

In the XML, this is:

<w:sdt>
...
<w:sdtPr>
...
<w:tag w:val="customer-name"/>

Because the tag-name is never split, it can always be found with a direct match:

   List<SdtElement> sdtList = document.Descendants<SdtElement>().ToList();

   foreach( SdtElement sdt in sdtList )
   {
       if( sdt is SdtRun )
       {
          String tagName = sdt.SdtProperties.GetFirstChild<Tag>().Val;

          if( tagName == "customer-name" )
          {
             ; // get & replace placeholder with actual value
          }
       }
   }

Obviously, in the above code, there would need to be a more elegant mechanism to retrieve the actual value corresponding to each different tag-name.
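
One simple candidate for such a mechanism is a dictionary keyed by tag-name. The sketch below assumes inline controls only and writes each mapped value into the control's first Text element; TagValueFiller, Fill and valuesByTag are hypothetical names introduced for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public static class TagValueFiller
{
    // Hypothetical helper: for every inline content control under root whose
    // tag-name is a key in valuesByTag, writes the mapped value into the
    // control's first Text element. Returns the number of controls updated.
    public static int Fill(OpenXmlElement root, IDictionary<string, string> valuesByTag)
    {
        int updated = 0;

        foreach (SdtRun sdt in root.Descendants<SdtRun>().ToList())
        {
            // Controls without a tag are skipped rather than throwing.
            string tagName = sdt.SdtProperties?.GetFirstChild<Tag>()?.Val?.Value;
            if (tagName == null) continue;

            string value;
            if (!valuesByTag.TryGetValue(tagName, out value)) continue;

            Text text = sdt.Descendants<Text>().FirstOrDefault();
            if (text == null) continue;

            text.Text = value;
            updated++;
        }

        return updated;
    }
}
```

Because every matching control is visited, several controls that share one tag-name all receive the same value.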

CONTENT-CONTROL TEXT

Within a content-control, it is very common for the rendered text to be split into multiple runs (despite each run having the same properties).

Among other things, this is caused by the spelling/grammar checker and by the number of editing attempts.
Text splitting is more common when delimiters are used, e.g. [customer-name].

This matters because, without checking the XML, there is no guarantee that the placeholder text has not been split; if it has been split, a search for it fails and it cannot be replaced.
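
To illustrate the splitting problem, the text of all runs can be concatenated before looking for a placeholder. This is a small sketch only; SplitTextHelper and JoinText are hypothetical names, not SDK members.

```csharp
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public static class SplitTextHelper
{
    // Hypothetical helper: concatenates the Text elements under an element in
    // document order, so a placeholder split across runs can still be matched.
    public static string JoinText(OpenXmlElement element)
    {
        return string.Concat(element.Descendants<Text>().Select(t => t.Text));
    }
}
```

A search on any single Text element would miss a placeholder stored as "[custo", "mer-na" and "me]", while the joined string contains it.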

ONE SUGGESTED APPROACH

One suggested approach is to use only plain-text content-controls, top-level and/or nested, then:

  • Find the content-control by tag-name

  • Insert a formatted paragraph or run after the content-control

  • Delete the content-control

     List<SdtElement> sdtList = document.Descendants<SdtElement>().ToList();
    
     foreach( SdtElement sdt in sdtList )
     {
        if( sdt is SdtRun )
        {
           String tagName = sdt.SdtProperties.GetFirstChild<Tag>().Val;
    
           String newText = "new text"; // eg GetTextByTag( tagName );
    
           // should use a style or common run props
    
           RunProperties runProps = new RunProperties();
    
           runProps.Color    = new Color   () { Val   = "000000" };
           runProps.FontSize = new FontSize() { Val   = "23" };
           runProps.RunFonts = new RunFonts() { Ascii = "Calibri" };
    
           Run run = new Run();
    
           run.Append( runProps );
           run.Append( new Text( newText ) );
    
           sdt.InsertAfterSelf( run );
    
           sdt.Remove();
        }
    
        if( sdt is SdtBlock )
        {
           ; // add paragraph
        }
     }
    

For top-level types, a paragraph would need to be inserted.
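
A minimal sketch of that top-level case follows. It assumes that copying the ParagraphProperties of the control's first paragraph is enough to keep the paragraph formatting; BlockControlReplacer and ReplaceWithParagraph are hypothetical names.

```csharp
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public static class BlockControlReplacer
{
    // Hypothetical helper: inserts a new paragraph after a top-level control and
    // removes the control. The control must be attached to a parent (e.g. Body).
    public static Paragraph ReplaceWithParagraph(SdtBlock sdtBlock, string newText)
    {
        var paragraph = new Paragraph();

        // Reuse the paragraph formatting of the control's first paragraph, if any.
        // ParagraphProperties must be the first child of the new paragraph.
        ParagraphProperties source = sdtBlock.Descendants<ParagraphProperties>().FirstOrDefault();
        if (source != null)
        {
            paragraph.AppendChild((ParagraphProperties)source.CloneNode(true));
        }

        paragraph.AppendChild(new Run(new Text(newText)));

        sdtBlock.InsertAfterSelf(paragraph);
        sdtBlock.Remove();
        return paragraph;
    }
}
```

Run formatting is not copied here; it would be added to the new Run in the same way as in the nested case.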

In this approach, content-controls are used only as placeholders that are guaranteed to be found (by tag-name) and are then entirely replaced with the appropriate text (which is consistently formatted).

Also, this removes the need to format the content-control text (which may then be split and so cannot be found).

Using a suitable naming convention for the tag-names, e.g. XPath expressions, enables further possibilities such as using other XML documents to populate templates.
