How do I enumerate a Word document using the Office interop API?

Posted 2024-09-15 01:44:07 · Length: 575 · Views: 1 · Comments: 0

I want to traverse all the elements of a Word document one by one and process each element according to its type (header, sentence, table, image, text box, shape, etc.). I searched the Office interop API for an enumerator or object that represents the elements of a document but failed to find one. The API offers Sentences, Paragraphs and Shapes collections, but it doesn't provide a generic object that can point to the next element.
For example:

<header of document>
<plain text sentences>
<table with many rows,columns>
<text box>
<image>
<footer>

(Please imagine this as a Word document.)


So I want an enumerator that first gives me <header of document>, then on the next iteration gives me <plain text sentences>, then <table with many rows,columns>, and so on.
Does anyone know how we can achieve this? Is it possible?

I am using C#, Visual Studio 2005 and Word 2003.

Thanks a lot

Comments (2)

野生奥特曼 2024-09-22 01:44:08

For example:

        // Requires a reference to the Word primary interop assembly, plus:
        //   using System.Diagnostics;
        //   using Word = Microsoft.Office.Interop.Word;

        // Open the file
        Word.ApplicationClass app = new Word.ApplicationClass();
        object path = @"c:\Users\name\Desktop\Весь набор.docx";
        object missing = System.Reflection.Missing.Value;

        Word.Document doc = null;
        try
        {
            doc = app.Documents.Open(ref path,
                ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing);

            // Sections, with the character range each one covers
            foreach (Word.Section section in doc.Sections)
            {
                Debug.WriteLine("Section index: " + section.Index);
                Debug.WriteLine("Section start: " + section.Range.Start + ", section end: " + section.Range.End);
            }

            // Paragraphs (this also visits the paragraphs inside table cells)
            foreach (Word.Paragraph paragraph in doc.Paragraphs)
            {
                string toWrite = paragraph.Range.Text;
                Debug.WriteLine(toWrite);
            }

            // Tables, cell by cell (Rows throws if the table has vertically merged cells)
            foreach (Word.Table table in doc.Tables)
            {
                foreach (Word.Row wRow in table.Rows)
                    foreach (Word.Cell cell in wRow.Cells)
                    {
                        Debug.WriteLine(cell.Range.Text);
                    }
            }
        }
        finally
        {
            // Quit Word without saving, even if Open failed.
            // Quit takes "ref object" arguments, so the flag must be declared as object.
            object saveChanges = Word.WdSaveOptions.wdDoNotSaveChanges;
            app.Quit(ref saveChanges, ref missing, ref missing);
        }

够运 2024-09-22 01:44:07

The reason that you don't have a simple iterator is that Word documents can be far more complex than the simple structure outlined in your question.

For example, a document may have separate headers and footers for the first page and for even and odd pages, and it may contain more than one section, each with a different header and footer setup. It may also contain footnotes, comments and revisions, and objects such as tables, text boxes, images and shapes may appear inline with the text or floating. In short, there is no fixed sequence of elements.
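
To illustrate the header and footer part, here is a minimal sketch that walks every section and prints each header and footer variant that actually exists. The class and method names (HeaderFooterDump, Dump) are made up for this example; it assumes a reference to the Word interop assembly and an already opened Word.Document.

```csharp
using System.Diagnostics;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper class, not part of the interop API.
static class HeaderFooterDump
{
    // The three header/footer variants a section can define.
    static readonly Word.WdHeaderFooterIndex[] Slots = new Word.WdHeaderFooterIndex[]
    {
        Word.WdHeaderFooterIndex.wdHeaderFooterPrimary,
        Word.WdHeaderFooterIndex.wdHeaderFooterFirstPage,
        Word.WdHeaderFooterIndex.wdHeaderFooterEvenPages
    };

    // Writes the text of every existing header and footer, section by section.
    public static void Dump(Word.Document doc)
    {
        foreach (Word.Section section in doc.Sections)
        {
            foreach (Word.WdHeaderFooterIndex slot in Slots)
            {
                Word.HeaderFooter header = section.Headers[slot];
                if (header.Exists)
                {
                    Debug.WriteLine("Section " + section.Index + " header (" + slot + "): "
                        + header.Range.Text);
                }

                Word.HeaderFooter footer = section.Footers[slot];
                if (footer.Exists)
                {
                    Debug.WriteLine("Section " + section.Index + " footer (" + slot + "): "
                        + footer.Range.Text);
                }
            }
        }
    }
}
```

Headers and footers live in their own stories, outside the main text, so they have to be visited separately from the body like this.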

You would have to check how complex your input documents are and based on the result of that analysis decide how to iterate over paragraphs and attached images and shapes etc.
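
For simple documents, one way to approximate the enumerator asked for in the question is to collect the body elements from the separate collections and sort them by their position in the main text. The sketch below is only one possible approach under stated assumptions: BodyItem and BodyWalker are hypothetical names, headings are detected by outline level, floating shapes (including text boxes) are placed at their anchor position, and headers, footers, footnotes and comments are ignored.

```csharp
using System;
using System.Collections.Generic;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical record: one body element and the offset where it starts in the main text.
class BodyItem
{
    public readonly int Start;
    public readonly string Kind;     // "Heading", "Paragraph", "Table", "InlineShape" or "Shape"
    public readonly object Element;  // the underlying interop object

    public BodyItem(int start, string kind, object element)
    {
        Start = start;
        Kind = kind;
        Element = element;
    }
}

// Hypothetical helper class, not part of the interop API.
static class BodyWalker
{
    // Orders items by their start offset in the document.
    public static int CompareByStart(BodyItem a, BodyItem b)
    {
        return a.Start.CompareTo(b.Start);
    }

    public static List<BodyItem> Collect(Word.Document doc)
    {
        List<BodyItem> items = new List<BodyItem>();

        foreach (Word.Table table in doc.Tables)
            items.Add(new BodyItem(table.Range.Start, "Table", table));

        foreach (Word.Paragraph p in doc.Paragraphs)
        {
            // Paragraphs inside a table are reached through the table itself.
            if (p.Range.Tables.Count > 0)
                continue;

            // Assumption: anything that is not body text in the outline is a heading.
            bool isHeading = p.OutlineLevel != Word.WdOutlineLevel.wdOutlineLevelBodyText;
            items.Add(new BodyItem(p.Range.Start, isHeading ? "Heading" : "Paragraph", p));
        }

        foreach (Word.InlineShape inline in doc.InlineShapes)
            items.Add(new BodyItem(inline.Range.Start, "InlineShape", inline));

        // Floating shapes have no place in the text flow, so use the anchor position.
        foreach (Word.Shape shape in doc.Shapes)
            items.Add(new BodyItem(shape.Anchor.Start, "Shape", shape));

        items.Sort(CompareByStart);
        return items;
    }
}
```

A foreach over BodyWalker.Collect(doc) can then switch on item.Kind and cast item.Element to the matching interop type. List.Sort is not stable, so items that share a start offset (for example an inline image at the very beginning of a paragraph) come back in no particular order; whether that matters depends on how complex the input documents are.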
