Getting the headings from a Word document

Posted on 2024-07-07 22:08:01

How do I get a list of all the headings in a Word document using VBA?


Comments (7)

妄司 2024-07-14 22:08:01

Do you mean something like this CreateOutline macro, which copies all the headings from a source Word document into a new Word document?

(I believe the call astrHeadings = docSource.GetCrossReferenceItems(wdRefTypeHeading) is the key to this program, and it should let you retrieve what you are asking for.)

Public Sub CreateOutline()
    Dim docOutline As Word.Document
    Dim docSource As Word.Document
    Dim rng As Word.Range
    
    Dim astrHeadings As Variant
    Dim strText As String
    Dim intLevel As Integer
    Dim intItem As Integer
        
    Set docSource = ActiveDocument
    Set docOutline = Documents.Add
    
    ' Content returns only the main body of the document, not the headers and footers.
    Set rng = docOutline.Content
    ' GetCrossReferenceItems(wdRefTypeHeading) returns an array of strings, one per heading in the document.
    astrHeadings = docSource.GetCrossReferenceItems(wdRefTypeHeading)
    
    For intItem = LBound(astrHeadings) To UBound(astrHeadings)
        ' Get the text and the level.
        strText = Trim$(astrHeadings(intItem))
        intLevel = GetLevel(CStr(astrHeadings(intItem)))
        
        ' Add the text to the document.
        rng.InsertAfter strText & vbNewLine
        
        ' Set the style of the selected range and
        ' then collapse the range for the next entry.
        rng.Style = "Heading " & intLevel
        rng.Collapse wdCollapseEnd
    Next intItem
End Sub

Private Function GetLevel(strItem As String) As Integer
    ' Return the heading level of an entry from the
    ' array returned by Word.
    
    ' The number of leading spaces indicates the
    ' outline level (2 spaces per level: H1 has
    ' 0 spaces, H2 has 2 spaces, H3 has 4 spaces).
        
    Dim strTemp As String
    Dim strOriginal As String
    Dim intDiff As Integer
    
    ' Get rid of all trailing spaces.
    strOriginal = RTrim$(strItem)
    
    ' Trim leading spaces, and then compare with
    ' the original.
    strTemp = LTrim$(strOriginal)
    
    ' Subtract to find the number of
    ' leading spaces in the original string.
    intDiff = Len(strOriginal) - Len(strTemp)
    ' Integer division (\) keeps the result a whole level.
    GetLevel = (intDiff \ 2) + 1
End Function
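
If only the list itself is needed, rather than a new outline document, the same GetCrossReferenceItems call can feed the Immediate window directly. The following is a minimal sketch and not part of the original answer: ListHeadings and LevelFromItem are hypothetical names, and the level calculation assumes the two-spaces-per-level indentation described in the GetLevel comments.

```vba
' Hypothetical helper: derive the heading level from the leading
' spaces of one GetCrossReferenceItems entry (assumes 2 spaces per level).
Public Function LevelFromItem(ByVal strItem As String) As Long
    Dim lngSpaces As Long
    lngSpaces = Len(strItem) - Len(LTrim$(strItem))
    LevelFromItem = (lngSpaces \ 2) + 1
End Function

' Print every heading of the active document, preceded by its level.
Public Sub ListHeadings()
    Dim varHeadings As Variant
    Dim i As Long

    varHeadings = ActiveDocument.GetCrossReferenceItems(wdRefTypeHeading)

    ' Defensive check: skip the loop if Word did not hand back an array.
    If Not IsArray(varHeadings) Then Exit Sub

    For i = LBound(varHeadings) To UBound(varHeadings)
        Debug.Print LevelFromItem(CStr(varHeadings(i))) & vbTab & _
                    Trim$(CStr(varHeadings(i)))
    Next i
End Sub
```

Run ListHeadings from the VBA editor with the Immediate window open (Ctrl+G) to see one line per heading.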

UPDATE by @kol on March 6, 2018

Although astrHeadings is an array (IsArray returns True, and TypeName returns String()) I get a type mismatch error when I try to access its elements in VBScript (v5.8.16384 on Windows 10 Pro 1709 16299.248). This must be a VBScript-specific problem, because I can access the elements if I run the same code in Word's VBA editor. I ended up iterating the lines of the TOC, because it works even from VBScript:

For Each Paragraph In Doc.TablesOfContents(1).Range.Paragraphs
  WScript.Echo Paragraph.Range.Text
Next
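
Each paragraph returned by the TOC loop still holds the title and the page number in one string. Assuming the table of contents was built with page numbers, so that the last tab character separates the title from the page number, two small helpers can split them. This is only a sketch under that assumption; TocTitle and TocPage are hypothetical names. It is written as VBA, so for VBScript the type declarations (ByVal ... As String, As Long) would have to be dropped.

```vba
' Hypothetical helpers for one line of TOC text such as
' "Introduction" & vbTab & "3" & vbCr.

' Text before the last tab (the whole line when there is no tab).
Public Function TocTitle(ByVal strLine As String) As String
    Dim strClean As String
    Dim lngTab As Long

    ' Drop the paragraph mark that Range.Text carries.
    strClean = Replace$(strLine, vbCr, "")
    lngTab = InStrRev(strClean, vbTab)

    If lngTab > 0 Then
        TocTitle = Left$(strClean, lngTab - 1)
    Else
        TocTitle = strClean
    End If
End Function

' Text after the last tab (an empty string when there is no tab).
Public Function TocPage(ByVal strLine As String) As String
    Dim strClean As String
    Dim lngTab As Long

    strClean = Replace$(strLine, vbCr, "")
    lngTab = InStrRev(strClean, vbTab)

    If lngTab > 0 Then TocPage = Mid$(strClean, lngTab + 1)
End Function
```

Inside the loop, TocTitle(Paragraph.Range.Text) would then give the heading text without the page number. For numbered headings the number and its tab stay in the title part, because only the last tab is treated as the separator.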
煮酒 2024-07-14 22:08:01

The easiest way to get a list of headings is to loop through the paragraphs in the document, for example:

Sub ReadPara()
    Dim DocPara As Paragraph

    For Each DocPara In ActiveDocument.Paragraphs
        ' Range.Style yields the style name, e.g. "Heading 1".
        If Left(DocPara.Range.Style, Len("Heading")) = "Heading" Then
            Debug.Print DocPara.Range.Text
        End If
    Next
End Sub

By the way, I find it is a good idea to remove the final character of the paragraph range, which is the paragraph mark. Otherwise, if you send the string to a message box or a document, Word displays an extra control character. For example:

Left(DocPara.Range.Text, Len(DocPara.Range.Text) - 1)
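
Matching on the style name works as long as the heading styles have names that start with "Heading", which may not hold in localized versions of Word or with custom styles. A possible alternative, sketched here as an assumption and not as part of the original answer, is to test each paragraph's OutlineLevel instead and to strip the trailing paragraph mark in a small helper. ReadHeadingsByOutlineLevel and TrimParagraphMark are hypothetical names.

```vba
' Hypothetical helper: drop the trailing paragraph mark (vbCr) that
' Range.Text includes at the end of a paragraph.
Public Function TrimParagraphMark(ByVal strText As String) As String
    If Right$(strText, 1) = vbCr Then
        TrimParagraphMark = Left$(strText, Len(strText) - 1)
    Else
        TrimParagraphMark = strText
    End If
End Function

' Print every paragraph whose outline level is not "body text",
' preceded by that level (1 to 9).
Public Sub ReadHeadingsByOutlineLevel()
    Dim para As Paragraph

    For Each para In ActiveDocument.Paragraphs
        If para.OutlineLevel <> wdOutlineLevelBodyText Then
            Debug.Print para.OutlineLevel & vbTab & _
                        TrimParagraphMark(para.Range.Text)
        End If
    Next para
End Sub
```

Note that this also picks up paragraphs that were given an outline level without a heading style, which may or may not be what you want.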
习ぎ惯性依靠 2024-07-14 22:08:01

This macro worked beautifully for me (Word 2010). I've extended the functionality slightly: it now prompts the user for the lowest heading level to include and suppresses subheadings below that level.

Public Sub CreateOutline()
' from http://stackoverflow.com/questions/274814/getting-the-headings-from-a-word-document
    Dim docOutline As Word.Document
    Dim docSource As Word.Document
    Dim rng As Word.Range

    Dim astrHeadings As Variant
    Dim strText As String
    Dim intLevel As Integer
    Dim intItem As Integer
    Dim minLevel As Integer

    Set docSource = ActiveDocument
    Set docOutline = Documents.Add

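    ' Headings deeper than this level won't be copied.
    ' InputBox takes (prompt, title, default), so the default "2" goes in the third position.
    ' Val turns a cancelled or non-numeric entry into 0 instead of raising a type mismatch.
    minLevel = CInt(Val(InputBox("This macro will generate a new document that contains only the headings from the existing document. What is the lowest level heading you want?", , "2")))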

    ' Content returns only the
    ' main body of the document, not
    ' the headers and footer.
    Set rng = docOutline.Content
    astrHeadings = _
     docSource.GetCrossReferenceItems(wdRefTypeHeading)

    For intItem = LBound(astrHeadings) To UBound(astrHeadings)
        ' Get the text and the level.
        strText = Trim$(astrHeadings(intItem))
        intLevel = GetLevel(CStr(astrHeadings(intItem)))

        If intLevel <= minLevel Then

            ' Add the text to the document.
            rng.InsertAfter strText & vbNewLine

            ' Set the style of the selected range and
            ' then collapse the range for the next entry.
            rng.Style = "Heading " & intLevel
            rng.Collapse wdCollapseEnd
        End If
    Next intItem
End Sub

Private Function GetLevel(strItem As String) As Integer
    ' from http://stackoverflow.com/questions/274814/getting-the-headings-from-a-word-document
    ' Return the heading level of an entry from the
    ' array returned by Word.

    ' The number of leading spaces indicates the
    ' outline level (2 spaces per level: H1 has
    ' 0 spaces, H2 has 2 spaces, H3 has 4 spaces).

    Dim strTemp As String
    Dim strOriginal As String
    Dim intDiff As Integer

    ' Get rid of all trailing spaces.
    strOriginal = RTrim$(strItem)

    ' Trim leading spaces, and then compare with
    ' the original.
    strTemp = LTrim$(strOriginal)

    ' Subtract to find the number of
    ' leading spaces in the original string.
    intDiff = Len(strOriginal) - Len(strTemp)
    ' Integer division (\) keeps the result a whole level.
    GetLevel = (intDiff \ 2) + 1
End Function
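
InputBox returns an empty string when the user cancels, and a bare CInt would raise a type mismatch on it, so it can help to validate the entry in one place. A minimal sketch, assuming valid heading levels run from 1 to 9; ParseHeadingLevel is a hypothetical helper and not part of the posted macro.

```vba
' Hypothetical helper: turn the text typed into the InputBox into a
' heading level between 1 and 9. Returns lngDefault when the text is
' empty, non-numeric, or out of range.
Public Function ParseHeadingLevel(ByVal strInput As String, _
                                  ByVal lngDefault As Long) As Long
    Dim strClean As String
    strClean = Trim$(strInput)

    ' Start from the fallback and only overwrite it for valid input.
    ParseHeadingLevel = lngDefault
    If Len(strClean) = 0 Then Exit Function
    If Not IsNumeric(strClean) Then Exit Function
    If CDbl(strClean) < 1 Or CDbl(strClean) > 9 Then Exit Function

    ParseHeadingLevel = CLng(strClean)
End Function
```

It could be used as minLevel = ParseHeadingLevel(InputBox("Lowest heading level to copy?", , "2"), 2), so that cancelling the dialog falls back to level 2.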
成熟的代价 2024-07-14 22:08:01

The fastest method for extracting all headings (down to level 5).

Sub EXTRACT_HDNGS()
    Dim WDApp As Word.Application    ' Word application
    Dim WDDoc As Word.Document       ' Word document
    Dim Head_n As Long               ' heading level being searched
    Dim Head As String               ' style name, e.g. "Heading 1"
    Dim Heading_txt As String
    Dim Heading_lvl As Long
    Dim Heading_lne As Long
    Dim Heading_pge As Long

    Set WDApp = Word.Application
    Set WDDoc = WDApp.ActiveDocument

    For Head_n = 1 To 5
        Head = "Heading " & Head_n
        WDApp.Selection.HomeKey wdStory, wdMove

        Do
            With WDApp.Selection
                .MoveStart Unit:=wdLine, Count:=1
                .Collapse Direction:=wdCollapseEnd
            End With
            With WDApp.Selection.Find
                .ClearFormatting
                .Text = ""
                .MatchWildcards = False
                .Forward = True
                ' Stop at the end of the document instead of wrapping around.
                .Wrap = wdFindStop
                .Style = WDDoc.Styles(Head)
                If .Execute = False Then GoTo Level_exit
                .ClearFormatting
            End With

            ' RemoveSpecialChar is a helper that is not shown in this answer.
            Heading_txt = RemoveSpecialChar(WDApp.Selection.Range.Text, 1)
            Debug.Print Heading_txt
            Heading_lvl = WDApp.Selection.Range.ListFormat.ListLevelNumber
            Debug.Print Heading_lvl
            ' Paragraph index of the heading, counted from the start of the document.
            Heading_lne = WDDoc.Range(0, WDApp.Selection.Range.End).Paragraphs.Count
            Debug.Print Heading_lne
            Heading_pge = WDApp.Selection.Information(wdActiveEndPageNumber)
            Debug.Print Heading_pge

            ' A Heading 1 match ends the search for this level.
            If WDApp.Selection.Style = "Heading 1" Then GoTo Level_exit
            WDApp.Selection.Collapse Direction:=wdCollapseStart
        Loop
Level_exit:
    Next Head_n

End Sub
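
RemoveSpecialChar is called in the macro, but its definition is not part of the answer, so the code will not compile as posted. As a stand-in, and purely as an assumption about what such a helper might do, the sketch below drops every control character (character codes below 32, which covers the paragraph mark, manual line breaks, tabs, and the end-of-cell marker) and trims the result. StripControlChars is a hypothetical name and takes only the text, unlike the two-argument call in the macro, whose second argument is not explained.

```vba
' Hypothetical stand-in for the RemoveSpecialChar helper that the
' answer calls but does not show.
Public Function StripControlChars(ByVal strText As String) As String
    Dim i As Long
    Dim strChar As String
    Dim strResult As String

    For i = 1 To Len(strText)
        strChar = Mid$(strText, i, 1)
        ' AscW can return negative values for code points above 32767,
        ' so only the range 0 to 31 is treated as control characters.
        If AscW(strChar) < 0 Or AscW(strChar) >= 32 Then
            strResult = strResult & strChar
        End If
    Next i

    StripControlChars = Trim$(strResult)
End Function
```

To try it with the macro, the call would become Heading_txt = StripControlChars(WDApp.Selection.Range.Text).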
ˇ宁静的妩媚 2024-07-14 22:08:01

Following Wiki's comment on VonC's answer, here is the code that worked for me. It makes the function faster.

Public Sub CopyHeadingsInNewDoc()
    Dim docOutline As Word.Document
    Dim docSource As Word.Document
    Dim rng As Word.Range

    Dim astrHeadings As Variant
    Dim strText As String
    Dim longLevel As Long
    Dim longItem As Long

    Set docSource = ActiveDocument
    Set docOutline = Documents.Add

    ' Content returns only the
    ' main body of the document, not
    ' the headers and footers.
    Set rng = docOutline.Content
    astrHeadings = _
     docSource.GetCrossReferenceItems(wdRefTypeHeading)

    For longItem = LBound(astrHeadings) To UBound(astrHeadings)
        ' Get the text and the level.
        strText = Trim$(astrHeadings(longItem))
        longLevel = GetLevel(CStr(astrHeadings(longItem)))

        ' Add the text to the document.
        rng.InsertAfter strText & vbNewLine

        ' Set the style of the selected range and
        ' then collapse the range for the next entry.
        rng.Style = "Heading " & longLevel
        rng.Collapse wdCollapseEnd
    Next longItem
End Sub

Private Function GetLevel(strItem As String) As Integer
    ' Return the heading level of an entry from the
    ' array returned by Word.

    ' The number of leading spaces indicates the
    ' outline level (2 spaces per level: H1 has
    ' 0 spaces, H2 has 2 spaces, H3 has 4 spaces).

    Dim strTemp As String
    Dim strOriginal As String
    Dim longDiff As Long

    ' Get rid of all trailing spaces.
    strOriginal = RTrim$(strItem)

    ' Trim leading spaces, and then compare with
    ' the original.
    strTemp = LTrim$(strOriginal)

    ' Subtract to find the number of
    ' leading spaces in the original string.
    longDiff = Len(strOriginal) - Len(strTemp)
    ' Integer division (\) keeps the result a whole level.
    GetLevel = (longDiff \ 2) + 1
End Function
葬シ愛 2024-07-14 22:08:01

Why reinvent the wheel so many times?

A "list of all headings" is just the standard Word table of contents!

This is what I got by recording a macro while adding a table of contents to the document:

Sub Macro1()
    ActiveDocument.TablesOfContents.Add Range:=Selection.Range, _
        RightAlignPageNumbers:=True, _
        UseHeadingStyles:=True, _
        UpperHeadingLevel:=1, _
        LowerHeadingLevel:=5, _
        IncludePageNumbers:=True, _
        AddedStyles:="", _
        UseHyperlinks:=True, _
        HidePageNumbersInWeb:=True, _
        UseOutlineLevels:=True
End Sub
笑看君怀她人 2024-07-14 22:08:01

You can also create a table of contents in the document and copy that. This separates the paragraph reference from the title, which is handy if you need to present it in another context.
If you do not want the ToC in your document, just delete it after the copy and paste. JK.
