vba:从数组中获取唯一值

发布于 2024-09-05 00:05:11 字数 173 浏览 5 评论 0原文

中是否有任何内置功能可以获取一维数组中的唯一值?只删除重复项怎么样?

如果没有,那么我如何从数组中获取唯一值?

Is there any built-in functionality in to get unique values from a one-dimensional array? What about just getting rid of duplicates?

If not, then how would I get the unique values from an array?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(12

旧情别恋 2024-09-12 00:05:12

没有内置功能可以从数组中删除重复项。 Raj的回答看起来很优雅,但我更喜欢使用字典。

Dim d As Object
Set d = CreateObject("Scripting.Dictionary")
'Set d = New Scripting.Dictionary

Dim i As Long
For i = LBound(myArray) To UBound(myArray)
    d(myArray(i)) = 1
Next i

Dim v As Variant
For Each v In d.Keys()
    'd.Keys() is a Variant array of the unique values in myArray.
    'v will iterate through each of them.
Next v

编辑:我按照 Tomalak 的建议答案更改了循环以使用 LBoundUBound
编辑: d.Keys() 是一个 Variant 数组,而不是一个 Collection。

There's no built-in functionality to remove duplicates from arrays. Raj's answer seems elegant, but I prefer to use dictionaries.

Dim d As Object
Set d = CreateObject("Scripting.Dictionary")
'Set d = New Scripting.Dictionary

Dim i As Long
For i = LBound(myArray) To UBound(myArray)
    d(myArray(i)) = 1
Next i

Dim v As Variant
For Each v In d.Keys()
    'd.Keys() is a Variant array of the unique values in myArray.
    'v will iterate through each of them.
Next v

EDIT: I changed the loop to use LBound and UBound as per Tomalak's suggested answer.
EDIT: d.Keys() is a Variant array, not a Collection.

旧时光的容颜 2024-09-12 00:05:12

更新(2016 年 6 月 15 日)

我创建了更全面的基准测试。首先,正如 @ChaimG 指出的,早期绑定有很大的不同(我最初逐字使用上面的 @eksortso 的代码,它使用后期绑定)。其次,我原来的基准测试只包括创建唯一对象的时间,但是并没有测试使用该对象的效率。我这样做的目的是,如果我创建的对象很笨重并且减慢了我前进的速度,那么我是否可以非常快地创建一个对象并不重要。

旧评论:事实证明,对集合对象进行循环效率非常低

事实证明,如果您知道如何执行此操作(我不知道),则对集合进行循环可能会非常高效。正如@ChaimG(再次)在评论中指出的那样,使用 For Each 构造比简单地使用 For 循环优越得多。为了让您了解一下,在更改循环构造之前,Test Case Size = 10^6Collection2 的时间超过 1400 秒(即约 23 分钟)。现在仅为 0.195 秒(快了 7000 多倍)。

对于 Collection 方法有两次。第一个(我的原始基准 Collection1)显示创建唯一对象的时间。第二部分 (Collection2) 显示了循环对象(这是非常自然的)以创建可返回数组的时间,就像其他函数一样。

在下图中,黄色背景表示该测试用例最快,红色表示最慢(不包括“未测试”算法)。 Collection 方法的总时间是 Collection1Collection2 的总和。绿松石色表示无论原始顺序如何,它都是最快的。

Benchmarks5

下面是我创建的原始算法(我对其进行了轻微修改,例如我不再实例化我自己的数据类型)。它会在非常短的时间内返回具有原始顺序的数组的唯一值,并且可以对其进行修改以采用任何数据类型。除了 IndexMethod 之外,它是处理超大型数组的最快算法。

以下是该算法背后的主要思想:

  1. 对数组进行索引
  2. 按值排序
  3. 将相同的值放在数组的末尾,然后将它们“砍掉”。
  4. 最后按索引排序。

下面是一个示例:

Let myArray = (86, 100, 33, 19, 33, 703, 19, 100, 703, 19)

    1.  (86, 100, 33, 19, 33, 703, 19, 100, 703, 19)
        (1 ,   2,  3,  4,  5,   6,  7,   8,   9, 10)   <<-- Indexing

    2.  (19, 19, 19, 33, 33, 86, 100, 100, 703, 703)   <<-- sort by values     
        (4,   7, 10,  3,  5,  1,   2,   8,   6,   9)

    3.  (19, 33,  86, 100, 703)   <<-- remove duplicates    
        (4,   3,   1,   2,   6)

    4.  (86, 100,  33, 19, 703)   
        ( 1,   2,   3,  4,   6)   <<-- sort by index

这是代码:

Function SortingUniqueTest(ByRef myArray() As Long, bOrigIndex As Boolean) As Variant
    Dim MyUniqueArr() As Long, i As Long, intInd As Integer
    Dim StrtTime As Double, Endtime As Double, HighB As Long, LowB As Long

    LowB = LBound(myArray): HighB = UBound(myArray)

    ReDim MyUniqueArr(1 To 2, LowB To HighB)
    intInd = 1 - LowB  'Guarantees the indices span 1 to Lim

    For i = LowB To HighB
        MyUniqueArr(1, i) = myArray(i)
        MyUniqueArr(2, i) = i + intInd
    Next i

    QSLong2D MyUniqueArr, 1, LBound(MyUniqueArr, 2), UBound(MyUniqueArr, 2), 2
    Call UniqueArray2D(MyUniqueArr)
    If bOrigIndex Then QSLong2D MyUniqueArr, 2, LBound(MyUniqueArr, 2), UBound(MyUniqueArr, 2), 2

    SortingUniqueTest = MyUniqueArr()
End Function

Public Sub UniqueArray2D(ByRef myArray() As Long)
    Dim i As Long, j As Long, Count As Long, Count1 As Long, DuplicateArr() As Long
    Dim lngTemp As Long, HighB As Long, LowB As Long
    LowB = LBound(myArray, 2): Count = LowB: i = LowB: HighB = UBound(myArray, 2)

    Do While i < HighB
        j = i + 1
        If myArray(1, i) = myArray(1, j) Then
            Do While myArray(1, i) = myArray(1, j)
                ReDim Preserve DuplicateArr(1 To Count)
                DuplicateArr(Count) = j
                Count = Count + 1
                j = j + 1
                If j > HighB Then Exit Do
            Loop

            QSLong2D myArray, 2, i, j - 1, 2
        End If
        i = j
    Loop

    Count1 = HighB

    If Count > 1 Then
        For i = UBound(DuplicateArr) To LBound(DuplicateArr) Step -1
            myArray(1, DuplicateArr(i)) = myArray(1, Count1)
            myArray(2, DuplicateArr(i)) = myArray(2, Count1)
            Count1 = Count1 - 1
            ReDim Preserve myArray(1 To 2, LowB To Count1)
        Next i
    End If
End Sub

这是我使用的排序算法(有关此算法的更多信息此处)。

Sub QSLong2D(ByRef saArray() As Long, bytDim As Byte, lLow1 As Long, lHigh1 As Long, bytNum As Byte)
    Dim lLow2 As Long, lHigh2 As Long
    Dim sKey As Long, sSwap As Long, i As Byte

On Error GoTo ErrorExit

    If IsMissing(lLow1) Then lLow1 = LBound(saArray, bytDim)
    If IsMissing(lHigh1) Then lHigh1 = UBound(saArray, bytDim)
    lLow2 = lLow1
    lHigh2 = lHigh1

    sKey = saArray(bytDim, (lLow1 + lHigh1) \ 2)

    Do While lLow2 < lHigh2
        Do While saArray(bytDim, lLow2) < sKey And lLow2 < lHigh1: lLow2 = lLow2 + 1: Loop
        Do While saArray(bytDim, lHigh2) > sKey And lHigh2 > lLow1: lHigh2 = lHigh2 - 1: Loop

        If lLow2 < lHigh2 Then
            For i = 1 To bytNum
                sSwap = saArray(i, lLow2)
                saArray(i, lLow2) = saArray(i, lHigh2)
                saArray(i, lHigh2) = sSwap
            Next i
        End If

        If lLow2 <= lHigh2 Then
            lLow2 = lLow2 + 1
            lHigh2 = lHigh2 - 1
        End If
    Loop

    If lHigh2 > lLow1 Then QSLong2D saArray(), bytDim, lLow1, lHigh2, bytNum
    If lLow2 < lHigh1 Then QSLong2D saArray(), bytDim, lLow2, lHigh1, bytNum

ErrorExit:

End Sub

下面是一种特殊的算法,如果您的数据包含整数,该算法的速度会非常快。它利用索引和布尔数据类型。

Function IndexSort(ByRef myArray() As Long, bOrigIndex As Boolean) As Variant
'' Modified to take both positive and negative integers
    Dim arrVals() As Long, arrSort() As Long, arrBool() As Boolean
    Dim i As Long, HighB As Long, myMax As Long, myMin As Long, OffSet As Long
    Dim LowB As Long, myIndex As Long, count As Long, myRange As Long

    HighB = UBound(myArray)
    LowB = LBound(myArray)

    For i = LowB To HighB
        If myArray(i) > myMax Then myMax = myArray(i)
        If myArray(i) < myMin Then myMin = myArray(i)
    Next i

    OffSet = Abs(myMin)  '' Number that will be added to every element
                         '' to guarantee every index is non-negative

    If myMax > 0 Then
        myRange = myMax + OffSet  '' E.g. if myMax = 10 & myMin = -2, then myRange = 12
    Else
        myRange = OffSet
    End If

    If bOrigIndex Then
        ReDim arrSort(1 To 2, 1 To HighB)
        ReDim arrVals(1 To 2, 0 To myRange)
        ReDim arrBool(0 To myRange)

        For i = LowB To HighB
            myIndex = myArray(i) + OffSet
            arrBool(myIndex) = True
            arrVals(1, myIndex) = myArray(i)
            If arrVals(2, myIndex) = 0 Then arrVals(2, myIndex) = i
        Next i

        For i = 0 To myRange
            If arrBool(i) Then
                count = count + 1
                arrSort(1, count) = arrVals(1, i)
                arrSort(2, count) = arrVals(2, i)
            End If
        Next i

        QSLong2D arrSort, 2, 1, count, 2
        ReDim Preserve arrSort(1 To 2, 1 To count)
    Else
        ReDim arrSort(1 To HighB)
        ReDim arrVals(0 To myRange)
        ReDim arrBool(0 To myRange)

        For i = LowB To HighB
            myIndex = myArray(i) + OffSet
            arrBool(myIndex) = True
            arrVals(myIndex) = myArray(i)
        Next i

        For i = 0 To myRange
            If arrBool(i) Then
                count = count + 1
                arrSort(count) = arrVals(i)
            End If
        Next i

        ReDim Preserve arrSort(1 To count)
    End If

    ReDim arrVals(0)
    ReDim arrBool(0)

    IndexSort = arrSort
End Function

以下是集合(由@DocBrown)和字典(由@eksortso)函数。

Function CollectionTest(ByRef arrIn() As Long, Lim As Long) As Variant
    Dim arr As New Collection, a, i As Long, arrOut() As Variant, aFirstArray As Variant
    Dim StrtTime As Double, EndTime1 As Double, EndTime2 As Double, count As Long
On Error Resume Next

    ReDim arrOut(1 To UBound(arrIn))
    ReDim aFirstArray(1 To UBound(arrIn))

    StrtTime = Timer
    For i = 1 To UBound(arrIn): aFirstArray(i) = CStr(arrIn(i)): Next i '' Convert to string
    For Each a In aFirstArray               ''' This part is actually creating the unique set
        arr.Add a, a
    Next
    EndTime1 = Timer - StrtTime

    StrtTime = Timer         ''' This part is writing back to an array for return
    For Each a In arr: count = count + 1: arrOut(count) = a: Next a
    EndTime2 = Timer - StrtTime
    CollectionTest = Array(arrOut, EndTime1, EndTime2)
End Function

Function DictionaryTest(ByRef myArray() As Long, Lim As Long) As Variant
    Dim StrtTime As Double, Endtime As Double
    Dim d As Scripting.Dictionary, i As Long  '' Early Binding
    Set d = New Scripting.Dictionary
    For i = LBound(myArray) To UBound(myArray): d(myArray(i)) = 1: Next i
    DictionaryTest = d.Keys()
End Function

这是@IsraelHoletz 提供的直接方法。

Function ArrayUnique(ByRef aArrayIn() As Long) As Variant
    Dim aArrayOut() As Variant, bFlag As Boolean, vIn As Variant, vOut As Variant
    Dim i As Long, j As Long, k As Long
    ReDim aArrayOut(LBound(aArrayIn) To UBound(aArrayIn))
    i = LBound(aArrayIn)
    j = i

    For Each vIn In aArrayIn
        For k = j To i - 1
            If vIn = aArrayOut(k) Then bFlag = True: Exit For
        Next
        If Not bFlag Then aArrayOut(i) = vIn: i = i + 1
        bFlag = False
    Next

    If i <> UBound(aArrayIn) Then ReDim Preserve aArrayOut(LBound(aArrayIn) To i - 1)
    ArrayUnique = aArrayOut
End Function

Function DirectTest(ByRef aArray() As Long, Lim As Long) As Variant
    Dim aReturn() As Variant
    Dim StrtTime As Long, Endtime As Long, i As Long
    aReturn = ArrayUnique(aArray)
    DirectTest = aReturn
End Function

这是比较所有函数的基准函数。您应该注意,由于内存问题,最后两种情况的处理方式略有不同。另请注意,我没有测试 Test Case Size = 10,000,000Collection 方法。由于某种原因,它返回了不正确的结果并且行为异常(我猜测集合对象对可以放入其中的东西有限制。我进行了搜索,但找不到任何关于此的文献)。

Function UltimateTest(Lim As Long, bTestDirect As Boolean, bTestDictionary, bytCase As Byte) As Variant

    Dim dictionTest, collectTest, sortingTest1, indexTest1, directT '' all variants
    Dim arrTest() As Long, i As Long, bEquality As Boolean, SizeUnique As Long
    Dim myArray() As Long, StrtTime As Double, EndTime1 As Variant
    Dim EndTime2 As Double, EndTime3 As Variant, EndTime4 As Double
    Dim EndTime5 As Double, EndTime6 As Double, sortingTest2, indexTest2

    ReDim myArray(1 To Lim): Rnd (-2)   '' If you want to test negative numbers, 
    '' insert this to the left of CLng(Int(Lim... : (-1) ^ (Int(2 * Rnd())) *
    For i = LBound(myArray) To UBound(myArray): myArray(i) = CLng(Int(Lim * Rnd() + 1)): Next i
    arrTest = myArray

    If bytCase = 1 Then
        If bTestDictionary Then
            StrtTime = Timer: dictionTest = DictionaryTest(arrTest, Lim): EndTime1 = Timer - StrtTime
        Else
            EndTime1 = "Not Tested"
        End If

        arrTest = myArray
        collectTest = CollectionTest(arrTest, Lim)

        arrTest = myArray
        StrtTime = Timer: sortingTest1 = SortingUniqueTest(arrTest, True): EndTime2 = Timer - StrtTime
        SizeUnique = UBound(sortingTest1, 2)

        If bTestDirect Then
            arrTest = myArray: StrtTime = Timer: directT = DirectTest(arrTest, Lim): EndTime3 = Timer - StrtTime
        Else
            EndTime3 = "Not Tested"
        End If

        arrTest = myArray
        StrtTime = Timer: indexTest1 = IndexSort(arrTest, True): EndTime4 = Timer - StrtTime

        arrTest = myArray
        StrtTime = Timer: sortingTest2 = SortingUniqueTest(arrTest, False): EndTime5 = Timer - StrtTime

        arrTest = myArray
        StrtTime = Timer: indexTest2 = IndexSort(arrTest, False): EndTime6 = Timer - StrtTime

        bEquality = True
        For i = LBound(sortingTest1, 2) To UBound(sortingTest1, 2)
            If Not CLng(collectTest(0)(i)) = sortingTest1(1, i) Then
                bEquality = False
                Exit For
            End If
        Next i

        For i = LBound(dictionTest) To UBound(dictionTest)
            If Not dictionTest(i) = sortingTest1(1, i + 1) Then
                bEquality = False
                Exit For
            End If
        Next i

        For i = LBound(dictionTest) To UBound(dictionTest)
            If Not dictionTest(i) = indexTest1(1, i + 1) Then
                bEquality = False
                Exit For
            End If
        Next i

        If bTestDirect Then
            For i = LBound(dictionTest) To UBound(dictionTest)
                If Not dictionTest(i) = directT(i + 1) Then
                    bEquality = False
                    Exit For
                End If
            Next i
        End If

        UltimateTest = Array(bEquality, EndTime1, EndTime2, EndTime3, EndTime4, _
                        EndTime5, EndTime6, collectTest(1), collectTest(2), SizeUnique)
    ElseIf bytCase = 2 Then
        arrTest = myArray
        collectTest = CollectionTest(arrTest, Lim)
        UltimateTest = Array(collectTest(1), collectTest(2))
    ElseIf bytCase = 3 Then
        arrTest = myArray
        StrtTime = Timer: sortingTest1 = SortingUniqueTest(arrTest, True): EndTime2 = Timer - StrtTime
        SizeUnique = UBound(sortingTest1, 2)
        UltimateTest = Array(EndTime2, SizeUnique)
    ElseIf bytCase = 4 Then
        arrTest = myArray
        StrtTime = Timer: indexTest1 = IndexSort(arrTest, True): EndTime4 = Timer - StrtTime
        UltimateTest = EndTime4
    ElseIf bytCase = 5 Then
        arrTest = myArray
        StrtTime = Timer: sortingTest2 = SortingUniqueTest(arrTest, False): EndTime5 = Timer - StrtTime
        UltimateTest = EndTime5
    ElseIf bytCase = 6 Then
        arrTest = myArray
        StrtTime = Timer: indexTest2 = IndexSort(arrTest, False): EndTime6 = Timer - StrtTime
        UltimateTest = EndTime6
    End If

End Function

最后,这是生成上表的子程序。

Sub GetBenchmarks()
    Dim myVar, i As Long, TestCases As Variant, j As Long, temp

    TestCases = Array(1000, 5000, 10000, 20000, 50000, 100000, 200000, 500000, 1000000, 2000000, 5000000, 10000000)

    For j = 0 To 11
        If j < 6 Then
            myVar = UltimateTest(CLng(TestCases(j)), True, True, 1)
        ElseIf j < 10 Then
            myVar = UltimateTest(CLng(TestCases(j)), False, True, 1)
        ElseIf j < 11 Then
            myVar = Array("Not Tested", "Not Tested", 0.1, "Not Tested", 0.1, 0.1, 0.1, 0, 0, 0)
            temp = UltimateTest(CLng(TestCases(j)), False, False, 2)
            myVar(7) = temp(0): myVar(8) = temp(1)
            temp = UltimateTest(CLng(TestCases(j)), False, False, 3)
            myVar(2) = temp(0): myVar(9) = temp(1)
            myVar(4) = UltimateTest(CLng(TestCases(j)), False, False, 4)
            myVar(5) = UltimateTest(CLng(TestCases(j)), False, False, 5)
            myVar(6) = UltimateTest(CLng(TestCases(j)), False, False, 6)
        Else
            myVar = Array("Not Tested", "Not Tested", 0.1, "Not Tested", 0.1, 0.1, 0.1, "Not Tested", "Not Tested", 0)
            temp = UltimateTest(CLng(TestCases(j)), False, False, 3)
            myVar(2) = temp(0): myVar(9) = temp(1)
            myVar(4) = UltimateTest(CLng(TestCases(j)), False, False, 4)
            myVar(5) = UltimateTest(CLng(TestCases(j)), False, False, 5)
            myVar(6) = UltimateTest(CLng(TestCases(j)), False, False, 6)
        End If

        Cells(4 + j, 6) = TestCases(j)
        For i = 1 To 9: Cells(4 + j, 6 + i) = myVar(i - 1): Next i
        Cells(4 + j, 17) = myVar(9)
    Next j
End Sub

总结
从结果表中,我们可以看到 Dictionary 方法对于小于 500,000 的情况非常有效,但是,在那之后,IndexMethod 真正开始占据主导地位。您会注意到,当顺序无关紧要并且数据由正整数组成时,与 IndexMethod 算法没有比较(它从包含 1000 万个元素的数组中返回唯一值,时间更短)不到 1 秒!!!)。下面我详细介绍了在各种情况下首选哪种算法。

案例1
您的数据包含整数(即整数,包括正数和负数):IndexMethod

情况 2
您的数据包含元素少于 200000 个的非整数(即变体、双精度、字符串等):字典方法

情况 3
您的数据包含超过 200000 个元素的非整数(即变体、双精度、字符串等): Collection Method

如果您必须选择一种算法,在我看来,Collection< /code> 方法仍然是最好的,因为它只需要几行代码,它非常通用,而且足够快。

Update (6/15/16)

I have created much more thorough benchmarks. First of all, as @ChaimG pointed out, early binding makes a big difference (I originally used @eksortso's code above verbatim which uses late binding). Secondly, my original benchmarks only included the time to create the unique object, however, it did not test the efficiency of using the object. My point in doing this is, it doesn't really matter if I can create an object really fast if the object I create is clunky and slows me down moving forward.

Old Remark: It turns out, that looping over a collection object is highly inefficient

It turns out that looping over a collection can be quite efficient if you know how to do it (I didn't). As @ChaimG (yet again), pointed out in the comments, using a For Each construct is ridiculously superior to simply using a For loop. To give you an idea, before changing the loop construct, the time for Collection2 for the Test Case Size = 10^6 was over 1400s (i.e. ~23 minutes). It is now a meager 0.195s (over 7000x faster).

For the Collection method there are two times. The first (my original benchmark Collection1) show the time to create the unique object. The second part (Collection2) shows the time to loop over the object (which is very natural) to create a returnable array as the other functions do.

In the chart below, a yellow background indicates that it was the fastest for that test case, and red indicates the slowest ("Not Tested" algorithms are excluded). The total time for the Collection method is the sum of Collection1 and Collection2. Turquoise indicates that is was the fastest regardless of original order.

Benchmarks5

Below is the original algorithm I created (I have modified it slightly e.g. I no longer instantiate my own data type). It returns the unique values of an array with the original order in a very respectable time and it can be modified to take on any data type. Outside of the IndexMethod, it is the fastest algorithm for very large arrays.

Here are the main ideas behind this algorithm:

  1. Index the array
  2. Sort by values
  3. Place identical values at the end of the array and subsequently "chop" them off.
  4. Finally, sort by index.

Below is an example:

Let myArray = (86, 100, 33, 19, 33, 703, 19, 100, 703, 19)

    1.  (86, 100, 33, 19, 33, 703, 19, 100, 703, 19)
        (1 ,   2,  3,  4,  5,   6,  7,   8,   9, 10)   <<-- Indexing

    2.  (19, 19, 19, 33, 33, 86, 100, 100, 703, 703)   <<-- sort by values     
        (4,   7, 10,  3,  5,  1,   2,   8,   6,   9)

    3.  (19, 33,  86, 100, 703)   <<-- remove duplicates    
        (4,   3,   1,   2,   6)

    4.  (86, 100,  33, 19, 703)   
        ( 1,   2,   3,  4,   6)   <<-- sort by index

Here is the code:

Function SortingUniqueTest(ByRef myArray() As Long, bOrigIndex As Boolean) As Variant
    Dim MyUniqueArr() As Long, i As Long, intInd As Integer
    Dim StrtTime As Double, Endtime As Double, HighB As Long, LowB As Long

    LowB = LBound(myArray): HighB = UBound(myArray)

    ReDim MyUniqueArr(1 To 2, LowB To HighB)
    intInd = 1 - LowB  'Guarantees the indices span 1 to Lim

    For i = LowB To HighB
        MyUniqueArr(1, i) = myArray(i)
        MyUniqueArr(2, i) = i + intInd
    Next i

    QSLong2D MyUniqueArr, 1, LBound(MyUniqueArr, 2), UBound(MyUniqueArr, 2), 2
    Call UniqueArray2D(MyUniqueArr)
    If bOrigIndex Then QSLong2D MyUniqueArr, 2, LBound(MyUniqueArr, 2), UBound(MyUniqueArr, 2), 2

    SortingUniqueTest = MyUniqueArr()
End Function

Public Sub UniqueArray2D(ByRef myArray() As Long)
    Dim i As Long, j As Long, Count As Long, Count1 As Long, DuplicateArr() As Long
    Dim lngTemp As Long, HighB As Long, LowB As Long
    LowB = LBound(myArray, 2): Count = LowB: i = LowB: HighB = UBound(myArray, 2)

    Do While i < HighB
        j = i + 1
        If myArray(1, i) = myArray(1, j) Then
            Do While myArray(1, i) = myArray(1, j)
                ReDim Preserve DuplicateArr(1 To Count)
                DuplicateArr(Count) = j
                Count = Count + 1
                j = j + 1
                If j > HighB Then Exit Do
            Loop

            QSLong2D myArray, 2, i, j - 1, 2
        End If
        i = j
    Loop

    Count1 = HighB

    If Count > 1 Then
        For i = UBound(DuplicateArr) To LBound(DuplicateArr) Step -1
            myArray(1, DuplicateArr(i)) = myArray(1, Count1)
            myArray(2, DuplicateArr(i)) = myArray(2, Count1)
            Count1 = Count1 - 1
            ReDim Preserve myArray(1 To 2, LowB To Count1)
        Next i
    End If
End Sub

Here is the sorting algorithm I use (more about this algo here).

Sub QSLong2D(ByRef saArray() As Long, bytDim As Byte, lLow1 As Long, lHigh1 As Long, bytNum As Byte)
    Dim lLow2 As Long, lHigh2 As Long
    Dim sKey As Long, sSwap As Long, i As Byte

On Error GoTo ErrorExit

    If IsMissing(lLow1) Then lLow1 = LBound(saArray, bytDim)
    If IsMissing(lHigh1) Then lHigh1 = UBound(saArray, bytDim)
    lLow2 = lLow1
    lHigh2 = lHigh1

    sKey = saArray(bytDim, (lLow1 + lHigh1) \ 2)

    Do While lLow2 < lHigh2
        Do While saArray(bytDim, lLow2) < sKey And lLow2 < lHigh1: lLow2 = lLow2 + 1: Loop
        Do While saArray(bytDim, lHigh2) > sKey And lHigh2 > lLow1: lHigh2 = lHigh2 - 1: Loop

        If lLow2 < lHigh2 Then
            For i = 1 To bytNum
                sSwap = saArray(i, lLow2)
                saArray(i, lLow2) = saArray(i, lHigh2)
                saArray(i, lHigh2) = sSwap
            Next i
        End If

        If lLow2 <= lHigh2 Then
            lLow2 = lLow2 + 1
            lHigh2 = lHigh2 - 1
        End If
    Loop

    If lHigh2 > lLow1 Then QSLong2D saArray(), bytDim, lLow1, lHigh2, bytNum
    If lLow2 < lHigh1 Then QSLong2D saArray(), bytDim, lLow2, lHigh1, bytNum

ErrorExit:

End Sub

Below is a special algorithm that is blazing fast if your data contains integers. It makes use of indexing and the Boolean data type.

Function IndexSort(ByRef myArray() As Long, bOrigIndex As Boolean) As Variant
'' Modified to take both positive and negative integers
    Dim arrVals() As Long, arrSort() As Long, arrBool() As Boolean
    Dim i As Long, HighB As Long, myMax As Long, myMin As Long, OffSet As Long
    Dim LowB As Long, myIndex As Long, count As Long, myRange As Long

    HighB = UBound(myArray)
    LowB = LBound(myArray)

    For i = LowB To HighB
        If myArray(i) > myMax Then myMax = myArray(i)
        If myArray(i) < myMin Then myMin = myArray(i)
    Next i

    OffSet = Abs(myMin)  '' Number that will be added to every element
                         '' to guarantee every index is non-negative

    If myMax > 0 Then
        myRange = myMax + OffSet  '' E.g. if myMax = 10 & myMin = -2, then myRange = 12
    Else
        myRange = OffSet
    End If

    If bOrigIndex Then
        ReDim arrSort(1 To 2, 1 To HighB)
        ReDim arrVals(1 To 2, 0 To myRange)
        ReDim arrBool(0 To myRange)

        For i = LowB To HighB
            myIndex = myArray(i) + OffSet
            arrBool(myIndex) = True
            arrVals(1, myIndex) = myArray(i)
            If arrVals(2, myIndex) = 0 Then arrVals(2, myIndex) = i
        Next i

        For i = 0 To myRange
            If arrBool(i) Then
                count = count + 1
                arrSort(1, count) = arrVals(1, i)
                arrSort(2, count) = arrVals(2, i)
            End If
        Next i

        QSLong2D arrSort, 2, 1, count, 2
        ReDim Preserve arrSort(1 To 2, 1 To count)
    Else
        ReDim arrSort(1 To HighB)
        ReDim arrVals(0 To myRange)
        ReDim arrBool(0 To myRange)

        For i = LowB To HighB
            myIndex = myArray(i) + OffSet
            arrBool(myIndex) = True
            arrVals(myIndex) = myArray(i)
        Next i

        For i = 0 To myRange
            If arrBool(i) Then
                count = count + 1
                arrSort(count) = arrVals(i)
            End If
        Next i

        ReDim Preserve arrSort(1 To count)
    End If

    ReDim arrVals(0)
    ReDim arrBool(0)

    IndexSort = arrSort
End Function

Here are the Collection (by @DocBrown) and Dictionary (by @eksortso) Functions.

Function CollectionTest(ByRef arrIn() As Long, Lim As Long) As Variant
    Dim arr As New Collection, a, i As Long, arrOut() As Variant, aFirstArray As Variant
    Dim StrtTime As Double, EndTime1 As Double, EndTime2 As Double, count As Long
On Error Resume Next

    ReDim arrOut(1 To UBound(arrIn))
    ReDim aFirstArray(1 To UBound(arrIn))

    StrtTime = Timer
    For i = 1 To UBound(arrIn): aFirstArray(i) = CStr(arrIn(i)): Next i '' Convert to string
    For Each a In aFirstArray               ''' This part is actually creating the unique set
        arr.Add a, a
    Next
    EndTime1 = Timer - StrtTime

    StrtTime = Timer         ''' This part is writing back to an array for return
    For Each a In arr: count = count + 1: arrOut(count) = a: Next a
    EndTime2 = Timer - StrtTime
    CollectionTest = Array(arrOut, EndTime1, EndTime2)
End Function

Function DictionaryTest(ByRef myArray() As Long, Lim As Long) As Variant
    Dim StrtTime As Double, Endtime As Double
    Dim d As Scripting.Dictionary, i As Long  '' Early Binding
    Set d = New Scripting.Dictionary
    For i = LBound(myArray) To UBound(myArray): d(myArray(i)) = 1: Next i
    DictionaryTest = d.Keys()
End Function

Here is the Direct approach provided by @IsraelHoletz.

Function ArrayUnique(ByRef aArrayIn() As Long) As Variant
    Dim aArrayOut() As Variant, bFlag As Boolean, vIn As Variant, vOut As Variant
    Dim i As Long, j As Long, k As Long
    ReDim aArrayOut(LBound(aArrayIn) To UBound(aArrayIn))
    i = LBound(aArrayIn)
    j = i

    For Each vIn In aArrayIn
        For k = j To i - 1
            If vIn = aArrayOut(k) Then bFlag = True: Exit For
        Next
        If Not bFlag Then aArrayOut(i) = vIn: i = i + 1
        bFlag = False
    Next

    If i <> UBound(aArrayIn) Then ReDim Preserve aArrayOut(LBound(aArrayIn) To i - 1)
    ArrayUnique = aArrayOut
End Function

Function DirectTest(ByRef aArray() As Long, Lim As Long) As Variant
    Dim aReturn() As Variant
    Dim StrtTime As Long, Endtime As Long, i As Long
    aReturn = ArrayUnique(aArray)
    DirectTest = aReturn
End Function

Here is the benchmark function that compares all of the functions. You should note that the last two cases are handled a little bit different because of memory issues. Also note, that I didn't test the Collection method for the Test Case Size = 10,000,000. For some reason, it was returning incorrect results and behaving unusual (I'm guessing the collection object has a limit on how many things you can put in it. I searched and I couldn't find any literature on this).

Function UltimateTest(Lim As Long, bTestDirect As Boolean, bTestDictionary, bytCase As Byte) As Variant

    Dim dictionTest, collectTest, sortingTest1, indexTest1, directT '' all variants
    Dim arrTest() As Long, i As Long, bEquality As Boolean, SizeUnique As Long
    Dim myArray() As Long, StrtTime As Double, EndTime1 As Variant
    Dim EndTime2 As Double, EndTime3 As Variant, EndTime4 As Double
    Dim EndTime5 As Double, EndTime6 As Double, sortingTest2, indexTest2

    ReDim myArray(1 To Lim): Rnd (-2)   '' If you want to test negative numbers, 
    '' insert this to the left of CLng(Int(Lim... : (-1) ^ (Int(2 * Rnd())) *
    For i = LBound(myArray) To UBound(myArray): myArray(i) = CLng(Int(Lim * Rnd() + 1)): Next i
    arrTest = myArray

    If bytCase = 1 Then
        If bTestDictionary Then
            StrtTime = Timer: dictionTest = DictionaryTest(arrTest, Lim): EndTime1 = Timer - StrtTime
        Else
            EndTime1 = "Not Tested"
        End If

        arrTest = myArray
        collectTest = CollectionTest(arrTest, Lim)

        arrTest = myArray
        StrtTime = Timer: sortingTest1 = SortingUniqueTest(arrTest, True): EndTime2 = Timer - StrtTime
        SizeUnique = UBound(sortingTest1, 2)

        If bTestDirect Then
            arrTest = myArray: StrtTime = Timer: directT = DirectTest(arrTest, Lim): EndTime3 = Timer - StrtTime
        Else
            EndTime3 = "Not Tested"
        End If

        arrTest = myArray
        StrtTime = Timer: indexTest1 = IndexSort(arrTest, True): EndTime4 = Timer - StrtTime

        arrTest = myArray
        StrtTime = Timer: sortingTest2 = SortingUniqueTest(arrTest, False): EndTime5 = Timer - StrtTime

        arrTest = myArray
        StrtTime = Timer: indexTest2 = IndexSort(arrTest, False): EndTime6 = Timer - StrtTime

        bEquality = True
        For i = LBound(sortingTest1, 2) To UBound(sortingTest1, 2)
            If Not CLng(collectTest(0)(i)) = sortingTest1(1, i) Then
                bEquality = False
                Exit For
            End If
        Next i

        For i = LBound(dictionTest) To UBound(dictionTest)
            If Not dictionTest(i) = sortingTest1(1, i + 1) Then
                bEquality = False
                Exit For
            End If
        Next i

        For i = LBound(dictionTest) To UBound(dictionTest)
            If Not dictionTest(i) = indexTest1(1, i + 1) Then
                bEquality = False
                Exit For
            End If
        Next i

        If bTestDirect Then
            For i = LBound(dictionTest) To UBound(dictionTest)
                If Not dictionTest(i) = directT(i + 1) Then
                    bEquality = False
                    Exit For
                End If
            Next i
        End If

        UltimateTest = Array(bEquality, EndTime1, EndTime2, EndTime3, EndTime4, _
                        EndTime5, EndTime6, collectTest(1), collectTest(2), SizeUnique)
    ElseIf bytCase = 2 Then
        arrTest = myArray
        collectTest = CollectionTest(arrTest, Lim)
        UltimateTest = Array(collectTest(1), collectTest(2))
    ElseIf bytCase = 3 Then
        arrTest = myArray
        StrtTime = Timer: sortingTest1 = SortingUniqueTest(arrTest, True): EndTime2 = Timer - StrtTime
        SizeUnique = UBound(sortingTest1, 2)
        UltimateTest = Array(EndTime2, SizeUnique)
    ElseIf bytCase = 4 Then
        arrTest = myArray
        StrtTime = Timer: indexTest1 = IndexSort(arrTest, True): EndTime4 = Timer - StrtTime
        UltimateTest = EndTime4
    ElseIf bytCase = 5 Then
        arrTest = myArray
        StrtTime = Timer: sortingTest2 = SortingUniqueTest(arrTest, False): EndTime5 = Timer - StrtTime
        UltimateTest = EndTime5
    ElseIf bytCase = 6 Then
        arrTest = myArray
        StrtTime = Timer: indexTest2 = IndexSort(arrTest, False): EndTime6 = Timer - StrtTime
        UltimateTest = EndTime6
    End If

End Function

And finally, here is the sub that produces the table above.

Sub GetBenchmarks()
    Dim myVar, i As Long, TestCases As Variant, j As Long, temp

    TestCases = Array(1000, 5000, 10000, 20000, 50000, 100000, 200000, 500000, 1000000, 2000000, 5000000, 10000000)

    For j = 0 To 11
        If j < 6 Then
            myVar = UltimateTest(CLng(TestCases(j)), True, True, 1)
        ElseIf j < 10 Then
            myVar = UltimateTest(CLng(TestCases(j)), False, True, 1)
        ElseIf j < 11 Then
            myVar = Array("Not Tested", "Not Tested", 0.1, "Not Tested", 0.1, 0.1, 0.1, 0, 0, 0)
            temp = UltimateTest(CLng(TestCases(j)), False, False, 2)
            myVar(7) = temp(0): myVar(8) = temp(1)
            temp = UltimateTest(CLng(TestCases(j)), False, False, 3)
            myVar(2) = temp(0): myVar(9) = temp(1)
            myVar(4) = UltimateTest(CLng(TestCases(j)), False, False, 4)
            myVar(5) = UltimateTest(CLng(TestCases(j)), False, False, 5)
            myVar(6) = UltimateTest(CLng(TestCases(j)), False, False, 6)
        Else
            myVar = Array("Not Tested", "Not Tested", 0.1, "Not Tested", 0.1, 0.1, 0.1, "Not Tested", "Not Tested", 0)
            temp = UltimateTest(CLng(TestCases(j)), False, False, 3)
            myVar(2) = temp(0): myVar(9) = temp(1)
            myVar(4) = UltimateTest(CLng(TestCases(j)), False, False, 4)
            myVar(5) = UltimateTest(CLng(TestCases(j)), False, False, 5)
            myVar(6) = UltimateTest(CLng(TestCases(j)), False, False, 6)
        End If

        Cells(4 + j, 6) = TestCases(j)
        For i = 1 To 9: Cells(4 + j, 6 + i) = myVar(i - 1): Next i
        Cells(4 + j, 17) = myVar(9)
    Next j
End Sub

Summary
From the table of results, we can see that the Dictionary method works really well for cases less than about 500,000, however, after that, the IndexMethod really starts to dominate. You will notice that when order doesn't matter and your data is made up of positive integers, there is no comparison to the IndexMethod algorithm (it returns the unique values from an array containing 10 million elements in less than 1 sec!!! Incredible!). Below I have a breakdown of which algorithm is preferred in various cases.

Case 1
Your Data contains integers (i.e. whole numbers, both positive and negative): IndexMethod

Case 2
Your Data contains non-integers (i.e. variant, double, string, etc.) with less than 200000 elements: Dictionary Method

Case 3
Your Data contains non-integers (i.e. variant, double, string, etc.) with more than 200000 elements: Collection Method

If you had to choose one algorithm, in my opinion, the Collection method is still the best as it only requires a few lines of code, it's super general, and it's fast enough.

萤火眠眠 2024-09-12 00:05:12

从 365 开始是这样的。您可以使用 UNIQUE,但需要先对其进行 TRANSPOSE,除非您的输入是 RANGE。并排序以获取乐趣。

Option Explicit

Sub testIt()

    Dim arr() As Variant
    
    arr = [{1,2,3,4,1,2}]
    uniquify arr
    
    arr = Array( _
        "Banana", "Apple", "Orange", "Tomato", "Apple", _
        "Lemon", "Lime", "Lime", "Apple" _
    )
    uniquify arr
    
    arr = [{"a", "b", "a", "c", "a", "b", "a"}]
    uniquify arr
    
End Sub

Sub uniquify(arr As Variant)
    
    Dim buffer() As Variant, b As Variant
    
    buffer = WorksheetFunction.Sort( _
        WorksheetFunction.Unique( _
            WorksheetFunction.Transpose(arr) _
        ) _
    )
    
    For Each b In buffer
        Debug.Print b
        Next
    
End Sub


As of 365 it does. You can use UNIQUE, but you'll need to TRANSPOSE it first unless your input is a RANGE. And SORT for fun.

Option Explicit

Sub testIt()

    Dim arr() As Variant
    
    arr = [{1,2,3,4,1,2}]
    uniquify arr
    
    arr = Array( _
        "Banana", "Apple", "Orange", "Tomato", "Apple", _
        "Lemon", "Lime", "Lime", "Apple" _
    )
    uniquify arr
    
    arr = [{"a", "b", "a", "c", "a", "b", "a"}]
    uniquify arr
    
End Sub

Sub uniquify(arr As Variant)
    
    Dim buffer() As Variant, b As Variant
    
    buffer = WorksheetFunction.Sort( _
        WorksheetFunction.Unique( _
            WorksheetFunction.Transpose(arr) _
        ) _
    )
    
    For Each b In buffer
        Debug.Print b
        Next
    
End Sub


茶花眉 2024-09-12 00:05:12

我不知道VBA中有任何内置功能。最好的方法是使用以值作为键的集合,并且仅在值不存在时才添加到集合中。

I don't know of any built-in functionality in VBA. The best would be to use a collection using the value as key and only add to it if a value doesn't exist.

凑诗 2024-09-12 00:05:12

不,没有任何内置的东西。自己动手:

  • 实例化一个 Scripting.Dictionary 对象
  • 在数组上编写一个 For 循环(请务必使用 LBound()UBound() 而不是从 0 到 x 循环!)
  • 在每次迭代中,检查字典上的 Exists()。将每个数组值(尚不存在)添加为字典的键(使用 CStr() 因为键必须是字符串 正如我刚刚了解到的,键可以是 Scripting.Dictionary 中的任何类型),还将数组值本身存储到字典中。
  • 完成后,使用 Keys() (或 Items())将字典的所有值作为新的、现在唯一的数组返回。
  • 在我的测试中,字典保留所有添加值的原始顺序,因此输出将像输入一样排序。不过,我不确定这是否有记录且可靠。

No, nothing built-in. Do it yourself:

  • Instantiate a Scripting.Dictionary object
  • Write a For loop over your array (be sure to use LBound() and UBound() instead of looping from 0 to x!)
  • On each iteration, check Exists() on the dictionary. Add every array value (that doesn't already exist) as a key to the dictionary (use CStr() since keys must be strings as I've just learned, keys can be of any type in a Scripting.Dictionary), also store the array value itself into the dictionary.
  • When done, use Keys() (or Items()) to return all values of the dictionary as a new, now unique array.
  • In my tests, the Dictionary keeps original order of all added values, so the output will be ordered like the input was. I'm not sure if this is documented and reliable behavior, though.
情丝乱 2024-09-12 00:05:12

使用 stdVBA (一个主要由我自己维护的库),您可以使用:

uniqueValues = stdEnumerator.CreateFromArray(myArray).Unique().AsArray()

注释:

您还可以获得某些Collection /任何实现IEnumVARIANT的对象的Unique

uniqueValues = stdEnumerator.CreateFromIEnumVARIANT(myCollection).Unique().AsCollection()

您还可以获得由某个对象的属性唯一

uniqueValues = stdEnumerator.CreateFromIEnumVARIANT(ThisWorkbook.Sheets).Unique(stdLambda("$1.range(""A1"").value")).AsCollection()

With stdVBA (a library largely maintained by myself) you can use:

uniqueValues = stdEnumerator.CreateFromArray(myArray).Unique().AsArray()

Notes:

You can also get Unique for some Collection / any object which implements IEnumVARIANT:

uniqueValues = stdEnumerator.CreateFromIEnumVARIANT(myCollection).Unique().AsCollection()

You can also get Unique by some object's property:

uniqueValues = stdEnumerator.CreateFromIEnumVARIANT(ThisWorkbook.Sheets).Unique(stdLambda("$1.range(""A1"").value")).AsCollection()
花开柳相依 2024-09-12 00:05:12

不,VBA 没有此功能。您可以使用将每个项目作为键添加到集合中的技术。由于集合不允许重复的键,因此结果是不同的值,您可以根据需要将其复制到数组中。

您可能还想要更强大的东西。请参阅 不同值函数,网址为 http://www.cpearson.com/ excel/distinctvalues.aspx

独特值函数

将返回一个 VBA 函数
a 中不同值的数组
输入值的范围或数组。

Excel有一些手动方法,例如
高级过滤器,用于获取列表
与输入范围不同的项目。
使用此类方法的缺点是
您必须手动刷新
当输入数据改变时的结果。
此外,这些方法仅适用于
范围,而不是值数组,并且,不是
是函数,不能被调用
工作表单元格或合并到
数组公式。本页描述了一个
称为 DistinctValues 的 VBA 函数
接受范围作为输入
或数据数组并返回其
结果一个数组包含
与输入列表不同的项目。
也就是说,所有的元素
重复项已删除。的顺序
输入元素被保留。订单
输出数组中的元素是
与输入中的顺序相同
价值观。该函数可以被调用
从数组输入范围
工作表(请参阅本页了解
有关数组公式的信息),或
来自单个数组公式
工作表单元格,或来自另一个 VB
功能。

No, VBA does not have this functionality. You can use the technique of adding each item to a collection using the item as the key. Since a collection does not allow duplicate keys, the result is distinct values that you can copy to an array, if needed.

You may also want something more robust. See Distinct Values Function at http://www.cpearson.com/excel/distinctvalues.aspx

Distinct Values Function

A VBA Function that will return an
array of the distinct values in a
range or array of input values.

Excel has some manual methods, such as
Advanced Filter, for getting a list of
distinct items from an input range.
The drawback of using such methods is
that you must manually refresh the
results when the input data changes.
Moreover, these methods work only with
ranges, not arrays of values, and, not
being functions, cannot be called from
worksheet cells or incorporated into
array formulas. This page describes a
VBA function called DistinctValues
that accepts as input either a range
or an array of data and returns as its
result an array containing the
distinct items from the input list.
That is, the elements with all
duplicates removed. The order of the
input elements is preserved. The order
of the elements in the output array is
the same as the order in the input
values. The function can be called
from an array entered range on a
worksheet (see this page for
information about array formulas), or
from in an array formula in a single
worksheet cell, or from another VB
function.

昔日梦未散 2024-09-12 00:05:12

对于简短的方法来说,集合和字典解决方案都很好,很出色,但是如果您想要速度,请尝试使用更直接的方法:

Function ArrayUnique(ByVal aArrayIn As Variant) As Variant
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' ArrayUnique
' This function removes duplicated values from a single dimension array
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Dim aArrayOut() As Variant
Dim bFlag As Boolean
Dim vIn As Variant
Dim vOut As Variant
Dim i%, j%, k%

ReDim aArrayOut(LBound(aArrayIn) To UBound(aArrayIn))
i = LBound(aArrayIn)
j = i

For Each vIn In aArrayIn
    For k = j To i - 1
        If vIn = aArrayOut(k) Then bFlag = True: Exit For
    Next
    If Not bFlag Then aArrayOut(i) = vIn: i = i + 1
    bFlag = False
Next

If i <> UBound(aArrayIn) Then ReDim Preserve aArrayOut(LBound(aArrayIn) To i - 1)
ArrayUnique = aArrayOut
End Function

调用它:

Sub Test()
Dim aReturn As Variant
Dim aArray As Variant

aArray = Array(1, 2, 3, 1, 2, 3, "Test", "Test")
aReturn = ArrayUnique(aArray)
End Sub

对于速度比较,这将比字典解决方案快 100 倍到 130 倍,大约快 8000 倍比集合版快 13000 倍。

The Collection and Dictionary solutions are all nice and shine for a short approach, but if you want speed try using a more direct approach:

Function ArrayUnique(ByVal aArrayIn As Variant) As Variant
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' ArrayUnique
' This function removes duplicated values from a single dimension array
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Dim aArrayOut() As Variant
Dim bFlag As Boolean
Dim vIn As Variant
Dim vOut As Variant
Dim i%, j%, k%

ReDim aArrayOut(LBound(aArrayIn) To UBound(aArrayIn))
i = LBound(aArrayIn)
j = i

For Each vIn In aArrayIn
    For k = j To i - 1
        If vIn = aArrayOut(k) Then bFlag = True: Exit For
    Next
    If Not bFlag Then aArrayOut(i) = vIn: i = i + 1
    bFlag = False
Next

If i <> UBound(aArrayIn) Then ReDim Preserve aArrayOut(LBound(aArrayIn) To i - 1)
ArrayUnique = aArrayOut
End Function

Calling it:

Sub Test()
Dim aReturn As Variant
Dim aArray As Variant

aArray = Array(1, 2, 3, 1, 2, 3, "Test", "Test")
aReturn = ArrayUnique(aArray)
End Sub

For speed comparasion, this will be 100x to 130x faster then the dictionary solution, and about 8000x to 13000x faster than the collection one.

那支青花 2024-09-12 00:05:12

没有用于从数组中删除重复项的 VBA 内置功能,但是您可以使用下一个函数:

Function RemoveDuplicates(MyArray As Variant) As Variant
    With CreateObject("scripting.dictionary")
        For Each item In MyArray
            c00 = .Item(item)
        Next
        sn = .keys ' the array .keys contains all unique keys
        MsgBox Join(.keys, vbLf) ' you can join the array into a string
        RemoveDuplicates = .keys ' return an array without duplicates
    End With
End Function

There is no VBA built in functionality for removing duplicates from an array, however you could use the next function:

Function RemoveDuplicates(MyArray As Variant) As Variant
    With CreateObject("scripting.dictionary")
        For Each item In MyArray
            c00 = .Item(item)
        Next
        sn = .keys ' the array .keys contains all unique keys
        MsgBox Join(.keys, vbLf) ' you can join the array into a string
        RemoveDuplicates = .keys ' return an array without duplicates
    End With
End Function
小鸟爱天空丶 2024-09-12 00:05:12

如果去重数组的顺序对你来说不重要,你可以使用我的实用函数:

Function DeDupArray(ia() As String)
  Dim newa() As String
  ReDim newa(999)
  ni = -1
  For n = LBound(ia) To UBound(ia)
    dup = False
    If n <= UBound(ia) Then
      For k = n + 1 To UBound(ia)
        If ia(k) = ia(n) Then dup = True
      Next k

      If dup = False And Trim(ia(n)) <> "" Then
        ni = ni + 1
        newa(ni) = ia(n)
      End If
    End If
  Next n

  If ni > -1 Then
    ReDim Preserve newa(ni)
  Else
    ReDim Preserve newa(1)
  End If

  DeDupArray = newa
End Function



Sub testdedup()
Dim m(5) As String
Dim m2() As String

m(0) = "Horse"
m(1) = "Cow"
m(2) = "Dear"
m(3) = "Horse"
m(4) = "Joke"
m(5) = "Cow"

m2 = DeDupArray(m)
t = ""
For n = LBound(m2) To UBound(m2)
  t = t & n & "=" & m2(n) & " "
Next n
MsgBox t
End Sub

从测试函数中,它将产生以下去重数组:

“0=Dear 1=Horse 2=Joke 3=Cow”

If the order of the deduplicated array does not matter to you, you can use my pragmatic function:

Function DeDupArray(ia() As String)
  Dim newa() As String
  ReDim newa(999)
  ni = -1
  For n = LBound(ia) To UBound(ia)
    dup = False
    If n <= UBound(ia) Then
      For k = n + 1 To UBound(ia)
        If ia(k) = ia(n) Then dup = True
      Next k

      If dup = False And Trim(ia(n)) <> "" Then
        ni = ni + 1
        newa(ni) = ia(n)
      End If
    End If
  Next n

  If ni > -1 Then
    ReDim Preserve newa(ni)
  Else
    ReDim Preserve newa(1)
  End If

  DeDupArray = newa
End Function



Sub testdedup()
Dim m(5) As String
Dim m2() As String

m(0) = "Horse"
m(1) = "Cow"
m(2) = "Dear"
m(3) = "Horse"
m(4) = "Joke"
m(5) = "Cow"

m2 = DeDupArray(m)
t = ""
For n = LBound(m2) To UBound(m2)
  t = t & n & "=" & m2(n) & " "
Next n
MsgBox t
End Sub

From the test function, it will result in the following deduplicated array:

"0=Dear 1=Horse 2=Joke 3=Cow "

韬韬不绝 2024-09-12 00:05:12

我出于自己的目的而寻找这个,但不喜欢使用循环遍历每个元素、集合/字典或工作表函数。我制作了这个解决方案,仅循环遍历数组以获得相同数量的唯一项目。不过,我还没有运行任何基准测试来测试其与此处解决方案的速度。

我的看法:

Sub GetUniqueItems()
  Dim aFirstArray() As Variant
  Dim TempArray As Variant
  Dim UniqueItem As Variant
  Dim NewList As String
  Dim UniqueArray As Variant
 
  aFirstArray = Array("Banana", "Apple", "Orange", "Tomato", "Apple", _
  "Lemon", "Lime", "Lime", "Apple")
 
 
    TempArray = aFirstArray
    'This was required as the Filter() function caused a Type Mismatch on the original array
    'when assigning the array back to itself. If aFirstArray is defined without the () it would work
    
    Do While UBound(TempArray) >= 0
       UniqueItem = TempArray(0) 'Retrieve the first element for filtering
       NewList = NewList & UniqueItem 'Place element in a string
       TempArray = Filter(TempArray, UniqueItem, False) 'remove UniqueItem from the array
       If UBound(TempArray) >= 0 Then NewList = NewList & "," 'Append a delimiter between elements
    Loop
    
    UniqueArray = Split(NewList, ",") 'Split the List using the delimter to create a new array
    
    For Each UniqueItem In UniqueArray
        Debug.Print UniqueItem
    Next UniqueItem
    
 
End Sub

I was looking for this for my own purposes but didn't like the use of looping through each element, or collections/dictionaries, or the worksheet function. I made this solution that only loops through the array for the same number of unique items. I haven't run any benchmarks to test its speed against the solutions here though.

My take:

Sub GetUniqueItems()
  Dim aFirstArray() As Variant
  Dim TempArray As Variant
  Dim UniqueItem As Variant
  Dim NewList As String
  Dim UniqueArray As Variant
 
  aFirstArray = Array("Banana", "Apple", "Orange", "Tomato", "Apple", _
  "Lemon", "Lime", "Lime", "Apple")
 
 
    TempArray = aFirstArray
    'This was required as the Filter() function caused a Type Mismatch on the original array
    'when assigning the array back to itself. If aFirstArray is defined without the () it would work
    
    Do While UBound(TempArray) >= 0
       UniqueItem = TempArray(0) 'Retrieve the first element for filtering
       NewList = NewList & UniqueItem 'Place element in a string
       TempArray = Filter(TempArray, UniqueItem, False) 'remove UniqueItem from the array
       If UBound(TempArray) >= 0 Then NewList = NewList & "," 'Append a delimiter between elements
    Loop
    
    UniqueArray = Split(NewList, ",") 'Split the List using the delimter to create a new array
    
    For Each UniqueItem In UniqueArray
        Debug.Print UniqueItem
    Next UniqueItem
    
 
End Sub
终止放荡 2024-09-12 00:05:11

这篇文章包含 2 个示例。我喜欢第二个:

Sub unique() 
  Dim arr As New Collection, a 
  Dim aFirstArray() As Variant 
  Dim i As Long 
 
  aFirstArray() = Array("Banana", "Apple", "Orange", "Tomato", "Apple", _ 
  "Lemon", "Lime", "Lime", "Apple") 
 
  On Error Resume Next 
  For Each a In aFirstArray 
     arr.Add a, a 
  Next
  On Error Goto 0 ' added to original example by PEH
 
  For i = 1 To arr.Count 
     Cells(i, 1) = arr(i) 
  Next 
 
End Sub 

This post contains 2 examples. I like the 2nd one:

Sub unique() 
  Dim arr As New Collection, a 
  Dim aFirstArray() As Variant 
  Dim i As Long 
 
  aFirstArray() = Array("Banana", "Apple", "Orange", "Tomato", "Apple", _ 
  "Lemon", "Lime", "Lime", "Apple") 
 
  On Error Resume Next 
  For Each a In aFirstArray 
     arr.Add a, a 
  Next
  On Error Goto 0 ' added to original example by PEH
 
  For i = 1 To arr.Count 
     Cells(i, 1) = arr(i) 
  Next 
 
End Sub 
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文