Finding distinct rows in a large DataTable

Posted 2024-09-26 03:40:57

We currently have a large DataTable (~152k rows) and loop over it with a For Each to find the subset of distinct entries (~124k rows). This takes about 14 minutes to run, which is far too long.

We are stuck on .NET 2.0 because our reporting won't work with VS 2008+, so I can't use LINQ, though in fairness I don't know whether that would be any faster.

Is there a better way to find the distinct rows (invoice numbers in this case) than this For Each loop?

This is the code:

Public Shared Function SelectDistinctList(ByVal SourceTable As DataTable, _
                                          ByVal FieldName As String) As List(Of String)
    Dim list As New List(Of String)
    For Each row As DataRow In SourceTable.Rows
        Dim value As String = CStr(row(FieldName))
        If Not list.Contains(value) Then
            list.Add(value)
        End If
    Next
    Return list

End Function

Comments (1)

浅听莫相离 2024-10-03 03:40:57

Using a Dictionary rather than a List will be quicker:

    Dim seen As New Dictionary(Of String, String)
    ...
        If Not seen.ContainsKey(value) Then
            seen.Add(value, "")
        End If

When you search a List, Contains compares value with each entry in turn, so by the end of the process you're doing up to ~124K comparisons for each record. A Dictionary, on the other hand, uses hashing, so each lookup takes roughly constant time regardless of how many values it already holds.

When you want to return the list of unique values, use seen.Keys.
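Putting the pieces above together, here is a minimal sketch of the question's function rewritten around a Dictionary. The class name DistinctHelper is hypothetical, added only so the snippet compiles on its own. As a small variation on using seen.Keys directly, it also keeps a parallel List so the result comes back in first-seen order without relying on the enumeration order of the Dictionary. It uses only types that exist in .NET 2.0, but treat it as a sketch rather than tested production code.

```vbnet
Imports System.Collections.Generic
Imports System.Data

' Hypothetical container class; only here to make the sketch self-contained.
Public Class DistinctHelper
    ' Returns the distinct values of FieldName in the order they first appear.
    Public Shared Function SelectDistinctList(ByVal SourceTable As DataTable, _
                                              ByVal FieldName As String) As List(Of String)
        ' The Dictionary acts as a set: only its keys matter, the value is a dummy.
        Dim seen As New Dictionary(Of String, String)
        Dim result As New List(Of String)
        For Each row As DataRow In SourceTable.Rows
            ' Like the original code, this throws if the column holds DBNull.
            Dim value As String = CStr(row(FieldName))
            ' Hash lookup instead of the linear scan done by List.Contains.
            If Not seen.ContainsKey(value) Then
                seen.Add(value, "")
                result.Add(value)
            End If
        Next
        Return result
    End Function
End Class
```

The signature matches the original SelectDistinctList, so existing callers should not need to change. The extra List costs some memory, but it keeps the return type and ordering the same as the loop in the question.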

(Note that you'd ideally use a Set type for this, but .NET 2.0 doesn't have one.)
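For readers who are not tied to .NET 2.0: later framework versions (3.5 onward) add HashSet(Of T), which is the kind of Set type the note above has in mind. The following is a sketch of the same function on that assumption; it will not compile against .NET 2.0, and the class name DistinctHelperHashSet is hypothetical.

```vbnet
Imports System.Collections.Generic
Imports System.Data

' Hypothetical container class; requires .NET 3.5 or later for HashSet(Of T).
Public Class DistinctHelperHashSet
    ' Returns the distinct values of FieldName in the order they first appear.
    Public Shared Function SelectDistinctList(ByVal SourceTable As DataTable, _
                                              ByVal FieldName As String) As List(Of String)
        Dim seen As New HashSet(Of String)
        Dim result As New List(Of String)
        For Each row As DataRow In SourceTable.Rows
            Dim value As String = CStr(row(FieldName))
            ' HashSet.Add returns False when the value is already present,
            ' so one call both tests membership and records the value.
            If seen.Add(value) Then
                result.Add(value)
            End If
        Next
        Return result
    End Function
End Class
```

Compared with the Dictionary approach, this drops the dummy value and the separate ContainsKey call, but the lookup cost is essentially the same.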
