Finding distinct rows in a large DataTable
We currently have a large DataTable (~152K rows) and are running a For Each over it to find a subset of distinct entries (~124K rows). This takes about 14 minutes to run, which is far too long.
We are stuck on .NET 2.0 because our reporting won't work with VS 2008+, so I can't use LINQ, though in fairness I don't know whether that would be any faster.
Is there a better way to find the distinct rows (invoice numbers in this case) than this For Each loop?
This is the code:
    Public Shared Function SelectDistinctList(ByVal SourceTable As DataTable, _
                                              ByVal FieldName As String) As List(Of String)
        Dim list As New List(Of String)
        For Each row As DataRow In SourceTable.Rows
            Dim value As String = CStr(row(FieldName))
            If Not list.Contains(value) Then
                list.Add(value)
            End If
        Next
        Return list
    End Function
Using a Dictionary rather than a List will be quicker.
When you search a List, you're comparing each entry with value, so by the end of the process you're doing ~124K comparisons for each record. List.Contains is a linear scan, which makes the whole loop roughly quadratic in the number of rows. A Dictionary, on the other hand, uses hashing to make the lookups much quicker.
When you want to return the list of unique values, use seen.Keys, where seen is the dictionary.
(Note that you'd ideally use a Set type for this, but .NET 2.0 doesn't have one.)
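
The answer refers to seen.Keys, but the code that declared seen did not survive on this page. Below is a minimal sketch of what the Dictionary approach might look like. It keeps the question's function signature and assumes a Dictionary(Of String, Boolean) named seen with a dummy Boolean value; the DistinctHelper class name is hypothetical and only there to make the snippet self-contained.

```vb
Imports System.Collections.Generic
Imports System.Data

' Hypothetical container class; the original post does not show one.
Public Class DistinctHelper
    ' Same signature as the function in the question, but membership checks
    ' use a hash lookup (Dictionary.ContainsKey) instead of a linear scan
    ' (List.Contains).
    Public Shared Function SelectDistinctList(ByVal SourceTable As DataTable, _
                                              ByVal FieldName As String) As List(Of String)
        ' Assumption: the Boolean value is a placeholder; only the keys matter.
        Dim seen As New Dictionary(Of String, Boolean)
        For Each row As DataRow In SourceTable.Rows
            ' As in the original code, CStr fails on DBNull values.
            Dim value As String = CStr(row(FieldName))
            If Not seen.ContainsKey(value) Then
                seen.Add(value, True)
            End If
        Next
        ' Copy the keys into a List to match the original return type.
        Return New List(Of String)(seen.Keys)
    End Function
End Class
```

One caveat: the documentation does not guarantee the enumeration order of Dictionary keys. If callers rely on values appearing in first-seen order, as the original List version returned them, one option is to keep appending to a List as well and use the Dictionary only for the ContainsKey check.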