DataView.RowFilter 与 DataTable.Select() 与 DataTable.Rows.Find()
考虑下面的代码:
Dataview someView = new DataView(sometable)
someView.RowFilter = someFilter;
if(someView.count > 0) { …. }
很多文章都说 Datatable.Select() 比使用 DataViews 更好,但这些都是 VS2008 之前的。
已解决: DataView 在处理大型记录集时性能不佳之谜
数组DataRecord 与 DataView:性能上的显着差异在谷歌
上搜索这个主题时,我发现一些文章/论坛主题提到 Datatable.Select() 本身有很多错误(对此不确定)并且在各种情况下表现不佳。
关于此(最佳实践 ADO .NET)msdn 上的主题建议如果数据表上定义了主键,则应使用 findrows() 或 find() 方法而不是 Datatable.Select()。
本文此处 (.NET 1.1) 对所有三种方法以及其他几种方法进行了基准测试。但这是针对 1.1 版本的,所以不确定这些现在是否仍然有效。根据此,DataRowCollection.Find() 优于所有方法,Datatable.Select() 优于 DataView.RowFilter。
因此,我对在数据表中查找行的最佳方法是什么感到非常困惑。或者没有单一的好方法可以做到这一点,根据情况存在多种解决方案?
Considering the code below:
Dataview someView = new DataView(sometable)
someView.RowFilter = someFilter;
if(someView.count > 0) { …. }
Quite a number of articles which say Datatable.Select() is better than using DataViews, but these are prior to VS2008.
Solved: The Mystery of DataView's Poor Performance with Large Recordsets
Array of DataRecord vs. DataView: A Dramatic Difference in Performance
Googling on this topic I found some articles/forum topics which mention Datatable.Select() itself is quite buggy(not sure on this) and underperforms in various scenarios.
On this(Best Practices ADO.NET) topic on msdn it is suggested that if there is primary key defined on a datatable the findrows() or find() methods should be used insted of Datatable.Select().
This article here (.NET 1.1) benchmarks all the three approaches plus a couple more. But this is for version 1.1 so not sure if these are valid still now. Accroding to this DataRowCollection.Find() outperforms all approaches and Datatable.Select() outperforms DataView.RowFilter.
So I am quite confused on what might be the best approach on finding rows in a datatable. Or there is no single good way to do this, multiple solutions exist depending upon the scenario?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
您正在寻找“在数据表中查找行的最佳方法”,所以我首先要问:“最佳”是什么?我认为,任何技术都有可能比其他技术更适合的场景。
首先,让我们看一下
DataView.RowFilter
:DataView 在数据绑定方面有一些优势。它非常面向视图,因此具有强大的排序、过滤或搜索功能,但会产生一些开销,并且未针对性能进行优化。对于较小的记录集和/或利用其他功能(例如,直接将数据绑定到视图),我会选择 DataView.RowFilter 。有关 DataView 的大多数事实(您可以在旧文章中阅读)仍然适用。
其次,如果您只需要一次点击,您应该更喜欢
DataTable.Rows.Find
而不是DataTable.Select
。为什么? DataTable.Rows.Find 仅返回一行。本质上,当您指定主键时,就会创建一棵二叉树。这会产生一些相关的开销,但会极大地加快检索速度。DataTable.Select 速度较慢,但如果您有多个条件并且不关心索引或未索引行,它会非常方便:它基本上可以找到所有内容,但未针对性能进行优化。本质上,DataTable.Select 必须遍历整个表并将每条记录与您传入的条件进行比较。
我希望您发现这个小概述很有帮助。
我建议看一下这篇文章< /strong>,这对我关于性能问题很有帮助。这篇文章包含其中的一些引述。
一点更新:
顺便说一句,这似乎有点超出您的问题范围,但它几乎总是在后端进行过滤和搜索的最快解决方案。如果您想要简单性并拥有 SQL Server 作为后端并在客户端上使用 .NET3+,请选择 LINQ-to-SQL。搜索 Linq 对象非常方便,并且会创建在服务器端执行的查询。虽然 LINQ-to-Objects 也是一种非常舒适但速度较慢的技术。如果您还不知道的话......
You are looking for the "best approach on finding rows in a datatable", so I first have to ask: "best" for what? I think, any technique has scenarios where it might fit better then the others.
First, let's look at
DataView.RowFilter
: A DataView has some advantages in Data Binding. Its very view-oriented so it has powerful sorting, filtering or searching features, but creates some overhead and is not optimized for performance. I would choose theDataView.RowFilter
for smaller recordsets and/or where you take advantage of the other features (like, a direct data binding to the view).Most facts about the DataView, which you can read in older posts, still apply.
Second, you should prefer
DataTable.Rows.Find
overDataTable.Select
if you want just a single hit. Why? DataTable.Rows.Find returns only a single row. Essentially, when you specify the primary key, a binary tree is created. This has some overhead associated with it, but tremendously speeds up the retrieval.DataTable.Select
is slower, but can come very handy if you have multiple criteria and don't care about indexed or unindexed rows: It can find basically everything but is not optimized for performance. Essentially, DataTable.Select has to walk the entire table and compare every record to the criteria that you passed in.I hope you find this little overview helpful.
I'd suggest to take a look at this article, it was helpful for me regarding performance questions. This post contains some quotes from it.
A little UPDATE:
By the way, this might seem a little out of scope of your question, but its nearly always the fastest solution to do the filtering and searching on the backend. If you want the simplicity and have an SQL Server as backend and .NET3+ on client, go for LINQ-to-SQL. Searching Linq objects is very comfortable and creates queries which are performed on server side. While LINQ-to-Objects is also a very comfortable but also slower technique. In case you didn't know already....
Thomashaid 的帖子总结得很好:
DataView.RowFilter
用于绑定。DataTable.Rows.Find
仅用于按主键搜索。DataTable.Select
用于按多列搜索,也用于指定顺序。避免在循环中创建许多 DataView 并使用它们的 RowFilter 来搜索记录。这将大大降低性能。
我想补充一点,
DataTable.Select
可以利用索引。您可以通过创建 DataView 并指定排序顺序来在 DataTable 上创建索引:然后,当您调用 DataTable.Select() 时,它可以在运行查询时使用该索引。我们使用这种技术来显着提高多次使用相同查询的地方的性能。 (请注意,这是在 Linq 存在之前。)
诀窍是为
Select
语句正确定义排序顺序。因此,如果您的查询是“Col1 = 1 and Col2 = 4”,那么您将需要“Col1, Col2”,如上例所示。请注意,索引创建可能取决于创建 DataView 的实际调用。我们必须使用
new DataView(DataTable dt)
构造函数,然后在单独的步骤中指定 Sort 属性。不同的 .NET 版本的行为可能会略有不同。Thomashaid's post sums it up nicely:
DataView.RowFilter
is for binding.DataTable.Rows.Find
is for searching by primary key only.DataTable.Select
is for searching by multiple columns and also for specifying an order.Avoid creating many DataViews in a loop and using their RowFilters to search for records. This will drastically reduce performance.
I wanted to add that
DataTable.Select
can take advantage of indexes. You can create an index on a DataTable by creating a DataView and specifying a sort order:Then, when you call
DataTable.Select()
, it can use this index when running the query. We have used this technique to seriously improve performance in places where we use the same query many, many times. (Note that this was before Linq existed.)The trick is to define the sort order correctly for the
Select
statement. So if your query is "Col1 = 1 and Col2 = 4", then you'll want "Col1, Col2" like in the example above.Note that the index creation may depend on the actual calls to create the DataView. We had to use the
new DataView(DataTable dt)
constructor, and then specify the Sort property in a separate step. The behavior may change slightly with different .NET versions.