FindAll 与Where 扩展方法
我只想知道“FindAll”是否比“Where”扩展方法更快,为什么?
示例:
myList.FindAll(item=> item.category == 5);
或
myList.Where(item=> item.category == 5);
哪个更好?
I just want know if a "FindAll" will be faster than a "Where" extentionMethod and why?
Example :
myList.FindAll(item=> item.category == 5);
or
myList.Where(item=> item.category == 5);
Which is better ?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
好吧,
FindAll
将匹配的元素复制到一个新列表,而Where
仅返回一个延迟计算的序列 - 不需要复制。因此,我希望
Where
比FindAll
稍快,即使结果序列已完全评估 - 当然还有Where
的惰性评估策略> 意味着如果您只查看(比如说)第一个匹配项,则不需要检查列表的其余部分。 (正如 Matthew 指出的,维护Where
的状态机是有工作的。但是,这只会有固定的内存成本 - 而构造一个新列表可能需要多个数组分配等。)基本上,< code>FindAll(predicate) 更接近
Where(predicate).ToList()
而不是Where(predicate)
。只是为了对马修的回答做出更多反应,我认为他没有足够彻底地测试它。他的谓词恰好选择了一半的项目。这是一个简短但完整的程序,它测试相同的列表,但使用三个不同的谓词 - 一个不选择任何项目,一个选择所有项目,一个选择其中的一半。在每种情况下,我都运行测试五十次以获得更长的时间。
我使用
Count()
来确保完全评估Where
结果。结果显示,收集了大约一半的结果,两者并驾齐驱。如果未收集到任何结果,FindAll
获胜。收集所有结果,Where
获胜。我发现这很有趣:随着找到越来越多的匹配项,所有解决方案都变得更慢:FindAll
有更多的复制要做,而Where
必须返回匹配的值而不是只是在MoveNext()
实现中循环。然而,FindAll
的速度比Where
慢,因此失去了早期的领先优势。非常有趣。结果:(
使用 /o+ /debug- 编译并从命令行运行,.NET 3.5。)
代码:
Well,
FindAll
copies the matching elements to a new list, whereasWhere
just returns a lazily evaluated sequence - no copying is required.I'd therefore expect
Where
to be slightly faster thanFindAll
even when the resulting sequence is fully evaluated - and of course the lazy evaluation strategy ofWhere
means that if you only look at (say) the first match, it won't need to check the remainder of the list. (As Matthew points out, there's work in maintaining the state machine forWhere
. However, this will only have a fixed memory cost - whereas constructing a new list may require multiple array allocations etc.)Basically,
FindAll(predicate)
is closer toWhere(predicate).ToList()
than to justWhere(predicate)
.Just to react a bit more to Matthew's answer, I don't think he's tested it quite thoroughly enough. His predicate happens to pick half the items. Here's a short but complete program which tests the same list but with three different predicates - one picks no items, one picks all the items, and one picks half of them. In each case I run the test fifty times to get longer timing.
I'm using
Count()
to make sure that theWhere
result is fully evaluated. The results show that collecting around half the results, the two are neck and neck. Collecting no results,FindAll
wins. Collecting all the results,Where
wins. I find this intriguing: all of the solutions become slower as more and more matches are found:FindAll
has more copying to do, andWhere
has to return the matched values instead of just looping within theMoveNext()
implementation. However,FindAll
gets slower faster thanWhere
does, so loses its early lead. Very interesting.Results:
(Compiled with /o+ /debug- and run from the command line, .NET 3.5.)
Code:
我们测试而不是猜测怎么样?很遗憾看到错误的答案被出来。
编辑:
即使我将上面的代码从
...转为...,
我也会得到这些结果
编辑2.0...
如果您想要一个子集的列表当前列出最快的方法是FindAll()。原因很简单。 FindAll 实例方法使用当前 List 上的索引器而不是枚举器状态机。 Where() 扩展方法是对使用枚举器的不同类的外部调用。如果从列表中的每个节点步进到下一个节点,则必须在幕后调用 MoveNext() 方法。正如您从上面的示例中看到的,使用索引条目创建新列表(指向原始项目,因此内存膨胀将最小)甚至只获取过滤项目的计数甚至更快。
现在,如果您要从枚举器提前中止,Where() 方法可能会更快。当然,如果您将早期中止逻辑移至 FindAll() 方法的谓词,您将再次使用索引器而不是枚举器。
现在还有其他原因需要使用Where() 语句(例如其他linq 方法、foreach 块等等),但问题是FindAll() 比Where() 更快。除非您不执行Where(),否则答案似乎是肯定的。 (在比较苹果与苹果时)
我并不是说不要使用 LINQ 或 .Where() 方法。它们使代码更易于阅读。问题是关于性能,而不是关于阅读和理解代码的难易程度。快速完成这项工作的最快方法是使用 for 块步进每个索引并根据需要执行任何逻辑(甚至提前退出)。 LINQ 之所以如此出色,是因为您可以使用它们获得复杂的表达式树和转换。但是使用 .Where() 方法中的迭代器必须执行大量代码才能找到内存状态机的路径,该状态机只是从列表中获取下一个索引。还应该注意的是,这个 .FindAll() 方法仅对实现它的对象(例如 Array 和 List)有用。
还有更多...
How about we test instead of guess? Shame to see the wrong answer get out.
Edit:
Even if I turn the above code from
... to ...
I get these results
Edit 2.0...
If you want a list of the subset of the current list the fastest method if the FindAll(). The reason for this is simple. The FindAll instance method uses the indexer on the current List instead of the enumerator state machine. The Where() extension method is an external call to a different class that uses the enumerator. If you step from each node in the list to the next node you will have to call the MoveNext() method under the covers. As you can see from the above examples it is even faster to use the index entries to create a new list (that is pointing to the original items, so memory bloat will be minimal) to even just get a count of the filtered items.
Now if you are going to early abort from the Enumerator the Where() method could be faster. Of course if you move the early abort logic to the predicate of the FindAll() method you will again be using the indexer instead of the enumerator.
Now there are other reasons to use the Where() statement (such as the other linq methods, foreach blocks and many more) but the question was is the FindAll() faster than Where(). And unless you don't execute the Where() the answer seems to be yes. (When comparing apples to apples)
I am not say don't use LINQ or the .Where() method. They make for code that is much simpler to read. The question was about performance and not about how easy you can read and understand the code. By fast the fastest way to do this work would be to use a for block stepping each index and doing any logic as you want (even early exits). The reason LINQ is so great is becasue of the complex expression trees and transformation you can get with them. But using the iterator from the .Where() method has to go though tons of code to find it's way to a in memory statemachine that is just getting the next index out of the List. It should also be noted that this .FindAll() method is only useful on objects that implmented it (such as Array and List.)
Yet more...
好吧,至少你可以尝试测量它。
静态
Where
方法是使用迭代器块(yield
关键字)实现的,这基本上意味着执行将被推迟。如果仅比较对这两个方法的调用,第一个方法会更慢,因为它立即意味着将迭代整个集合。但如果您包含所获得结果的完整迭代,情况可能会有所不同。我很确定
yield
解决方案速度较慢,因为它暗示了生成的状态机机制。 (参见@Matthew anwser)Well, at least you can try to measure it.
The static
Where
method is implemented using an iterator bloc (yield
keyword), which basically means that the execution will be deferred. If you only compare the calls to theses two methods, the first one will be slower, since it immediately implies that the whole collection will be iterated.But if you include the complete iteration of the results you get, things can be a bit different. I'm pretty sure the
yield
solution is slower, due to the generated state machine mechanism it implies. (see @Matthew anwser)我可以提供一些线索,但不确定哪一个更快。
FindAll() 立即执行。
其中()被推迟执行。
I can give some clue, but not sure which one faster.
FindAll() is executed right away.
Where() is defferred executed.
where的好处是延迟执行。如果您具有以下功能,请查看差异:
FindAll 已覆盖整个序列,而在大多数序列中,Where 会在找到一个元素后立即停止枚举。
使用 Any()、Take()、Skip() 等会产生相同的效果。我不确定,但我想您在所有具有延迟执行的函数中都会有巨大的优势
The advantage of where is the deferred execution. See the difference if you'd have the following functionality
FindAll has covered the complete sequene, while Where in most sequences will stop enumerating as soon as one element is found.
The same effects will be one using Any(), Take(), Skip(), etc. I'm not sure, but I guess you'll have huge advantages in all functions that have deferred execution