IEnumerable question: best performance?
Quick question:
Which one is faster?
foreach (Object obj in Collection)
{
if(obj.Mandatory){ ... }
}
or
foreach (Object obj in Collection.FindAll(o => o.Mandatory))
{
...
}
and if you know a faster suggestion, I'd be pleased to know.
Thank you
6 Answers
If your Collection is a List<T>, then FindAll is implemented by creating a new List<T> and copying all items that match the predicate. This is obviously slower than just enumerating the collection and deciding for each item whether the predicate holds. If you're using .NET 3.5 you can use LINQ, which will not create a copy and is similar to your first example:
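The original answer's code block was not preserved in this copy; a minimal sketch of the LINQ form it presumably showed (the Item type and Mandatory property stand in for the question's collection element):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Item
{
    public bool Mandatory { get; set; }  // stands in for the question's property
}

class Program
{
    static void Main()
    {
        var collection = new List<Item>
        {
            new Item { Mandatory = true },
            new Item { Mandatory = false },
            new Item { Mandatory = true }
        };

        // Where streams matching items lazily while the loop runs;
        // unlike FindAll, no temporary list is allocated.
        int count = 0;
        foreach (var obj in collection.Where(o => o.Mandatory))
        {
            count++;
        }
        Console.WriteLine(count); // prints 2
    }
}
```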
Note this isn't necessarily the fastest solution. It's just easy to see that a method that allocates memory and enumerates a collection is slower than a method that only enumerates a collection. If performance is critical: measure it.
The following test code prints the system ticks (1 tick = 100 nanoseconds) for iterating through 10 million objects. FindAll is the slowest and the for loop the fastest, as expected.
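The answerer's actual test code did not survive in this copy; a reconstruction of that kind of benchmark might look like the following (item count and names are illustrative, and Stopwatch.ElapsedTicks reports timer units rather than guaranteed 100 ns ticks):

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

class Item { public bool Mandatory; }

class Program
{
    static void Main()
    {
        const int N = 10_000_000;
        var list = new List<Item>(N);
        for (int i = 0; i < N; i++)
            list.Add(new Item { Mandatory = i % 2 == 0 });

        // Plain indexed for loop.
        var sw = Stopwatch.StartNew();
        int a = 0;
        for (int i = 0; i < list.Count; i++)
            if (list[i].Mandatory) a++;
        Console.WriteLine($"for:     {sw.ElapsedTicks} ticks");

        // foreach with an if inside the loop (first form in the question).
        sw.Restart();
        int b = 0;
        foreach (var o in list)
            if (o.Mandatory) b++;
        Console.WriteLine($"foreach: {sw.ElapsedTicks} ticks");

        // FindAll copies all matches into a new list before iterating.
        sw.Restart();
        int c = 0;
        foreach (var o in list.FindAll(x => x.Mandatory))
            c++;
        Console.WriteLine($"FindAll: {sw.ElapsedTicks} ticks");
    }
}
```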
But the overhead of the iteration is measured in nanoseconds per item even in the worst case. If you're doing anything significant in the loop (e.g. something which takes a microsecond per item), then the speed difference of the iteration is completely insignificant.
So for the love of Turing don't forbid foreach in your coding guidelines now. It doesn't make any practical difference, and the LINQ statements sure are easier to read.
The first will be somewhat faster.
In the second case, you're using List<T>.FindAll to create a temporary list that matches your criteria. This copies the list, then iterates over it.
However, you could accomplish the same thing, with the same speed as your first option, by doing:
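The code sample was lost in this copy; given the explanation that follows, it was presumably the streaming Enumerable.Where form, along these lines (the integer collection is illustrative):

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        int[] collection = { 1, 2, 3, 4 };

        // Where filters lazily as you iterate; no intermediate list is allocated.
        foreach (int n in collection.Where(x => x % 2 == 0))
            Console.WriteLine(n); // prints 2 then 4
    }
}
```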
This is because Enumerable.Where uses streaming to return an IEnumerable<T>, which is generated as you iterate. No copy is made.

The fastest you could ever get without parallelizing the enumeration into multiple threads, taking into account the number of processors, etc.:
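This answer's code block is also missing from this copy; given the claim, it was presumably a plain indexed for loop, something like:

```csharp
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        var list = new List<int> { 1, 2, 3, 4 };

        // Indexed access avoids the enumerator object and MoveNext calls
        // that foreach incurs on an IEnumerable<T>.
        int sum = 0;
        for (int i = 0; i < list.Count; i++)
            sum += list[i];
        Console.WriteLine(sum); // prints 10
    }
}
```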
I would recommend, though, that you always use Linq instead of writing for or foreach loops, because in the future it will become so intelligent that it will actually be able to distribute the work over processors and take hardware-specific things into account (see PLinq), and it will eventually be faster than if you wrote the loops yourself: declarative vs. imperative programming.
FindAll is just syntactic sugar. For example:
Compiles to:
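Both code samples from this answer were lost in this copy. The point can be sketched as follows: a FindAll call with a lambda (first method) behaves like an explicit loop that allocates a new list and copies the matches (second method); the helper name FilterMandatory is mine, not the answerer's:

```csharp
using System;
using System.Collections.Generic;

class Item { public bool Mandatory; }

class Program
{
    // What you write:
    static List<Item> WithFindAll(List<Item> items) =>
        items.FindAll(o => o.Mandatory);

    // Roughly what it does under the hood: allocate a new list, copy matches.
    static List<Item> FilterMandatory(List<Item> items)
    {
        var result = new List<Item>();
        foreach (var o in items)
            if (o.Mandatory)
                result.Add(o);
        return result;
    }

    static void Main()
    {
        var items = new List<Item>
        {
            new Item { Mandatory = true },
            new Item { Mandatory = false }
        };
        Console.WriteLine(WithFindAll(items).Count == FilterMandatory(items).Count); // prints True
    }
}
```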
If performance is in question, this probably is not the bottleneck; however, have you considered using the parallel library or PLINQ? See below:
http://msdn.microsoft.com/en-us/library/dd460688(v=vs.110).aspx
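A minimal PLINQ sketch of the filter from the question (AsParallel is the only change from the LINQ form; result ordering is not guaranteed unless AsOrdered is added, and the parallelism only pays off when the per-item work outweighs the partitioning overhead):

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        int[] data = Enumerable.Range(0, 1_000_000).ToArray();

        // PLINQ partitions the source across available cores.
        int evens = data.AsParallel().Where(n => n % 2 == 0).Count();
        Console.WriteLine(evens); // prints 500000
    }
}
```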
Also, although perhaps slightly unrelated, since performance seems to pique your curiosity: if you are dealing with very large sets of data, a binary search may be useful. In my case I have two separate lists of data. I have to deal with lists of millions of records, and this literally saved me an enormous amount of time per execution. The only downside is that it is ONLY useful for very large collections and requires the data to be sorted beforehand. You will also notice that this makes use of the ConcurrentDictionary class, which adds significant overhead, but it is thread safe and was required due to my requirements and the number of threads I am managing asynchronously.
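The answerer's code and the accompanying image are not reproduced in this copy. As a generic illustration of the idea only (not the answerer's actual two-list code), List<T>.BinarySearch on a pre-sorted list replaces a linear scan:

```csharp
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        // Binary search requires the list to be sorted beforehand.
        var sorted = new List<int> { 1, 3, 5, 7, 9 };

        // O(log n) lookup, vs O(n) for a linear scan with Contains.
        int index = sorted.BinarySearch(7);
        Console.WriteLine(index); // prints 3

        // A missing value returns the bitwise complement of the insertion point.
        Console.WriteLine(sorted.BinarySearch(4) < 0); // prints True
    }
}
```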
This image was originally posted in a great article found here: http://letsalgorithm.blogspot.com/2012/02/intersecting-two-sorted-integer-arrays.html
Hope this helps!