Write the clearest code you can, and then benchmark and profile to discover any performance problems. If you do have performance problems, you can experiment with different code to work out whether it's faster or not (measuring all the time with as realistic data as possible) and then make a judgement call as to whether the improvement in performance is worth the readability hit.
A direct foreach approach will be faster than LINQ in many cases. For example, consider:
var query = from element in list
            where element.X > 2
            where element.Y < 2
            select element.X + element.Y;
foreach (var value in query)
{
    Console.WriteLine(value);
}
Now there are two where clauses and a select clause, so every eventual item has to pass through three iterators. (Obviously the two where clauses could be combined in this case, but I'm making a general point.)
Now compare it with the direct code:
foreach (var element in list)
{
    if (element.X > 2 && element.Y < 2)
    {
        Console.WriteLine(element.X + element.Y);
    }
}
That will run faster, because it has fewer hoops to jump through. Chances are that the console output will dwarf the iterator cost though, and I'd certainly prefer the LINQ query.
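To make the "benchmark it" advice concrete, here is a rough timing sketch rather than a rigorous benchmark. It assumes C# 9 top-level statements, uses made-up data (one million hypothetical (X, Y) tuples) and sums the results instead of printing them so that console output does not swamp the measurement. A single Stopwatch run like this is noisy; treat the numbers as a hint only and measure with your own data.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Hypothetical data: 1,000,000 (X, Y) tuples with values in 0..4.
var list = Enumerable.Range(0, 1_000_000)
    .Select(i => (X: i % 5, Y: (i / 5) % 5))
    .ToList();

// LINQ version: two Where iterators and one Select iterator.
var sw = Stopwatch.StartNew();
long linqSum = 0;
var query = from element in list
            where element.X > 2
            where element.Y < 2
            select element.X + element.Y;
foreach (var value in query)
{
    linqSum += value;
}
sw.Stop();
var linqTime = sw.Elapsed;

// Direct version: one loop, one combined condition.
sw.Restart();
long loopSum = 0;
foreach (var element in list)
{
    if (element.X > 2 && element.Y < 2)
    {
        loopSum += element.X + element.Y;
    }
}
sw.Stop();
var loopTime = sw.Elapsed;

// Both versions must agree on the result; only the timings differ.
Console.WriteLine($"LINQ: sum={linqSum}, {linqTime.TotalMilliseconds:F1} ms");
Console.WriteLine($"Loop: sum={loopSum}, {loopTime.TotalMilliseconds:F1} ms");
```

For anything you intend to act on, run each version several times after a warm-up pass, in a Release build, before drawing conclusions.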
EDIT: To answer about "nested foreach" loops... typically those are represented with SelectMany or a second from clause:
var query = from item in firstSequence
            from nestedItem in item.NestedItems
            select item.BaseCount + nestedItem.NestedCount;
Here we're only adding a single extra iterator, because we'd already be using an extra iterator per item in the first sequence due to the nested foreach loop. There's still a bit of overhead, including the overhead of doing the projection in a delegate instead of "inline" (something I didn't mention before) but it still won't be very different to the nested-foreach performance.
This is not to say you can't shoot yourself in the foot with LINQ, of course. You can write stupendously inefficient queries if you don't engage your brain first - but that's far from unique to LINQ...
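As a small illustration of the SelectMany point, the sketch below writes the same flattening three ways: a query with a second from clause, an explicit SelectMany call, and nested foreach loops. The data is hypothetical, and for brevity the nested items are plain ints rather than objects with a NestedCount property. It assumes C# 9 top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data: each item has a BaseCount and some nested counts.
var firstSequence = new List<(int BaseCount, int[] NestedCounts)>
{
    (10, new[] { 1, 2 }),
    (20, new[] { 3 }),
    (30, new int[0]),
};

// Query syntax with a second from clause.
var query = from item in firstSequence
            from nestedCount in item.NestedCounts
            select item.BaseCount + nestedCount;

// The same thing written as an explicit SelectMany call.
var viaSelectMany = firstSequence.SelectMany(
    item => item.NestedCounts,
    (item, nestedCount) => item.BaseCount + nestedCount);

// The nested foreach equivalent.
var viaLoops = new List<int>();
foreach (var item in firstSequence)
{
    foreach (var nestedCount in item.NestedCounts)
    {
        viaLoops.Add(item.BaseCount + nestedCount);
    }
}

Console.WriteLine(string.Join(", ", query));         // 11, 12, 23
Console.WriteLine(string.Join(", ", viaSelectMany)); // 11, 12, 23
Console.WriteLine(string.Join(", ", viaLoops));      // 11, 12, 23
```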
If you do
foreach (Customer c in Customer)
{
    foreach (Order o in Orders)
    {
        // do something with c and o
    }
}
You will perform Customer.Count * Orders.Count iterations.
If you do this instead:
var query =
    from c in Customer
    join o in Orders on c.CustomerID equals o.CustomerID
    select new { c, o };
foreach (var x in query)
{
    // do something with x.c and x.o
}
You will perform roughly Customer.Count + Orders.Count iterations (plus one step per matching pair), because Enumerable.Join is implemented as a hash join.
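To show where the two iteration counts come from, here is a sketch that counts the steps in each approach. The hash join is hand-rolled with a Dictionary purely for illustration; it is not the library's code, just the same idea (index one side once, then probe it once per item on the other side). The data and the names customers, orders, nestedComparisons and hashSteps are all hypothetical, and the code assumes C# 9 top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data: 100 customers and 1,000 orders (10 per customer).
var customers = Enumerable.Range(1, 100)
    .Select(id => (CustomerID: id, Name: $"Customer {id}"))
    .ToList();
var orders = Enumerable.Range(1, 1000)
    .Select(id => (OrderID: id, CustomerID: (id % 100) + 1))
    .ToList();

// Nested loops: every customer is compared against every order.
int nestedComparisons = 0, nestedMatches = 0;
foreach (var c in customers)
{
    foreach (var o in orders)
    {
        nestedComparisons++;
        if (c.CustomerID == o.CustomerID) nestedMatches++;
    }
}

// Hand-rolled hash join: one pass to index the orders by key...
int hashSteps = 0, hashMatches = 0;
var ordersByCustomer = new Dictionary<int, List<(int OrderID, int CustomerID)>>();
foreach (var o in orders)
{
    hashSteps++;
    if (!ordersByCustomer.TryGetValue(o.CustomerID, out var bucket))
    {
        bucket = new List<(int OrderID, int CustomerID)>();
        ordersByCustomer[o.CustomerID] = bucket;
    }
    bucket.Add(o);
}
// ...and one pass over the customers to probe the index.
foreach (var c in customers)
{
    hashSteps++;
    if (ordersByCustomer.TryGetValue(c.CustomerID, out var bucket))
    {
        hashMatches += bucket.Count;
    }
}

// The LINQ join produces the same number of pairs.
var query = from c in customers
            join o in orders on c.CustomerID equals o.CustomerID
            select (c, o);
int joinMatches = query.Count();

Console.WriteLine($"Nested loops: {nestedComparisons} comparisons, {nestedMatches} matches");
Console.WriteLine($"Hash join:    {hashSteps} steps, {hashMatches} matches");
Console.WriteLine($"LINQ join:    {joinMatches} matches");
```

With this data the nested loops make 100,000 comparisons while the hash approach takes 1,100 steps, and both find the same 1,000 pairs.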
It is more complex than that. Ultimately, much of LINQ-to-Objects is (behind the scenes) a foreach loop, but with the added overhead of a little abstraction / iterator blocks / etc. However, unless you do very different things in your two versions (foreach vs LINQ), they should both be O(N).
The real question is: is there a better way of writing your specific algorithm, one that would make a plain foreach the inefficient choice? And can LINQ do it for you?
For example, LINQ makes it easy to hash / group / sort data.
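As a quick sketch of the group / sort case, the snippet below totals some hypothetical orders per customer and sorts the totals in descending order. The data and names are made up for illustration, and the code assumes C# 9 top-level statements.

```csharp
using System;
using System.Linq;

// Hypothetical orders: (CustomerID, Total).
var orders = new[]
{
    (CustomerID: 2, Total: 40),
    (CustomerID: 1, Total: 15),
    (CustomerID: 2, Total: 10),
    (CustomerID: 3, Total: 25),
    (CustomerID: 1, Total: 5),
};

// Group by customer (hash-based), sum each group, then sort by total.
var totals = orders
    .GroupBy(order => order.CustomerID)
    .Select(g => (CustomerID: g.Key, Total: g.Sum(order => order.Total)))
    .OrderByDescending(entry => entry.Total)
    .ToList();

foreach (var entry in totals)
{
    Console.WriteLine($"{entry.CustomerID}: {entry.Total}");
}
// Prints:
// 2: 50
// 3: 25
// 1: 20
```

Writing the same thing with foreach means managing the dictionary and the sort yourself, which is where a hand-written version tends to end up either longer or accidentally quadratic.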
It's been said before, but it merits repeating.
Developers never know where the performance bottleneck is until they run performance tests.
The same is true for comparing technique A to technique B. Unless there is a dramatic difference, you just have to test it. It might be obvious if you have an O(n) vs O(n^x) scenario, but since the LINQ stuff is mostly compiler witchcraft, it merits profiling.
Besides, unless your project is in production and you have profiled the code and found that that loop is slowing down your execution, leave it as whichever is your preference for readability and maintenance. Premature optimization is the devil.
A great benefit is that using LINQ to Objects queries gives you the ability to easily turn the query over to PLINQ and have the system automatically perform the operation on the correct number of threads for the current system.
If you are using this technique on big datasets, that can easily become a big win for very little trouble.
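Here is a minimal sketch of what handing a query over to PLINQ can look like. IsPrime is a hypothetical, deliberately CPU-bound helper written for this example; the only PLINQ-specific part is the AsParallel() call. Whether it is actually faster depends on how expensive the per-item work is, so measure it. The code assumes C# 9 top-level statements.

```csharp
using System;
using System.Linq;

var numbers = Enumerable.Range(1, 1_000_000).ToArray();

// Hypothetical CPU-bound predicate: naive trial division.
static bool IsPrime(int n)
{
    if (n < 2) return false;
    for (int d = 2; (long)d * d <= n; d++)
    {
        if (n % d == 0) return false;
    }
    return true;
}

// Sequential LINQ to Objects.
int sequentialCount = numbers.Count(IsPrime);

// The same query handed to PLINQ: only AsParallel() is added.
int parallelCount = numbers.AsParallel().Count(IsPrime);

Console.WriteLine($"Sequential: {sequentialCount}"); // Sequential: 78498
Console.WriteLine($"Parallel:   {parallelCount}");   // Parallel:   78498
```

Note that AsParallel() does not preserve the source order by default; if a query's results depend on order, add AsOrdered() and expect some of the speed-up to be given back.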