IEnumerable question: best performance?

Posted on 2024-08-11 11:10:30

Quick question:

Which one is faster?

foreach (Object obj in Collection)
{
     if(obj.Mandatory){ ... }
}

or

foreach (Object obj in Collection.FindAll(o => o.Mandatory))
{
...
}

If you know of a faster approach, I'd be pleased to hear it.

Thank you

Comments (6)

我的奇迹 2024-08-18 11:10:30

If your Collection is a List<T> then FindAll is implemented by creating a new List<T> and copying all items that match the predicate. This is obviously slower than just enumerating the collection and deciding for each item if the predicate holds.

If you're using .NET 3.5 you can use LINQ, which will not create a copy and is similar to your first example:

foreach (object obj in someCollection.Where(o => o.Mandatory))
{
    ...
}

Note this isn't necessarily the fastest solution. It's just easy to see that a method that allocates memory and enumerates a collection is slower than a method that only enumerates a collection. If performance is critical: measure it.
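To make the copy-versus-no-copy difference visible, here is a minimal sketch. The `Item` class and the `EagerVersusLazy` name are made up for illustration; only `List<T>.FindAll` and `Enumerable.Where` come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type with the Mandatory flag from the question.
public class Item
{
    public bool Mandatory { get; set; }
}

public static class EagerVersusLazy
{
    public static void Main()
    {
        var items = new List<Item>
        {
            new Item { Mandatory = true },
            new Item { Mandatory = false },
        };

        // FindAll runs the predicate immediately and returns a new List<Item>.
        List<Item> copy = items.FindAll(o => o.Mandatory);

        // Where only builds a query; the predicate runs during enumeration.
        IEnumerable<Item> query = items.Where(o => o.Mandatory);

        // An item added afterwards is invisible to the copy but visible to the query.
        items.Add(new Item { Mandatory = true });

        Console.WriteLine(copy.Count);    // 1
        Console.WriteLine(query.Count()); // 2
    }
}
```

The copy is a snapshot taken when FindAll was called, while the query re-reads the list each time it is enumerated.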

笑咖 2024-08-18 11:10:30

The following test code prints the elapsed system ticks (1 tick = 100 nanoseconds) for iterating through 10 million objects. As expected, FindAll is slowest and the for loop is fastest.

But the overhead of the iteration is measured in nanoseconds per item even in the worst case. If you're doing anything significant in the loop (e.g. something which takes a microsecond per item), then the speed difference of the iteration is completely insignificant.

So for the love of Turing don't forbid foreach in your coding guidelines now. It doesn't make any practical difference, and the LINQ statements sure are easier to read.

   using System;
   using System.Collections.Generic;
   using System.Linq;

   public class Test
   {
      public bool Bool { get; set; }
   }

   class Program
   {

      static void Main(string[] args)
      {
         // fill test list
         var list = new List<Test>();
         for (int i=0; i<1e7; i++)
         {
            list.Add(new Test() { Bool = (i % 2 == 0) });
         }

         // warm-up
         int counter = 0;
         DateTime start = DateTime.Now;
         for (int i = 0; i < list.Count; i++)
         {
            if (list[i].Bool)
            {
               counter++;
            }
         }

         // List.FindAll
         counter = 0;
         start = DateTime.Now;
         foreach (var test in list.FindAll(x => x.Bool))
         {
            counter++;
         }
         Console.WriteLine(DateTime.Now.Ticks - start.Ticks); // prints 7969158

         // IEnumerable.Where
         counter = 0;
         start = DateTime.Now;
         foreach (var test in list.Where(x => x.Bool))
         {
            counter++;
         }
         Console.WriteLine(DateTime.Now.Ticks - start.Ticks); // prints 5156514

         // for loop
         counter = 0;
         start = DateTime.Now;
         for (int i = 0; i < list.Count; i++)
         {
            if (list[i].Bool)
            {
               counter++;
            }
         }
         Console.WriteLine(DateTime.Now.Ticks - start.Ticks); // prints 2968902


      }
   }
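If you want to repeat the measurement, `System.Diagnostics.Stopwatch` is usually a better timer than `DateTime.Now`, whose resolution is fairly coarse. The following is only a sketch: the `Time` helper and the `StopwatchTiming` class are hypothetical names, and a `List<bool>` stands in for the `Test` objects to keep it short. No numbers are shown because they depend on the machine and runtime.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class StopwatchTiming
{
    // Hypothetical helper: runs the action once and returns the elapsed time.
    public static TimeSpan Time(Action action)
    {
        Stopwatch sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        // Assumption: a List<bool> stands in for the Test objects used above.
        List<bool> list = Enumerable.Range(0, 10000000).Select(i => i % 2 == 0).ToList();
        int counter = 0;

        // Warm-up pass so one-time startup costs are less likely to land in the first measurement.
        Time(() => { foreach (bool b in list.Where(x => x)) counter++; });

        TimeSpan findAllTime = Time(() => { foreach (bool b in list.FindAll(x => x)) counter++; });
        TimeSpan whereTime = Time(() => { foreach (bool b in list.Where(x => x)) counter++; });
        TimeSpan forTime = Time(() =>
        {
            for (int i = 0; i < list.Count; i++)
            {
                if (list[i]) counter++;
            }
        });

        Console.WriteLine("FindAll: {0} ms", findAllTime.TotalMilliseconds);
        Console.WriteLine("Where:   {0} ms", whereTime.TotalMilliseconds);
        Console.WriteLine("for:     {0} ms", forTime.TotalMilliseconds);
    }
}
```

A single run like this is still noisy; for anything that matters, repeat each measurement several times and compare the distributions.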
另类 2024-08-18 11:10:30

The first will be somewhat faster.

In the second case, you're using List<T>.FindAll to create a temporary list that matches your criteria. This copies the list, then iterates over it.

However, you could accomplish the same thing, at roughly the same speed as your first option, by doing:

foreach (Object obj in Collection.Where(o => o.Mandatory))
{
}

This is because Enumerable.Where uses streaming to return an IEnumerable<T>, which is generated as you iterate. No copy is made.
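To illustrate what "generated as you iterate" means, here is a hand-written iterator that behaves like a simplified `Where`. It is a sketch, not the framework's actual implementation; `Filter` and `StreamingFilter` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class StreamingFilter
{
    // Hypothetical stand-in for Enumerable.Where, written as an iterator method.
    public static IEnumerable<T> Filter<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            if (predicate(item))
            {
                yield return item; // hands one match to the caller, then pauses
            }
        }
    }

    public static void Main()
    {
        var numbers = new List<int> { 1, 2, 3, 4 };
        int calls = 0;
        IEnumerable<int> evens = Filter(numbers, n => { calls++; return n % 2 == 0; });

        Console.WriteLine(calls); // 0: nothing has been enumerated yet
        foreach (int n in evens)
        {
            Console.WriteLine(n); // 2, then 4
        }
        Console.WriteLine(calls); // 4: the predicate ran once per source item
    }
}
```

No intermediate list is ever allocated; each match is passed straight to the consuming foreach loop.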

雨的味道风的声音 2024-08-18 11:10:30

The fastest you can get without parallelizing the enumeration across multiple threads (taking the number of processors into account, etc.):

for (int i = 0; i < Collection.Count; i++)
{
    var item = Collection[i];
    if (item.Mandatory) { ... }
}

That said, I would recommend always using LINQ instead of writing for or foreach loops, because in the future it may become intelligent enough to distribute the work across processors and take hardware-specific details into account (see PLINQ), and it could eventually be faster than loops you write yourself: declarative vs. imperative programming.
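As things stand, a LINQ-to-Objects query runs sequentially unless you opt in to PLINQ with `AsParallel()`. A minimal sketch, assuming .NET 4 or later; `CountMandatory` and `PlinqSketch` are hypothetical names and a `bool[]` stands in for the collection's Mandatory flags:

```csharp
using System;
using System.Linq;

public static class PlinqSketch
{
    // Counts the true flags; AsParallel opts the query into PLINQ,
    // and the runtime decides how to partition the work across cores.
    public static int CountMandatory(bool[] flags)
    {
        return flags.AsParallel().Count(f => f);
    }

    public static void Main()
    {
        bool[] flags = Enumerable.Range(0, 1000000).Select(i => i % 2 == 0).ToArray();
        Console.WriteLine(CountMandatory(flags)); // 500000
    }
}
```

For a predicate this cheap, the partitioning overhead can easily outweigh the gain, so it is worth measuring both versions.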

假扮的天使 2024-08-18 11:10:30

FindAll is just syntactic sugar. For example:

    List<string> myStrings = new List<string>();
    foreach (string str in myStrings.FindAll(o => o.Length > 0))
    {

    }

Compiles to:

List<string> list = new List<string>();
if (CS$<>9__CachedAnonymousMethodDelegate1 == null)
{
    CS$<>9__CachedAnonymousMethodDelegate1 = new Predicate<string>(MyClass.<RunSnippet>b__0);
}
using (List<string>.Enumerator enumerator = list.FindAll(CS$<>9__CachedAnonymousMethodDelegate1).GetEnumerator())
{
    while (enumerator.MoveNext())
    {
        string current = enumerator.Current;
    }
}

public List<T> FindAll(Predicate<T> match)
{
    if (match == null)
    {
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.match);
    }
    List<T> list = new List<T>();
    for (int i = 0; i < this._size; i++)
    {
        if (match(this._items[i]))
        {
            list.Add(this._items[i]);
        }
    }
    return list;
}

private static bool <RunSnippet>b__0(string o)
{
    return (o.Length > 0);
}
眼角的笑意。 2024-08-18 11:10:30

If performance is a concern, this is probably not the bottleneck. However, have you considered using the Task Parallel Library or PLINQ? See below:

Parallel.ForEach(Collection, obj =>
{
    if (obj.Mandatory)
    {
        DoWork();
    }
});

http://msdn.microsoft.com/en-us/library/dd460688(v=vs.110).aspx

Also, although perhaps slightly unrelated, since performance seems to pique your curiosity: if you are dealing with very large sets of data, a binary search may be useful. In my case I have two separate lists of data, each with millions of records, and replacing a linear scan (O(n) per lookup) with a binary search (O(log n) per lookup) cut the time per execution dramatically. The downsides are that it only pays off for very large collections and that the list must be sorted beforehand. You will also notice that this makes use of the ConcurrentDictionary class, which adds significant overhead, but it is thread safe and was needed because of my requirements and the number of threads I am managing asynchronously.

private ConcurrentDictionary<string, string> items;
private List<string> HashedListSource { get; set; }
private List<string> HashedListTarget { get; set; }

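// BinarySearch requires the target list to be sorted first.
this.HashedListTarget.Sort();
// Note: OrderBy returns a new sequence and does not reorder items in place; the result is discarded here.
this.items.OrderBy(x => x.Value);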

private void SetDifferences()
{
    for (int i = 0; i < this.HashedListSource.Count; i++)
    {
        if (this.HashedListTarget.BinarySearch(this.HashedListSource[i]) < 0)
        {
            this.Mismatch.Add(items.ElementAt(i).Key);
        }
    }
}

Example displaying the benefits of using Binary Search
This image was originally posted in a great article found here: http://letsalgorithm.blogspot.com/2012/02/intersecting-two-sorted-integer-arrays.html
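The fragment above depends on fields that are not shown, so here is a self-contained sketch of the same idea: sort one list once, then binary-search it for every entry of the other. `Missing` and `SortedLookup` are hypothetical names, and ordinal string comparison is an assumption.

```csharp
using System;
using System.Collections.Generic;

public static class SortedLookup
{
    // Returns the entries of source that do not occur in target.
    public static List<string> Missing(List<string> source, List<string> target)
    {
        // Sort a copy once; BinarySearch is only valid on sorted input
        // and must use the same comparer that was used for sorting.
        var sorted = new List<string>(target);
        sorted.Sort(StringComparer.Ordinal);

        var missing = new List<string>();
        foreach (string s in source)
        {
            // A negative result means the item was not found.
            if (sorted.BinarySearch(s, StringComparer.Ordinal) < 0)
            {
                missing.Add(s);
            }
        }
        return missing;
    }

    public static void Main()
    {
        var source = new List<string> { "a", "b", "c" };
        var target = new List<string> { "c", "a" };
        Console.WriteLine(string.Join(",", Missing(source, target))); // b
    }
}
```

Sorting costs O(n log n) up front, after which each lookup is O(log n).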

Hope this helps!
