C# Iteration compact-framework garbage-collection

Allow iteration without generating any garbage

Posted on 2024-10-27 23:41:02

I have the following code in an object pool that implements the IEnumerable interface.

public IEnumerable<T> ActiveNodes
{
    get
    {
        for (int i = 0; i < _pool.Count; i++)
        {
            if (_pool[i].AvailableInPool)
            {
                yield return _pool[i];
            }
        }
    }
}

As far as I know (according to this question), this will generate garbage as the IEnumerable object will need to be collected.
None of the elements in _pool will ever be collected, as the purpose of the pool is to keep references to all of them to prevent garbage creation.

Can anyone suggest a way to allow iteration over _pool so that no garbage is generated?

When iterating over the pool, every item that has AvailableInPool == true should be visited. Order doesn't matter.

っ〆星空下的拥抱 2024-11-03 23:41:02

First off, a number of people are pushing back on Olhovsky to suggest that this is worrying about nothing. Avoiding collection pressure is actually very important in some applications on some environments.

The compact framework garbage collector has an unsophisticated policy; it triggers a collection every time 1000KB of memory has been allocated. Now suppose you are writing a game that runs on the compact framework, and the physics engine generates 1KB of garbage every time it runs. Physics engines are typically run on the order of 20 times a second. So that's 1200KB of pressure per minute, and hey, that's already more than one collection per minute just from the physics engine. If the collection causes a noticeable stutter in the game then that might be unacceptable. In such a scenario, anything you can do to decrease collection pressure helps.
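Spelled out, the arithmetic above uses only the figures quoted in this answer (a 1000KB collection threshold, 1KB of garbage per physics run, 20 runs per second):

```latex
\[
1\,\text{KB/run} \times 20\,\text{runs/s} \times 60\,\text{s/min} = 1200\,\text{KB/min},
\qquad
\frac{1200\,\text{KB/min}}{1000\,\text{KB/collection}} = 1.2\,\text{collections/min}
\]
```

Under those assumptions the physics engine alone would trigger roughly one collection every 50 seconds.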

I am learning this myself the hard way, even though I work on the desktop CLR. We have scenarios in the compiler where we must avoid collection pressure, and we are jumping through all kinds of object pooling hoops to do so. Olhovsky, I feel your pain.

So, to come to your question, how can you iterate over the collection of pooled objects without creating collection pressure?

First, let's think about why collection pressure happens in the typical scenario. Suppose you have

 foreach(var node in ActiveNodes) { ... }

Logically this allocates two objects. First, it allocates the enumerable -- the sequence -- that represents the sequence of nodes. Second, it allocates the enumerator -- the cursor -- that represents the current position in the sequence.

In practice sometimes you can cheat a bit and have one object that represents both the sequence and the enumerator, but you still have one object allocated.
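To make the two logical objects concrete, here is a rough sketch of what a foreach over an iterator method expands to. The names ForeachExpansion, Evens and SumExpanded are made up for illustration, and the exact lowering is up to the compiler; the comment about the sequence and cursor being the same object describes what Microsoft's C# compiler typically does for iterator blocks, not something the language guarantees.

```csharp
using System;
using System.Collections.Generic;

public static class ForeachExpansion
{
    // Iterator method: each call allocates a compiler-generated object on the heap.
    public static IEnumerable<int> Evens(List<int> source)
    {
        for (int i = 0; i < source.Count; i++)
        {
            if (source[i] % 2 == 0)
            {
                yield return source[i];
            }
        }
    }

    // Roughly what "foreach (int n in Evens(source)) sum += n;" turns into.
    public static int SumExpanded(List<int> source)
    {
        int sum = 0;

        // The sequence: one heap allocation per call.
        IEnumerable<int> sequence = Evens(source);

        // The cursor: for iterator blocks this is typically the same object
        // as the sequence (the "cheat"), so one allocation rather than two.
        IEnumerator<int> cursor = sequence.GetEnumerator();
        try
        {
            while (cursor.MoveNext())
            {
                sum += cursor.Current;
            }
        }
        finally
        {
            // foreach always disposes the enumerator, even on break or exception.
            cursor.Dispose();
        }

        return sum;
    }
}
```

Either way, at least one short-lived object is created per loop, and that object is the garbage the question is trying to avoid.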

How can we avoid this collection pressure? Three things come to mind.

1) Don't make an ActiveNodes method in the first place. Make the caller iterate over the pool by index, and check themselves whether the node is available. The sequence is then the pool, which is already allocated, and the cursor is an integer, neither of which creates new collection pressure. The price you pay is duplicated code.

2) As Steven suggests, the compiler will take any types that have the right public methods and properties; they don't have to be IEnumerable and IEnumerator. You can make your own mutable-struct sequence and cursor objects, pass those around by value, and avoid collection pressure. It is dangerous to have mutable structs, but it is possible. Note that List<T> uses this strategy for its enumerator; study its implementation for ideas.

3) Allocate the sequence and the enumerators on the heap normally and pool them too! You're already going with a pooling strategy, so there's no reason why you can't pool an enumerator as well. Enumerators even have a convenient "Reset" method that usually just throws an exception, but you could write a custom enumerator object that used it to reset the enumerator back to the beginning of the sequence when it goes back in the pool.

Most objects are only enumerated once at a time, so the pool can be small in typical cases.

(Now, of course you may have a chicken-and-egg problem here; how are you going to enumerate the pool of enumerators?)
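A minimal sketch of option 3, under some assumptions: PooledNode and NodePool are hypothetical stand-ins for the question's types, and free enumerators are kept on a Stack<T>, which is pushed and popped rather than enumerated and so sidesteps the chicken-and-egg problem. The enumerator returns itself to the pool in Dispose, which foreach calls when the loop ends.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical node type standing in for the question's pooled items.
public sealed class PooledNode
{
    public int Id;
    public bool AvailableInPool;
}

public sealed class NodePool : IEnumerable<PooledNode>
{
    private readonly List<PooledNode> _pool = new List<PooledNode>();

    // Free enumerators; pre-sized so ordinary use does not grow the stack.
    private readonly Stack<PooledEnumerator> _free = new Stack<PooledEnumerator>(4);

    // Exposed only so the sketch can show how many enumerators were allocated.
    public int EnumeratorsCreated { get; private set; }

    public void Add(PooledNode node) { _pool.Add(node); }

    public IEnumerator<PooledNode> GetEnumerator()
    {
        PooledEnumerator e;
        if (_free.Count > 0)
        {
            e = _free.Pop();                 // reuse: no allocation
        }
        else
        {
            e = new PooledEnumerator(this);  // allocate only when none is free
            EnumeratorsCreated++;
        }
        e.InUse = true;
        return e;
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    private sealed class PooledEnumerator : IEnumerator<PooledNode>
    {
        private readonly NodePool _owner;
        private int _index = -1;
        public bool InUse;

        public PooledEnumerator(NodePool owner) { _owner = owner; }

        public PooledNode Current { get { return _owner._pool[_index]; } }
        object IEnumerator.Current { get { return Current; } }

        public bool MoveNext()
        {
            // Skip forward to the next node with AvailableInPool == true.
            List<PooledNode> items = _owner._pool;
            while (++_index < items.Count)
            {
                if (items[_index].AvailableInPool) return true;
            }
            return false;
        }

        public void Reset() { _index = -1; }

        public void Dispose()
        {
            if (!InUse) return;   // guard against a double Dispose
            InUse = false;
            Reset();
            _owner._free.Push(this);
        }
    }
}
```

With this shape, sequential foreach loops over the pool reuse a single enumerator, and nested loops allocate one enumerator per nesting level the first time through. A caller that keeps a reference to an enumerator after disposing it would see it reused by someone else, which is the usual hazard of pooling.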

最舍不得你 2024-11-03 23:41:02

Iterating items will, in any 'normal' design, usually result in the creation of a new enumerable object. Creating and disposing of objects is very fast, so only in very special scenarios (where low latency is the top priority) could garbage collections (I say 'could') be a problem.

A design without garbage is possible by returning structures that don't implement IEnumerable. The C# compiler can still iterate such objects, because the foreach statement uses duck typing. .NET's List<T>, for instance, takes this approach.

When using foreach over both an array and List<T>, no garbage will be generated. When using foreach on an array, C# will transform the operation to a for statement, while List<T> already implements a struct enumerator, causing the foreach to produce no garbage.
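A small sketch of the difference the static type makes; ListEnumeratorDemo and its method names are made up for illustration. List<T>.Enumerator is a public struct, so a foreach over a variable typed as List<T> keeps the enumerator off the heap, while a foreach over the same list typed as IEnumerable<T> goes through the interface and boxes that struct.

```csharp
using System;
using System.Collections.Generic;

public static class ListEnumeratorDemo
{
    // List<T>.Enumerator is a value type, which is why foreach over a
    // List<T> does not need a heap-allocated enumerator.
    public static bool EnumeratorIsStruct()
    {
        return typeof(List<int>.Enumerator).IsValueType;
    }

    // Binds to List<int>.GetEnumerator(): the struct enumerator lives in a local.
    public static int SumDirect(List<int> list)
    {
        int sum = 0;
        foreach (int n in list)
        {
            sum += n;
        }
        return sum;
    }

    // Binds to IEnumerable<int>.GetEnumerator(): for a List<int> argument the
    // struct enumerator is boxed to satisfy the interface, creating garbage.
    public static int SumViaInterface(IEnumerable<int> sequence)
    {
        int sum = 0;
        foreach (int n in sequence)
        {
            sum += n;
        }
        return sum;
    }
}
```

Both methods return the same result; only the allocation behaviour differs, so exposing the concrete type (or a custom struct) is what keeps the loop garbage-free.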

Here is a struct enumerable and struct enumerator. When you return the enumerable, the C# compiler can foreach over it:

public struct StructEnumerable<T>
{
    private readonly List<T> pool;

    public StructEnumerable(List<T> pool)
    {
        this.pool = pool;
    }

    public StructEnumerator<T> GetEnumerator()
    {
        return new StructEnumerator<T>(this.pool);
    }
}

Here is the StructEnumerator:

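public struct StructEnumerator<T>
{
    private readonly List<T> pool;

    // One-based position: 0 means "before the first element".
    private int index;

    public StructEnumerator(List<T> pool)
    {
        this.pool = pool;
        this.index = 0;
    }

    public T Current
    {
        get
        {
            // Current is only valid after MoveNext() has been called.
            if (this.pool == null || this.index == 0)
                throw new InvalidOperationException();

            return this.pool[this.index - 1];
        }
    }

    public bool MoveNext()
    {
        // Advance, then report whether the new position is still inside the list.
        // pool is null for a default(StructEnumerator<T>) value.
        this.index++;
        return this.pool != null && this.pool.Count >= this.index;
    }

    public void Reset()
    {
        this.index = 0;
    }
}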

You can simply return the StructEnumerable<T> as follows:

public StructEnumerable<T> Items
{
    get { return new StructEnumerable<T>(this.pool); }
}

And C# can iterate over this with a normal foreach:

foreach (var item in pool.Items)
{
    Console.WriteLine(item);
}
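The enumerator above visits every element of the list, whereas the question only wants items with AvailableInPool == true. A possible variation, sketched under assumptions: IPoolable, SampleNode, ActiveNodeEnumerable and ActiveNodeEnumerator are hypothetical names, and the pooled type is assumed to expose AvailableInPool through an interface so the generic code can read it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface so the generic enumerator can read the flag.
public interface IPoolable
{
    bool AvailableInPool { get; }
}

// Hypothetical pooled type used only to exercise the sketch.
public sealed class SampleNode : IPoolable
{
    public SampleNode(int id, bool available)
    {
        Id = id;
        AvailableInPool = available;
    }

    public int Id { get; private set; }
    public bool AvailableInPool { get; set; }
}

public struct ActiveNodeEnumerable<T> where T : IPoolable
{
    private readonly List<T> pool;

    public ActiveNodeEnumerable(List<T> pool)
    {
        this.pool = pool;
    }

    // foreach finds this method by pattern; no interface is required.
    public ActiveNodeEnumerator<T> GetEnumerator()
    {
        return new ActiveNodeEnumerator<T>(this.pool);
    }
}

public struct ActiveNodeEnumerator<T> where T : IPoolable
{
    private readonly List<T> pool;
    private int index;   // zero-based; -1 means "before the first element"

    public ActiveNodeEnumerator(List<T> pool)
    {
        this.pool = pool;
        this.index = -1;
    }

    public T Current
    {
        get { return this.pool[this.index]; }
    }

    public bool MoveNext()
    {
        if (this.pool == null) return false;

        // Skip forward until an available item is found or the list ends.
        while (++this.index < this.pool.Count)
        {
            if (this.pool[this.index].AvailableInPool) return true;
        }
        return false;
    }
}
```

A pool would return `new ActiveNodeEnumerable<T>(_pool)` from its ActiveNodes property, and callers would keep writing an ordinary foreach. Both structs live on the stack, so the loop itself should not allocate.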

Note that you can't use LINQ over these items through System.Linq.Enumerable. You need the IEnumerable<T> interface for that, and that involves creating enumerators and, therefore, garbage collection. You could, of course, build your own LINQ extension methods, but that is unlikely to help, because it will often still result in new objects being created (when closures are generated for the delegates used).

UPDATE (2024): Newer .NET (Core) versions contain many optimizations to the LINQ methods that in some cases make them extraordinarily fast and in some cases produce no garbage, especially when LINQ methods are used to filter arrays and List<T>. This means that you should benchmark the performance of your code before reverting from LINQ operations to manually written code, because the LINQ operations might very well be orders of magnitude faster than anything you can reasonably write yourself.
