Garbage collection in yield methods

Posted on 2024-07-12 19:46:08

Say I have a method like this (stolen from a previous SO answer by Jon Skeet):

public static IEnumerable<TSource> DuplicatesBy<TSource, TKey>
    (this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
    HashSet<TKey> seenKeys = new HashSet<TKey>();
    foreach (TSource element in source)
    {
        // Yield it if the key hasn't actually been added - i.e. it
        // was already in the set
        if (!seenKeys.Add(keySelector(element)))
        {
            yield return element;
        }
    }
}

In this method, a HashSet holds the keys that have been seen so far. Suppose I use the method like this:

List<string> strings = new List<string> { "1", "1", "2", "3" };
List<string> somewhatUniques = strings.DuplicatesBy(s => s).Take(2).ToList();

Because of Take(2), enumeration can stop before the whole strings list has been consumed. But how does garbage collection collect the seenKeys HashSet? Since yield just pauses the execution of the method, if the method holds expensive state, how can I make sure things are disposed of properly?

Comments (2)

等你爱我 2024-07-19 19:46:09

The compiler generates a hidden class to implement this code. It has a super-secret name, something like "<DuplicatesBy>d__0`2". Your seenKeys and source variables become fields of that class, so they stay alive for as long as the class object itself is reachable.

The class implements the IEnumerator<> interface, and the client code that uses the iterator calls the MoveNext() method through that interface. It is that interface reference that keeps the class object alive, which in turn keeps its fields alive. As soon as the client code completes the foreach loop, the interface reference goes away and the object becomes unreachable, allowing the GC to clean everything up.

Use Ildasm.exe or Reflector to see this for yourself. It will also give you some insight into the hidden cost of syntactic sugar. Iterators aren't cheap.
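
To make the "locals become fields" point concrete, here is a rough hand-written stand-in for what the compiler produces. This is only a sketch: the class name DuplicatesByIterator and its layout are made up for illustration, and the real generated class uses an explicit state field, a different name, and more bookkeeping.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for the compiler-generated iterator class.
// The name and structure are assumptions; the real class differs in detail.
public sealed class DuplicatesByIterator<TSource, TKey>
    : IEnumerable<TSource>, IEnumerator<TSource>
{
    // The iterator method's parameters are captured as fields.
    private readonly IEnumerable<TSource> source;
    private readonly Func<TSource, TKey> keySelector;

    // The iterator method's locals also live here as fields, so they
    // stay reachable exactly as long as this object is reachable.
    private HashSet<TKey> seenKeys;
    private IEnumerator<TSource> sourceEnumerator;
    private TSource current;

    public DuplicatesByIterator(IEnumerable<TSource> source,
                                Func<TSource, TKey> keySelector)
    {
        this.source = source;
        this.keySelector = keySelector;
    }

    public TSource Current => current;
    object IEnumerator.Current => current;

    public bool MoveNext()
    {
        // First call: run the code that precedes the loop in the original method.
        if (sourceEnumerator == null)
        {
            seenKeys = new HashSet<TKey>();
            sourceEnumerator = source.GetEnumerator();
        }

        // Resume the loop until the next "yield return" point.
        while (sourceEnumerator.MoveNext())
        {
            TSource element = sourceEnumerator.Current;
            if (!seenKeys.Add(keySelector(element)))
            {
                current = element;
                return true;
            }
        }
        return false;
    }

    // Disposing the iterator disposes the underlying source enumerator.
    public void Dispose() => sourceEnumerator?.Dispose();

    public void Reset() => throw new NotSupportedException();

    // Simplification: always hand out a fresh object per enumeration.
    public IEnumerator<TSource> GetEnumerator() =>
        new DuplicatesByIterator<TSource, TKey>(source, keySelector);

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

Nothing outside this object refers to seenKeys, so once the caller drops its reference to the enumerator, the HashSet becomes unreachable together with it.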

长不大的小祸害 2024-07-19 19:46:09

Well, garbage collection doesn't collect it right away. It can't, obviously.

Internally, when you do something like a foreach over your method, it calls GetEnumerator() and then calls MoveNext() on the result repeatedly to get each item. Enumerators are disposable, and once the enumerator has been disposed (foreach disposes it for you at the end of the loop) and is no longer referenced, garbage collection is free to clean up any objects held by your iterator.

So, if you have a lot of expensive state in your iterator and you're iterating over it for a long time, then you probably want to either not use yield return, or evaluate the whole enumeration right away by calling something like ToArray() and then working with the resulting array.
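
As a small sketch of the ToArray() option, assuming the DuplicatesBy method from the question (repeated here so the snippet compiles on its own; the EagerDemo and FindDuplicates names are made up for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EagerDemo
{
    // The iterator from the question, repeated so the sketch is self-contained.
    public static IEnumerable<TSource> DuplicatesBy<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        HashSet<TKey> seenKeys = new HashSet<TKey>();
        foreach (TSource element in source)
        {
            if (!seenKeys.Add(keySelector(element)))
            {
                yield return element;
            }
        }
    }

    public static string[] FindDuplicates()
    {
        List<string> strings = new List<string> { "1", "1", "2", "3" };

        // ToArray() runs the iterator to completion inside this call.
        // Afterwards only the array is referenced; the iterator object and
        // its seenKeys set are unreachable and eligible for collection.
        return strings.DuplicatesBy(s => s).ToArray();
    }
}
```

The trade-off is that the whole source gets enumerated up front, so this only helps when holding the results is cheaper than keeping the iterator's state alive.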

EDIT: So, in response to your final question -- how you can make sure it gets disposed -- there's nothing special you need to do if you're using LINQ or foreach constructs on it, because they take care of it themselves via their usual magic. If you're manually getting the enumerator, make sure you call Dispose() on it when you're finished or put it in a using block.
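
Here is a minimal sketch of those three cases. The Numbers iterator and the Log list are made up for illustration; the finally block stands in for whatever cleanup your own iterator needs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DisposalDemo
{
    // Records what happens, so the order of events can be inspected afterwards.
    public static readonly List<string> Log = new List<string>();

    // Hypothetical iterator: the finally block runs when the enumerator is
    // disposed, even if the caller stops enumerating early.
    public static IEnumerable<int> Numbers()
    {
        Log.Add("start");
        try
        {
            for (int i = 0; i < 100; i++)
            {
                yield return i;
            }
        }
        finally
        {
            Log.Add("cleanup");
        }
    }

    public static void Run()
    {
        Log.Clear();

        // Case 1: foreach disposes the enumerator when the loop exits, including via break.
        foreach (int n in Numbers())
        {
            if (n == 1) break;
        }
        Log.Add("after foreach");

        // Case 2: LINQ operators dispose the source enumerator themselves.
        List<int> firstTwo = Numbers().Take(2).ToList();
        Log.Add("after Take");

        // Case 3: when driving the enumerator by hand, a using block disposes it.
        using (IEnumerator<int> e = Numbers().GetEnumerator())
        {
            e.MoveNext();
        }
        Log.Add("after using");
    }
}
```

After DisposalDemo.Run(), Log contains "start", "cleanup", "after foreach", "start", "cleanup", "after Take", "start", "cleanup", "after using", in that order. In each case the cleanup runs before the caller moves on, without the sequence being enumerated to the end.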
