Getting the head and tail of an IEnumerable that can only be iterated once

Posted 2024-10-21 04:16:14

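I have a sequence of elements. The sequence can only be iterated once and may be "infinite".

What is the best way to get the head and the tail of such a sequence?

Update: A few clarifications that would have been nice to include in the original question :)

Head is the first element of the sequence and tail is "the rest". That means the tail is also "infinite".
When I say infinite, I mean "very large" and "I don't want to store it all in memory at once". It could also actually be infinite, like sensor data (though in my case it isn't).
When I say it can only be iterated once, I mean that generating the sequence is resource-heavy, so I don't want to do it again. It could also be volatile data, again like sensor data, that won't be the same on the next read (though in my case it isn't).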

人生百味 2024-10-28 04:16:14

Decomposing IEnumerable<T> into head & tail isn't particularly good for recursive processing (unlike functional lists) because when you use the tail operation recursively, you'll create a number of indirections. However, you can write something like this:

I'm ignoring things like argument checking and exception handling, but it shows the idea...

Tuple<T, IEnumerable<T>> HeadAndTail<T>(IEnumerable<T> source) {
  // Get first element of the 'source' (assuming it is there)
  var en = source.GetEnumerator();
  en.MoveNext();
  // Return first element and Enumerable that iterates over the rest
  return Tuple.Create(en.Current, EnumerateTail(en));
}

// Turn remaining (unconsumed) elements of enumerator into enumerable
IEnumerable<T> EnumerateTail<T>(IEnumerator<T> en) {
  // The parameter must be the generic IEnumerator<T> so that en.Current is a T
  while(en.MoveNext()) yield return en.Current;
}
}

The HeadAndTail method gets the first element and returns it as the first element of a tuple. The second element of the tuple is an IEnumerable<T> that's generated from the remaining elements (by iterating over the rest of the enumerator that we already created).
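
The argument checking and cleanup that the answer leaves out could look roughly like the sketch below. This is only one possible shape, not the answer author's code: the class name SequenceSplit, the use of a value tuple, and the choice to throw on an empty sequence are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Usage: split a lazily generated sequence once, then stream the rest.
var (head, tail) = SequenceSplit.HeadAndTail(Enumerable.Range(1, 5));
Console.WriteLine(head);                    // 1
Console.WriteLine(string.Join(",", tail));  // 2,3,4,5

// Hypothetical helper class; the name is illustrative only.
static class SequenceSplit
{
    public static (T Head, IEnumerable<T> Tail) HeadAndTail<T>(IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        IEnumerator<T> en = source.GetEnumerator();
        try
        {
            // Assumption: an empty source is treated as an error.
            if (!en.MoveNext())
                throw new InvalidOperationException("Sequence contains no elements.");
        }
        catch
        {
            // Nobody will ever see the tail, so release the enumerator here.
            en.Dispose();
            throw;
        }
        return (en.Current, EnumerateTail(en));
    }

    // Streams the unconsumed elements and disposes the enumerator
    // once the tail has been enumerated (fully or partially).
    private static IEnumerable<T> EnumerateTail<T>(IEnumerator<T> en)
    {
        using (en)
        {
            while (en.MoveNext()) yield return en.Current;
        }
    }
}
```

Two limits of this sketch are worth keeping in mind: the tail is still single-use, and if the caller never enumerates it, the underlying enumerator is never disposed.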

伪心 2024-10-28 04:16:14

Obviously, each call to HeadAndTail should enumerate the sequence again (unless there is some sort of caching used). For example, consider the following:

var a = HeadAndTail(sequence);
Console.WriteLine(HeadAndTail(a.Tail).Tail);
//Element #2; enumerator is at least at #2 now.

var b = HeadAndTail(sequence);
Console.WriteLine(b.Tail);
//Element #1; there is no way to get #1 unless we enumerate the sequence again.

For the same reason, HeadAndTail cannot be implemented as separate Head and Tail methods (unless you are willing to have even the first call to Tail enumerate the sequence again, although a call to Head has already enumerated it).

Additionally, HeadAndTail should not return an instance of IEnumerable (as it could be enumerated multiple times).
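
A small sketch can make that concern concrete. It assumes a yield-based tail that wraps one shared enumerator in an IEnumerable<T>; the class name Naive is hypothetical and the code is only meant to show the effect, not to recommend the design.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var split = Naive.HeadAndTail(Enumerable.Range(1, 5));
List<int> firstPass = split.Item2.ToList();   // drains the shared enumerator
List<int> secondPass = split.Item2.ToList();  // same enumerator, nothing left
Console.WriteLine(firstPass.Count);   // 4
Console.WriteLine(secondPass.Count);  // 0

// Hypothetical class holding a yield-based head/tail split.
static class Naive
{
    public static Tuple<T, IEnumerable<T>> HeadAndTail<T>(IEnumerable<T> source)
    {
        var en = source.GetEnumerator();
        en.MoveNext(); // assumes the source is not empty
        return Tuple.Create(en.Current, EnumerateTail(en));
    }

    // Every enumeration of the returned IEnumerable reads from the same 'en'.
    static IEnumerable<T> EnumerateTail<T>(IEnumerator<T> en)
    {
        while (en.MoveNext()) yield return en.Current;
    }
}
```

The tail looks like an ordinary IEnumerable<T>, yet a second enumeration silently yields nothing, which is the kind of surprise the paragraph above warns about.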

This leaves us with the only option: HeadAndTail should return IEnumerator, and, to make things more obvious, it should accept IEnumerator as well (we're just moving an invocation of GetEnumerator from inside the HeadAndTail to the outside, to emphasize it is of one-time use only).

Now that we have worked out the requirements, the implementation is pretty straightforward:

class HeadAndTail<T> {
    public readonly T Head;
    public readonly IEnumerator<T> Tail;

    public HeadAndTail(T head, IEnumerator<T> tail) {
        Head = head;
        Tail = tail;
    }
}

static class IEnumeratorExtensions {
    public static HeadAndTail<T> HeadAndTail<T>(this IEnumerator<T> enumerator) {
        if (!enumerator.MoveNext()) return null;
        return new HeadAndTail<T>(enumerator.Current, enumerator);
    }
}

And now it can be used like this:

Console.WriteLine(sequence.GetEnumerator().HeadAndTail().Tail.HeadAndTail().Head);
//Element #2

Or in recursive functions like this:

TResult FoldR<TSource, TResult>(
    IEnumerator<TSource> sequence,
    TResult seed,
    Func<TSource, TResult, TResult> f
) {
    var headAndTail = sequence.HeadAndTail();
    if (headAndTail == null) return seed;
    return f(headAndTail.Head, FoldR(headAndTail.Tail, seed, f));
}

int Sum(IEnumerator<int> sequence) {
    return FoldR(sequence, 0, (x, y) => x+y);
}

var array = Enumerable.Range(1, 5);
Console.WriteLine(Sum(array.GetEnumerator())); //1+(2+(3+(4+(5+0))))
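
Because FoldR calls itself once per element, its recursion depth grows with the length of the sequence, which may matter for the "very large" sequences in the question. Where the operation can be written as a left fold, a loop over the same IEnumerator<T> avoids that. The sketch below is an assumption-laden alternative, not part of the answer: the names Folds and FoldL are hypothetical, and a left fold only matches FoldR when the combining function gives the same result in either order (as addition does).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

Console.WriteLine(Folds.SumLong(Enumerable.Range(1, 100000))); // 5000050000

// Hypothetical helper class; names are illustrative only.
static class Folds
{
    // Loop-based left fold: consumes the enumerator once, constant stack depth.
    public static TResult FoldL<TSource, TResult>(
        IEnumerator<TSource> sequence,
        TResult seed,
        Func<TResult, TSource, TResult> f)
    {
        TResult acc = seed;
        while (sequence.MoveNext()) acc = f(acc, sequence.Current);
        return acc;
    }

    // Example use: the caller owns the enumerator and disposes it.
    public static long SumLong(IEnumerable<int> source)
    {
        using (IEnumerator<int> en = source.GetEnumerator())
        {
            return FoldL(en, 0L, (acc, x) => acc + x);
        }
    }
}
```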

ぺ禁宫浮华殁 2024-10-28 04:16:14

While other approaches here suggest using yield return for the tail enumerable, such an approach adds unnecessary nesting overhead. A better approach would be to convert the IEnumerator<T> back into something that can be used with foreach:

public struct WrappedEnumerator<T>
{
    T myEnumerator;
    public T GetEnumerator() { return myEnumerator; }
    public WrappedEnumerator(T theEnumerator) { myEnumerator = theEnumerator; }
}
public static class AsForEachHelper
{
    static public WrappedEnumerator<IEnumerator<T>> AsForEach<T>(this IEnumerator<T> theEnumerator) {return new WrappedEnumerator<IEnumerator<T>>(theEnumerator);}

    static public WrappedEnumerator<System.Collections.IEnumerator> AsForEach(this System.Collections.IEnumerator theEnumerator) 
        { return new WrappedEnumerator<System.Collections.IEnumerator>(theEnumerator); }
}

If one used separate WrappedEnumerator structs for the generic IEnumerable<T> and non-generic IEnumerable, one could have them implement IEnumerable<T> and IEnumerable respectively; they wouldn't really obey the IEnumerable<T> contract, though, which specifies that it should be possible to call GetEnumerator() multiple times, with each call returning an independent enumerator.

Another important caveat is that if one uses AsForEach on an IEnumerator<T>, the resulting WrappedEnumerator should be enumerated exactly once. If it is never enumerated, the underlying IEnumerator<T> will never have its Dispose method called.

Applying the above-supplied methods to the problem at hand, it would be easy to call GetEnumerator() on an IEnumerable<T>, read out the first few items, and then use AsForEach() to convert the remainder so it can be used with a foreach loop (or perhaps, as noted above, to convert it into an implementation of IEnumerable<T>). It's important to note, however, that calling GetEnumerator() creates an obligation to Dispose the resulting IEnumerator<T>, and the class that performs the head/tail split would have no way to do that if nothing ever calls GetEnumerator() on the tail.
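
One way to meet that disposal obligation is to keep the head read and the foreach over the tail inside a single method that owns the enumerator. The sketch below repeats the generic WrappedEnumerator and AsForEach from above so that it is self-contained; the Demo class, the ForHeadAndTail method, and the callback-based shape are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int tailSum = 0;
Demo.ForHeadAndTail(
    Enumerable.Range(1, 5),
    head => Console.WriteLine(head),  // 1
    item => tailSum += item);
Console.WriteLine(tailSum);           // 14 (2+3+4+5)

public struct WrappedEnumerator<T>
{
    T myEnumerator;
    public T GetEnumerator() { return myEnumerator; }
    public WrappedEnumerator(T theEnumerator) { myEnumerator = theEnumerator; }
}

public static class AsForEachHelper
{
    public static WrappedEnumerator<IEnumerator<T>> AsForEach<T>(this IEnumerator<T> theEnumerator)
    { return new WrappedEnumerator<IEnumerator<T>>(theEnumerator); }
}

// Hypothetical helper class; names are illustrative only.
static class Demo
{
    public static void ForHeadAndTail<T>(
        IEnumerable<T> source, Action<T> onHead, Action<T> onTailItem)
    {
        // 'using' guarantees Dispose even if the tail loop is never reached.
        using (IEnumerator<T> en = source.GetEnumerator())
        {
            if (!en.MoveNext()) return;  // assumption: an empty source is a no-op
            onHead(en.Current);

            // foreach continues from where the manual MoveNext left off.
            foreach (T item in en.AsForEach()) onTailItem(item);
        }
    }
}
```

In this shape the enumerator can end up disposed twice (once by foreach, once by using), which is harmless for well-behaved enumerators; the sequence is still consumed only once and nothing is buffered.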