当前位置：文江博客话题详情

C# Iterator enumerator

如何同时迭代两个数组？

发布于 2024-07-13 03:55:14 字数 427 浏览 8 评论 0 原文

我在解析文本文件时构建了两个数组。第一个包含列名称，第二个包含当前行的值。我需要一次迭代两个列表来构建地图。现在我有以下内容：

var currentValues = currentRow.Split(separatorChar);
var valueEnumerator = currentValues.GetEnumerator();

foreach (String column in columnList)
{
    valueEnumerator.MoveNext();
    valueMap.Add(column, (String)valueEnumerator.Current);
}

这工作得很好，但它并不能完全满足我的优雅感，并且如果数组的数量大于两个（我偶尔必须这样做），它会变得非常毛茸茸。还有人有其他更简洁的成语吗？

原文

I have two arrays built while parsing a text file. The first contains the column names, the second contains the values from the current row. I need to iterate over both lists at once to build a map. Right now I have the following:

var currentValues = currentRow.Split(separatorChar);
var valueEnumerator = currentValues.GetEnumerator();

foreach (String column in columnList)
{
    valueEnumerator.MoveNext();
    valueMap.Add(column, (String)valueEnumerator.Current);
}

This works just fine, but it doesn't quite satisfy my sense of elegance, and it gets really hairy if the number of arrays is larger than two (as I have to do occasionally). Does anyone have another, terser idiom?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

姐不稀罕 2024-07-20 03:55:15

您可以创建一个二维数组或一个字典（这会更好），而不是创建两个单独的数组。但实际上，如果它有效，我不会尝试改变它。

回复收藏 0 原文

琉璃繁缕 2024-07-20 03:55:14

您的初始代码中有一个不明显的伪错误 - IEnumerator 扩展了 IDisposable，因此您应该处置它。这对于迭代器块来说非常重要！对于数组来说这不是问题，但对于其他 IEnumerable 实现来说却是个问题。

我会这样做：

public static IEnumerable<TResult> PairUp<TFirst,TSecond,TResult>
    (this IEnumerable<TFirst> source, IEnumerable<TSecond> secondSequence,
     Func<TFirst,TSecond,TResult> projection)
{
    using (IEnumerator<TSecond> secondIter = secondSequence.GetEnumerator())
    {
        foreach (TFirst first in source)
        {
            if (!secondIter.MoveNext())
            {
                throw new ArgumentException
                    ("First sequence longer than second");
            }
            yield return projection(first, secondIter.Current);
        }
        if (secondIter.MoveNext())
        {
            throw new ArgumentException
                ("Second sequence longer than first");
        }
    }        
}

然后您可以在需要时重用它：

foreach (var pair in columnList.PairUp(currentRow.Split(separatorChar),
             (column, value) => new { column, value })
{
    // Do something
}

或者您可以创建一个通用的 Pair 类型，并摆脱 PairUp 方法中的投影参数。

编辑：

对于 Pair 类型，调用代码将如下所示：

foreach (var pair in columnList.PairUp(currentRow.Split(separatorChar))
{
    // column = pair.First, value = pair.Second
}

这看起来非常简单。是的，您需要将实用方法作为可重用代码放在某处。在我看来这几乎不是问题。现在对于多个数组...

如果数组具有不同类型，我们就会遇到问题。您无法在泛型方法/类型声明中表达任意数量的类型参数 - 您可以根据需要为任意数量的类型参数编写 PairUp 版本，就像 Action 和 一样Func 委托最多 4 个委托参数 - 但你不能让它任意。

但是，如果这些值都属于同一类型 - 并且如果您愿意坚持使用数组 - 这很容易。（非数组也可以，但你不能提前进行长度检查。）你可以这样做：

public static IEnumerable<T[]> Zip<T>(params T[][] sources)
{
    // (Insert error checking code here for null or empty sources parameter)

    int length = sources[0].Length;
    if (!sources.All(array => array.Length == length))
    {
        throw new ArgumentException("Arrays must all be of the same length");
    }

    for (int i=0; i < length; i++)
    {
        // Could do this bit with LINQ if you wanted
        T[] result = new T[sources.Length];
        for (int j=0; j < result.Length; j++)
        {
             result[j] = sources[j][i];
        }
        yield return result;
    }
}

那么调用代码将是：

foreach (var array in Zip(columns, row, whatevers))
{
    // column = array[0]
    // value = array[1]
    // whatever = array[2]
}

这涉及一定量的复制，当然 - 你正在创建一个每次都数组。你可以通过引入另一种类型来改变这一点：

public struct Snapshot<T>
{
    readonly T[][] sources;
    readonly int index;

    public Snapshot(T[][] sources, int index)
    {
        this.sources = sources;
        this.index = index;
    }

    public T this[int element]
    {
        return sources[element][index];
    }
}

虽然大多数人可能会认为这太过分了;)

老实说，我可以不断想出各种想法......但基本原理是：

用一点点可重用的工作，您可以使调用代码更好
对于任意类型的组合，由于泛型的工作方式，您必须单独执行每个参数数量（2、3、4...）
如果您乐意使用每个部分的类型相同，你可以做得更好

You've got a non-obvious pseudo-bug in your initial code - IEnumerator<T> extends IDisposable so you should dispose it. This can be very important with iterator blocks! Not a problem for arrays, but would be with other IEnumerable<T> implementations.

I'd do it like this:

public static IEnumerable<TResult> PairUp<TFirst,TSecond,TResult>
    (this IEnumerable<TFirst> source, IEnumerable<TSecond> secondSequence,
     Func<TFirst,TSecond,TResult> projection)
{
    using (IEnumerator<TSecond> secondIter = secondSequence.GetEnumerator())
    {
        foreach (TFirst first in source)
        {
            if (!secondIter.MoveNext())
            {
                throw new ArgumentException
                    ("First sequence longer than second");
            }
            yield return projection(first, secondIter.Current);
        }
        if (secondIter.MoveNext())
        {
            throw new ArgumentException
                ("Second sequence longer than first");
        }
    }        
}

Then you can reuse this whenever you have the need:

foreach (var pair in columnList.PairUp(currentRow.Split(separatorChar),
             (column, value) => new { column, value })
{
    // Do something
}

Alternatively you could create a generic Pair type, and get rid of the projection parameter in the PairUp method.

EDIT:

With the Pair type, the calling code would look like this:

foreach (var pair in columnList.PairUp(currentRow.Split(separatorChar))
{
    // column = pair.First, value = pair.Second
}

That looks about as simple as you can get. Yes, you need to put the utility method somewhere, as reusable code. Hardly a problem in my view. Now for multiple arrays...

If the arrays are of different types, we have a problem. You can't express an arbitrary number of type parameters in a generic method/type declaration - you could write versions of PairUp for as many type parameters as you wanted, just like there are Action and Func delegates for up to 4 delegate parameters - but you can't make it arbitrary.

If the values will all be of the same type, however - and if you're happy to stick to arrays - it's easy. (Non-arrays is okay too, but you can't do the length checking ahead of time.) You could do this:

public static IEnumerable<T[]> Zip<T>(params T[][] sources)
{
    // (Insert error checking code here for null or empty sources parameter)

    int length = sources[0].Length;
    if (!sources.All(array => array.Length == length))
    {
        throw new ArgumentException("Arrays must all be of the same length");
    }

    for (int i=0; i < length; i++)
    {
        // Could do this bit with LINQ if you wanted
        T[] result = new T[sources.Length];
        for (int j=0; j < result.Length; j++)
        {
             result[j] = sources[j][i];
        }
        yield return result;
    }
}

Then the calling code would be:

foreach (var array in Zip(columns, row, whatevers))
{
    // column = array[0]
    // value = array[1]
    // whatever = array[2]
}

This involves a certain amount of copying, of course - you're creating an array each time. You could change that by introducing another type like this:

public struct Snapshot<T>
{
    readonly T[][] sources;
    readonly int index;

    public Snapshot(T[][] sources, int index)
    {
        this.sources = sources;
        this.index = index;
    }

    public T this[int element]
    {
        return sources[element][index];
    }
}

This would probably be regarded as overkill by most though ;)

I could keep coming up with all kinds of ideas, to be honest... but the basics are:

With a little bit of reusable work, you can make the calling code nicer
For arbitrary combinations of types you'll have to do each number of parameters (2, 3, 4...) separately due to the way generics works
If you're happy to use the same type for each part, you can do better

回复收藏 0 原文

风流物 2024-07-20 03:55:14

如果列名的数量与每行中元素的数量相同，是否可以不使用 for 循环？

var currentValues = currentRow.Split(separatorChar);

for(var i=0;i<columnList.Length;i++){
   // use i to index both (or all) arrays and build your map
}

if there are the same number of column names as there are elements in each row, could you not use a for loop?

var currentValues = currentRow.Split(separatorChar);

for(var i=0;i<columnList.Length;i++){
   // use i to index both (or all) arrays and build your map
}

回复收藏 0 原文

执手闯天涯 2024-07-20 03:55:14

在函数式语言中，您通常会找到一个“zip”函数，它有望成为 C#4.0 的一部分。 Bart de Smet 提供了一个基于现有 LINQ 函数的有趣的 zip 实现：

public static IEnumerable<TResult> Zip<TFirst, TSecond, TResult>(
  this IEnumerable<TFirst> first, 
  IEnumerable<TSecond> second, 
  Func<TFirst, TSecond, TResult> func)
{
  return first.Select((x, i) => new { X = x, I = i })
    .Join(second.Select((x, i) => new { X = x, I = i }), 
    o => o.I, 
    i => i.I, 
    (o, i) => func(o.X, i.X));
}

然后你可以这样做：

  int[] s1 = new [] { 1, 2, 3 };
  int[] s2 = new[] { 4, 5, 6 };
  var result = s1.Zip(s2, (i1, i2) => new {Value1 = i1, Value2 = i2});

In a functional language you would usually find a "zip" function which will hopefully be part of a C#4.0 . Bart de Smet provides a funny implementation of zip based on existing LINQ functions:

public static IEnumerable<TResult> Zip<TFirst, TSecond, TResult>(
  this IEnumerable<TFirst> first, 
  IEnumerable<TSecond> second, 
  Func<TFirst, TSecond, TResult> func)
{
  return first.Select((x, i) => new { X = x, I = i })
    .Join(second.Select((x, i) => new { X = x, I = i }), 
    o => o.I, 
    i => i.I, 
    (o, i) => func(o.X, i.X));
}

Then you can do:

  int[] s1 = new [] { 1, 2, 3 };
  int[] s2 = new[] { 4, 5, 6 };
  var result = s1.Zip(s2, (i1, i2) => new {Value1 = i1, Value2 = i2});

回复收藏 0 原文

故笙诉离歌 2024-07-20 03:55:14

如果您确实使用数组，最好的方法可能就是使用带有索引的传统 for 循环。当然，这并不是那么好，但据我所知，.NET 并没有提供更好的方法来做到这一点。

您还可以将代码封装到名为 zip 的方法中 - 这是一个常见的高阶列表函数。然而，C#缺乏合适的Tuple类型，这是相当粗糙的。您最终会返回一个 IEnumerable> ，这不是很好。

顺便说一句，您真的使用 IEnumerable 而不是 IEnumerable 或者为什么要转换 Current 值？

回复收藏 0 原文

冷心人i 2024-07-20 03:55:14

两者都使用 IEnumerator 会很好

var currentValues = currentRow.Split(separatorChar);
using (IEnumerator<string> valueEnum = currentValues.GetEnumerator(), columnEnum = columnList.GetEnumerator()) {
    while (valueEnum.MoveNext() && columnEnum.MoveNext())
        valueMap.Add(columnEnum.Current, valueEnum.Current);
}

一个扩展方法

public static IEnumerable<TResult> Zip<T1, T2, TResult>(this IEnumerable<T1> source, IEnumerable<T2> other, Func<T1, T2, TResult> selector) {
    using (IEnumerator<T1> sourceEnum = source.GetEnumerator()) {
        using (IEnumerator<T2> otherEnum = other.GetEnumerator()) {
            while (sourceEnum.MoveNext() && columnEnum.MoveNext())
                yield return selector(sourceEnum.Current, otherEnum.Current);
        }
    }
}

或者创建

var currentValues = currentRow.Split(separatorChar);
foreach (var valueColumnPair in currentValues.Zip(columnList, (a, b) => new { Value = a, Column = b }) {
    valueMap.Add(valueColumnPair.Column, valueColumnPair.Value);
}

Use IEnumerator for both would be nice

var currentValues = currentRow.Split(separatorChar);
using (IEnumerator<string> valueEnum = currentValues.GetEnumerator(), columnEnum = columnList.GetEnumerator()) {
    while (valueEnum.MoveNext() && columnEnum.MoveNext())
        valueMap.Add(columnEnum.Current, valueEnum.Current);
}

Or create an extension methods

public static IEnumerable<TResult> Zip<T1, T2, TResult>(this IEnumerable<T1> source, IEnumerable<T2> other, Func<T1, T2, TResult> selector) {
    using (IEnumerator<T1> sourceEnum = source.GetEnumerator()) {
        using (IEnumerator<T2> otherEnum = other.GetEnumerator()) {
            while (sourceEnum.MoveNext() && columnEnum.MoveNext())
                yield return selector(sourceEnum.Current, otherEnum.Current);
        }
    }
}

Usage

var currentValues = currentRow.Split(separatorChar);
foreach (var valueColumnPair in currentValues.Zip(columnList, (a, b) => new { Value = a, Column = b }) {
    valueMap.Add(valueColumnPair.Column, valueColumnPair.Value);
}

回复收藏 0 原文

~没有更多了~