使用 LINQ 透视数据

发布于 2024-07-23 14:21:30 字数 563 浏览 3 评论 0原文

我有一个包含枚举（TypeCode）和用户对象的项目集合，我需要将其展平以显示在网格中。这很难解释，所以让我举一个简单的例子。

集合具有如下项目：

TypeCode | User 
---------------
1        | Don Smith  
1        | Mike Jones  
1        | James Ray  
2        | Tom Rizzo  
2        | Alex Homes  
3        | Andy Bates

我需要输出为：

1          | 2          | 3  
Don Smith  | Tom Rizzo  | Andy Bates  
Mike Jones | Alex Homes |  
James Ray  |            |

我尝试使用 foreach 执行此操作，但我不能这样做，因为我会将新项目插入到 foreach 中的集合中，从而导致错误。

这可以在 Linq 中以更简洁的方式完成吗？

原文

I have a collection of items that contain an Enum (TypeCode) and a User object, and I need to flatten it out to show in a grid. It's hard to explain, so let me show a quick example.

Collection has items like so:

TypeCode | User 
---------------
1        | Don Smith  
1        | Mike Jones  
1        | James Ray  
2        | Tom Rizzo  
2        | Alex Homes  
3        | Andy Bates

I need the output to be:

1          | 2          | 3  
Don Smith  | Tom Rizzo  | Andy Bates  
Mike Jones | Alex Homes |  
James Ray  |            |

I've tried doing this using foreach, but I can't do it that way because I'd be inserting new items to the collection in the foreach, causing an error.

Can this be done in Linq in a cleaner fashion?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

幽蝶幻影 2024-07-30 14:21:30

我并不是说这是一种伟大的转型方式 - 但它是一个转型......

    // sample data
    var data = new[] {
        new { Foo = 1, Bar = "Don Smith"},
        new { Foo = 1, Bar = "Mike Jones"},
        new { Foo = 1, Bar = "James Ray"},
        new { Foo = 2, Bar = "Tom Rizzo"},
        new { Foo = 2, Bar = "Alex Homes"},
        new { Foo = 3, Bar = "Andy Bates"},
    };
    // group into columns, and select the rows per column
    var grps = from d in data
              group d by d.Foo
              into grp
              select new {
                  Foo = grp.Key,
                  Bars = grp.Select(d2 => d2.Bar).ToArray()
              };

    // find the total number of (data) rows
    int rows = grps.Max(grp => grp.Bars.Length);

    // output columns
    foreach (var grp in grps) {
        Console.Write(grp.Foo + "\t");
    }
    Console.WriteLine();
    // output data
    for (int i = 0; i < rows; i++) {
        foreach (var grp in grps) {
            Console.Write((i < grp.Bars.Length ? grp.Bars[i] : null) + "\t");
        }
        Console.WriteLine();
    }

I'm not saying it is a great way to pivot - but it is a pivot...

    // sample data
    var data = new[] {
        new { Foo = 1, Bar = "Don Smith"},
        new { Foo = 1, Bar = "Mike Jones"},
        new { Foo = 1, Bar = "James Ray"},
        new { Foo = 2, Bar = "Tom Rizzo"},
        new { Foo = 2, Bar = "Alex Homes"},
        new { Foo = 3, Bar = "Andy Bates"},
    };
    // group into columns, and select the rows per column
    var grps = from d in data
              group d by d.Foo
              into grp
              select new {
                  Foo = grp.Key,
                  Bars = grp.Select(d2 => d2.Bar).ToArray()
              };

    // find the total number of (data) rows
    int rows = grps.Max(grp => grp.Bars.Length);

    // output columns
    foreach (var grp in grps) {
        Console.Write(grp.Foo + "\t");
    }
    Console.WriteLine();
    // output data
    for (int i = 0; i < rows; i++) {
        foreach (var grp in grps) {
            Console.Write((i < grp.Bars.Length ? grp.Bars[i] : null) + "\t");
        }
        Console.WriteLine();
    }

回复收藏 0 原文

陈甜 2024-07-30 14:21:30

马克的答案给出了稀疏矩阵，不能直接输入网格。
我尝试从 Vasu 提供的链接扩展代码，如下所示：

public static Dictionary<TKey1, Dictionary<TKey2, TValue>> Pivot3<TSource, TKey1, TKey2, TValue>(
    this IEnumerable<TSource> source
    , Func<TSource, TKey1> key1Selector
    , Func<TSource, TKey2> key2Selector
    , Func<IEnumerable<TSource>, TValue> aggregate)
{
    return source.GroupBy(key1Selector).Select(
        x => new
        {
            X = x.Key,
            Y = source.GroupBy(key2Selector).Select(
                z => new
                {
                    Z = z.Key,
                    V = aggregate(from item in source
                                  where key1Selector(item).Equals(x.Key)
                                  && key2Selector(item).Equals(z.Key)
                                  select item
                    )

                }
            ).ToDictionary(e => e.Z, o => o.V)
        }
    ).ToDictionary(e => e.X, o => o.Y);
} 
internal class Employee
{
    public string Name { get; set; }
    public string Department { get; set; }
    public string Function { get; set; }
    public decimal Salary { get; set; }
}
public void TestLinqExtenions()
{
    var l = new List<Employee>() {
    new Employee() { Name = "Fons", Department = "R&D", Function = "Trainer", Salary = 2000 },
    new Employee() { Name = "Jim", Department = "R&D", Function = "Trainer", Salary = 3000 },
    new Employee() { Name = "Ellen", Department = "Dev", Function = "Developer", Salary = 4000 },
    new Employee() { Name = "Mike", Department = "Dev", Function = "Consultant", Salary = 5000 },
    new Employee() { Name = "Jack", Department = "R&D", Function = "Developer", Salary = 6000 },
    new Employee() { Name = "Demy", Department = "Dev", Function = "Consultant", Salary = 2000 }};

    var result5 = l.Pivot3(emp => emp.Department, emp2 => emp2.Function, lst => lst.Sum(emp => emp.Salary));
    var result6 = l.Pivot3(emp => emp.Function, emp2 => emp2.Department, lst => lst.Count());
}

* 可以不过，不要谈论任何有关性能的事情。

Marc's answer gives sparse matrix that can't be pumped into Grid directly.
I tried to expand the code from the link provided by Vasu as below:

public static Dictionary<TKey1, Dictionary<TKey2, TValue>> Pivot3<TSource, TKey1, TKey2, TValue>(
    this IEnumerable<TSource> source
    , Func<TSource, TKey1> key1Selector
    , Func<TSource, TKey2> key2Selector
    , Func<IEnumerable<TSource>, TValue> aggregate)
{
    return source.GroupBy(key1Selector).Select(
        x => new
        {
            X = x.Key,
            Y = source.GroupBy(key2Selector).Select(
                z => new
                {
                    Z = z.Key,
                    V = aggregate(from item in source
                                  where key1Selector(item).Equals(x.Key)
                                  && key2Selector(item).Equals(z.Key)
                                  select item
                    )

                }
            ).ToDictionary(e => e.Z, o => o.V)
        }
    ).ToDictionary(e => e.X, o => o.Y);
} 
internal class Employee
{
    public string Name { get; set; }
    public string Department { get; set; }
    public string Function { get; set; }
    public decimal Salary { get; set; }
}
public void TestLinqExtenions()
{
    var l = new List<Employee>() {
    new Employee() { Name = "Fons", Department = "R&D", Function = "Trainer", Salary = 2000 },
    new Employee() { Name = "Jim", Department = "R&D", Function = "Trainer", Salary = 3000 },
    new Employee() { Name = "Ellen", Department = "Dev", Function = "Developer", Salary = 4000 },
    new Employee() { Name = "Mike", Department = "Dev", Function = "Consultant", Salary = 5000 },
    new Employee() { Name = "Jack", Department = "R&D", Function = "Developer", Salary = 6000 },
    new Employee() { Name = "Demy", Department = "Dev", Function = "Consultant", Salary = 2000 }};

    var result5 = l.Pivot3(emp => emp.Department, emp2 => emp2.Function, lst => lst.Sum(emp => emp.Salary));
    var result6 = l.Pivot3(emp => emp.Function, emp2 => emp2.Department, lst => lst.Count());
}

* can't say anything about the performance though.

回复收藏 0 原文

南街九尾狐 2024-07-30 14:21:30

我想这与马克的答案类似，但我会发布它，因为我花了一些时间来研究它。结果由 " | " 分隔，如您的示例所示。在使用 group by 时，它还使用从 LINQ 查询返回的 IGrouping 类型，而不是构造新的匿名类型。这是经过测试的工作代码。

var Items = new[] {
    new { TypeCode = 1, UserName = "Don Smith"},
    new { TypeCode = 1, UserName = "Mike Jones"},
    new { TypeCode = 1, UserName = "James Ray"},
    new { TypeCode = 2, UserName = "Tom Rizzo"},
    new { TypeCode = 2, UserName = "Alex Homes"},
    new { TypeCode = 3, UserName = "Andy Bates"}
};
var Columns = from i in Items
              group i.UserName by i.TypeCode;
Dictionary<int, List<string>> Rows = new Dictionary<int, List<string>>();
int RowCount = Columns.Max(g => g.Count());
for (int i = 0; i <= RowCount; i++) // Row 0 is the header row.
{
    Rows.Add(i, new List<string>());
}
int RowIndex;
foreach (IGrouping<int, string> c in Columns)
{
    Rows[0].Add(c.Key.ToString());
    RowIndex = 1;
    foreach (string user in c)
    {
        Rows[RowIndex].Add(user);
        RowIndex++;
    }
    for (int r = RowIndex; r <= Columns.Count(); r++)
    {
        Rows[r].Add(string.Empty);
    }
}
foreach (List<string> row in Rows.Values)
{
    Console.WriteLine(row.Aggregate((current, next) => current + " | " + next));
}
Console.ReadLine();

我还使用以下输入对其进行了测试：

var Items = new[] {
    new { TypeCode = 1, UserName = "Don Smith"},
    new { TypeCode = 3, UserName = "Mike Jones"},
    new { TypeCode = 3, UserName = "James Ray"},
    new { TypeCode = 2, UserName = "Tom Rizzo"},
    new { TypeCode = 2, UserName = "Alex Homes"},
    new { TypeCode = 3, UserName = "Andy Bates"}
};

产生了以下结果，表明第一列不需要包含最长的列表。如果需要，您可以使用 OrderBy 获取按 TypeCode 排序的列。

1         | 3          | 2
Don Smith | Mike Jones | Tom Rizzo
          | James Ray  | Alex Homes
          | Andy Bates |

I guess this is similar to Marc's answer, but I'll post it since I spent some time working on it. The results are separated by " | " as in your example. It also uses the IGrouping<int, string> type returned from the LINQ query when using a group by instead of constructing a new anonymous type. This is tested, working code.

var Items = new[] {
    new { TypeCode = 1, UserName = "Don Smith"},
    new { TypeCode = 1, UserName = "Mike Jones"},
    new { TypeCode = 1, UserName = "James Ray"},
    new { TypeCode = 2, UserName = "Tom Rizzo"},
    new { TypeCode = 2, UserName = "Alex Homes"},
    new { TypeCode = 3, UserName = "Andy Bates"}
};
var Columns = from i in Items
              group i.UserName by i.TypeCode;
Dictionary<int, List<string>> Rows = new Dictionary<int, List<string>>();
int RowCount = Columns.Max(g => g.Count());
for (int i = 0; i <= RowCount; i++) // Row 0 is the header row.
{
    Rows.Add(i, new List<string>());
}
int RowIndex;
foreach (IGrouping<int, string> c in Columns)
{
    Rows[0].Add(c.Key.ToString());
    RowIndex = 1;
    foreach (string user in c)
    {
        Rows[RowIndex].Add(user);
        RowIndex++;
    }
    for (int r = RowIndex; r <= Columns.Count(); r++)
    {
        Rows[r].Add(string.Empty);
    }
}
foreach (List<string> row in Rows.Values)
{
    Console.WriteLine(row.Aggregate((current, next) => current + " | " + next));
}
Console.ReadLine();

I also tested it with this input:

var Items = new[] {
    new { TypeCode = 1, UserName = "Don Smith"},
    new { TypeCode = 3, UserName = "Mike Jones"},
    new { TypeCode = 3, UserName = "James Ray"},
    new { TypeCode = 2, UserName = "Tom Rizzo"},
    new { TypeCode = 2, UserName = "Alex Homes"},
    new { TypeCode = 3, UserName = "Andy Bates"}
};

Which produced the following results showing that the first column doesn't need to contain the longest list. You could use OrderBy to get the columns ordered by TypeCode if needed.

1         | 3          | 2
Don Smith | Mike Jones | Tom Rizzo
          | James Ray  | Alex Homes
          | Andy Bates |

回复收藏 0 原文

陈年往事 2024-07-30 14:21:30

您可以使用 Linq 的 .ToLookup 以您要查找的方式进行分组。

var lookup = data.ToLookup(d => d.TypeCode, d => d.User);

然后就是将其转化为消费者可以理解的形式的问题。例如：

//Warning: untested code
var enumerators = lookup.Select(g => g.GetEnumerator()).ToList();
int columns = enumerators.Count;
while(columns > 0)
{
  for(int i = 0; i < enumerators.Count; ++i)
  {
    var enumerator = enumerators[i];
    if(enumator == null) continue;
    if(!enumerator.MoveNext())
    { 
      --columns;
      enumerators[i] = null;
    }
  }
  yield return enumerators.Select(e => (e != null) ? e.Current : null);
}

将其放入 IEnumerable<> 中方法，它（可能）会返回 User 集合（列）的集合（行），其中 null 被放入没有数据的列中。

You can use Linq's .ToLookup to group in the manner you are looking for.

var lookup = data.ToLookup(d => d.TypeCode, d => d.User);

Then it's a matter of putting it into a form that your consumer can make sense of. For instance:

//Warning: untested code
var enumerators = lookup.Select(g => g.GetEnumerator()).ToList();
int columns = enumerators.Count;
while(columns > 0)
{
  for(int i = 0; i < enumerators.Count; ++i)
  {
    var enumerator = enumerators[i];
    if(enumator == null) continue;
    if(!enumerator.MoveNext())
    { 
      --columns;
      enumerators[i] = null;
    }
  }
  yield return enumerators.Select(e => (e != null) ? e.Current : null);
}

Put that in an IEnumerable<> method and it will (probably) return a collection (rows) of collections (column) of User where a null is put in a column that has no data.

回复收藏 0 原文

蓬勃野心 2024-07-30 14:21:30

@Sanjaya.Tio 我对你的答案很感兴趣，并创建了这个适应，最大限度地减少了 keySelector 的执行。（未经测试）

public static Dictionary<TKey1, Dictionary<TKey2, TValue>> Pivot3<TSource, TKey1, TKey2, TValue>(
    this IEnumerable<TSource> source
    , Func<TSource, TKey1> key1Selector
    , Func<TSource, TKey2> key2Selector
    , Func<IEnumerable<TSource>, TValue> aggregate)
{
  var lookup = source.ToLookup(x => new {Key1 = key1Selector(x), Key2 = key2Selector(x)});

  List<TKey1> key1s = lookup.Select(g => g.Key.Key1).Distinct().ToList();
  List<TKey2> key2s = lookup.Select(g => g.Key.Key2).Distinct().ToList();

  var resultQuery =
    from key1 in key1s
    from key2 in key2s
    let lookupKey = new {Key1 = key1, Key2 = key2}
    let g = lookup[lookupKey]
    let resultValue = g.Any() ? aggregate(g) : default(TValue)
    select new {Key1 = key1, Key2 = key2, ResultValue = resultValue};

  Dictionary<TKey1, Dictionary<TKey2, TValue>> result = new Dictionary<TKey1, Dictionary<TKey2, TValue>>();
  foreach(var resultItem in resultQuery)
  {
    TKey1 key1 = resultItem.Key1;
    TKey2 key2 = resultItem.Key2;
    TValue resultValue = resultItem.ResultValue;

    if (!result.ContainsKey(key1))
    {
      result[key1] = new Dictionary<TKey2, TValue>();
    }
    var subDictionary = result[key1];
    subDictionary[key2] = resultValue; 
  }
  return result;
}

@Sanjaya.Tio I was intrigued by your answer and created this adaptation which minimizes keySelector execution. (untested)

public static Dictionary<TKey1, Dictionary<TKey2, TValue>> Pivot3<TSource, TKey1, TKey2, TValue>(
    this IEnumerable<TSource> source
    , Func<TSource, TKey1> key1Selector
    , Func<TSource, TKey2> key2Selector
    , Func<IEnumerable<TSource>, TValue> aggregate)
{
  var lookup = source.ToLookup(x => new {Key1 = key1Selector(x), Key2 = key2Selector(x)});

  List<TKey1> key1s = lookup.Select(g => g.Key.Key1).Distinct().ToList();
  List<TKey2> key2s = lookup.Select(g => g.Key.Key2).Distinct().ToList();

  var resultQuery =
    from key1 in key1s
    from key2 in key2s
    let lookupKey = new {Key1 = key1, Key2 = key2}
    let g = lookup[lookupKey]
    let resultValue = g.Any() ? aggregate(g) : default(TValue)
    select new {Key1 = key1, Key2 = key2, ResultValue = resultValue};

  Dictionary<TKey1, Dictionary<TKey2, TValue>> result = new Dictionary<TKey1, Dictionary<TKey2, TValue>>();
  foreach(var resultItem in resultQuery)
  {
    TKey1 key1 = resultItem.Key1;
    TKey2 key2 = resultItem.Key2;
    TValue resultValue = resultItem.ResultValue;

    if (!result.ContainsKey(key1))
    {
      result[key1] = new Dictionary<TKey2, TValue>();
    }
    var subDictionary = result[key1];
    subDictionary[key2] = resultValue; 
  }
  return result;
}

回复收藏 0 原文

~没有更多了~