LINQ GroupBy 连续时间

发布于 2024-12-15 05:27:15 字数 1058 浏览 4 评论 0原文

假设我有一个如下所示的简单结构：

public class Range
{
    public DateTime Start { get; set; }
    public DateTime End { get; set; }

    public Range(DateTime start, DateTime end)
    {
        this.Start = start;
        this.End = end;
    }
}

我创建了一个像这样的集合：

var dr1 = new Range(new DateTime(2011, 11, 1, 12, 0, 0), 
    new DateTime(2011, 11, 1, 13, 0, 0));
var dr2 = new Range(new DateTime(2011, 11, 1, 13, 0, 0), 
    new DateTime(2011, 11, 1, 14, 0, 0));
var dr3 = new Range(new DateTime(2011, 11, 1, 14, 0, 0), 
    new DateTime(2011, 11, 1, 15, 0, 0));
var dr4 = new Range(new DateTime(2011, 11, 1, 16, 0, 0), 
    new DateTime(2011, 11, 1, 17, 0, 0));

var ranges = new List<Range>() { dr1, dr2, dr3, dr4 };

我想要做的是将连续的范围进行分组 - 即，如果前一个范围的结束值与下一个范围相同，则它们是连续的下一个开始。

我们可以假设范围值不存在冲突/重复或重叠。

在发布的示例中，我最终会得到两个组：

2011-11-1 12:00:00 - 2011-11-1 15:00:00

2011-11-1 16:00:00 - 2011-11-1 17:00:00

为此提出一个迭代解决方案相当容易。但是我可以使用一些 LINQ 魔法来以漂亮的单行代码实现这一点吗？

原文

Assuming I have a simple structure that looks like this:

public class Range
{
    public DateTime Start { get; set; }
    public DateTime End { get; set; }

    public Range(DateTime start, DateTime end)
    {
        this.Start = start;
        this.End = end;
    }
}

And I create an collection like so:

var dr1 = new Range(new DateTime(2011, 11, 1, 12, 0, 0), 
    new DateTime(2011, 11, 1, 13, 0, 0));
var dr2 = new Range(new DateTime(2011, 11, 1, 13, 0, 0), 
    new DateTime(2011, 11, 1, 14, 0, 0));
var dr3 = new Range(new DateTime(2011, 11, 1, 14, 0, 0), 
    new DateTime(2011, 11, 1, 15, 0, 0));
var dr4 = new Range(new DateTime(2011, 11, 1, 16, 0, 0), 
    new DateTime(2011, 11, 1, 17, 0, 0));

var ranges = new List<Range>() { dr1, dr2, dr3, dr4 };

What I want to do is group the ranges where they are continuous - i.e. they are continuous if the End value of the previous Range is the same as the Start of the next.

We can assume that there are no collisions/duplicates or overlaps in the Range values.

In the example posted, I would end up with two groups:

2011-11-1 12:00:00 - 2011-11-1 15:00:00

2011-11-1 16:00:00 - 2011-11-1 17:00:00

It's fairly easy to come up with an iterative solution for this. But is there some LINQ magic I can use to get this in a pretty one-liner?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

三五鸿雁 2024-12-22 05:27:15

您最好的选择是使用 yield 和扩展方法：

static IEnumerable<Range> GroupContinuous(this IEnumerable<Range> ranges)
{
    // Validate parameters.

    // Can order by start date, no overlaps, no collisions
    ranges = ranges.OrderBy(r => r.Start);

    // Get the enumerator.
    using (IEnumerator<Range> enumerator = ranges.GetEnumerator();
    {
        // Move to the first item, if nothing, break.
        if (!enumerator.MoveNext()) yield break;

        // Set the previous range.
        Range previous = enumerator.Current;

        // Cycle while there are more items.
        while (enumerator.MoveNext())
        {
            // Get the current item.
            Range current = enumerator.Current;

            // If the start date is equal to the end date
            // then merge with the previous and continue.
            if (current.Start == previous.End)
            {
                // Merge.
                previous = new Range(previous.Start, current.End);

                // Continue.
                continue;
            }

            // Yield the previous item.
            yield return previous;

            // The previous item is the current item.
            previous = current;
        }

        // Yield the previous item.
        yield return previous;
    }
}

授予，对 OrderBy 的调用正在进行引起充分ranges 序列的迭代，但这是无法避免的。一旦您订购了它，您就可以避免在返回结果之前必须实现结果；如果条件允许，您只需产生结果即可。

但是，如果您知道序列是有序的，那么您根本不必调用 OrderBy，并且可以在遍历列表并中断时yield 项不同的折叠 Range 实例。

最终，如果序列是无序的，那么您有两个选择：

对列表进行排序，然后进行处理（请记住，OrderBy 也会延迟，但必须使用一次完整迭代来对序列进行排序），当您有一个要处理的项目时，使用 yield 返回该项目
立即处理整个序列并作为整个物化序列返回

Your best bet is to use yield and an extension method:

static IEnumerable<Range> GroupContinuous(this IEnumerable<Range> ranges)
{
    // Validate parameters.

    // Can order by start date, no overlaps, no collisions
    ranges = ranges.OrderBy(r => r.Start);

    // Get the enumerator.
    using (IEnumerator<Range> enumerator = ranges.GetEnumerator();
    {
        // Move to the first item, if nothing, break.
        if (!enumerator.MoveNext()) yield break;

        // Set the previous range.
        Range previous = enumerator.Current;

        // Cycle while there are more items.
        while (enumerator.MoveNext())
        {
            // Get the current item.
            Range current = enumerator.Current;

            // If the start date is equal to the end date
            // then merge with the previous and continue.
            if (current.Start == previous.End)
            {
                // Merge.
                previous = new Range(previous.Start, current.End);

                // Continue.
                continue;
            }

            // Yield the previous item.
            yield return previous;

            // The previous item is the current item.
            previous = current;
        }

        // Yield the previous item.
        yield return previous;
    }
}

Granted, the call to OrderBy is going to cause a full iteration of the ranges sequence, but there's no avoiding that. Once you have it ordered, you can prevent having to materialize your results before returning them; you simply yield the results if the conditions dictate.

If you know that the sequence is ordered, however, then you don't have to call OrderBy at all, and you can yield items as you traverse the list and break on different collapsed Range instances.

Ultimately, if the sequence is unordered, then you have two options:

Order the list and then process (remember, OrderBy is deferred as well, but will have to use one full iteration to order the sequence), using yield to return the item when you have one to process
Process the entire sequence at once and return as an entire materialized sequence

回复收藏 0 原文

心房敞 2024-12-22 05:27:15

casperOne 扩展方法的通用版本，如下使用：

var items = new[]
    {
        // Range 1
        new { A = 0, B = 1 },
        new { A = 1, B = 2 },
        new { A = 2, B = 3 },
        new { A = 3, B = 4 },
        // Range 2
        new { A = 5, B = 6 },
        new { A = 6, B = 7 },
        new { A = 7, B = 8 },
        new { A = 8, B = 9 },
    };

var ranges = items.ContinousRange(
    x => x.A,
    x => x.B,
    (lower, upper) => new { A = lower, B = upper });

foreach(var range in ranges)
{
    Console.WriteLine("{0} - {1}", range.A, range.B);
}

扩展方法的实现

    /// <summary>
    /// Calculates continues ranges based on individual elements lower and upper selections. Cannot compensate for overlapping.
    /// </summary>
    /// <typeparam name="T">The type containing a range</typeparam>
    /// <typeparam name="T1">The type of range values</typeparam>
    /// <param name="source">The ranges to be combined</param>
    /// <param name="lowerSelector">The range's lower bound</param>
    /// <param name="upperSelector">The range's upper bound</param>
    /// <param name="factory">A factory to create a new range</param>
    /// <returns>An enumeration of continuous ranges</returns>
    public static IEnumerable<T> ContinousRange<T, T1>(this IEnumerable<T> source, Func<T, T1> lowerSelector, Func<T, T1> upperSelector, Func<T1, T1, T> factory)
    {
        //Validate parameters

        // Can order by start date, no overlaps, no collisions
        source = source.OrderBy(lowerSelector);

        // Get enumerator
        using(var enumerator = source.GetEnumerator())
        {
            // Move to the first item, if nothing, break.
            if (!enumerator.MoveNext()) yield break;

            // Set the previous range.
            var previous = enumerator.Current;

            // Cycle while there are more items
            while(enumerator.MoveNext())
            {
                // Get the current item.
                var current = enumerator.Current;

                // If the start date is equal to the end date
                // then merge with the previoud and continue
                if (lowerSelector(current).Equals(upperSelector(previous)))
                {
                    // Merge
                    previous = factory(lowerSelector(previous), upperSelector(current));

                    // Continue
                    continue;
                }

                // Yield the previous item.
                yield return previous;

                // The previous item is the current item.
                previous = current;
            }

            // Yield the previous item.
            yield return previous;
        }
    }

A generic version of casperOne's extension method, used as such:

var items = new[]
    {
        // Range 1
        new { A = 0, B = 1 },
        new { A = 1, B = 2 },
        new { A = 2, B = 3 },
        new { A = 3, B = 4 },
        // Range 2
        new { A = 5, B = 6 },
        new { A = 6, B = 7 },
        new { A = 7, B = 8 },
        new { A = 8, B = 9 },
    };

var ranges = items.ContinousRange(
    x => x.A,
    x => x.B,
    (lower, upper) => new { A = lower, B = upper });

foreach(var range in ranges)
{
    Console.WriteLine("{0} - {1}", range.A, range.B);
}

Implementation of extension method

    /// <summary>
    /// Calculates continues ranges based on individual elements lower and upper selections. Cannot compensate for overlapping.
    /// </summary>
    /// <typeparam name="T">The type containing a range</typeparam>
    /// <typeparam name="T1">The type of range values</typeparam>
    /// <param name="source">The ranges to be combined</param>
    /// <param name="lowerSelector">The range's lower bound</param>
    /// <param name="upperSelector">The range's upper bound</param>
    /// <param name="factory">A factory to create a new range</param>
    /// <returns>An enumeration of continuous ranges</returns>
    public static IEnumerable<T> ContinousRange<T, T1>(this IEnumerable<T> source, Func<T, T1> lowerSelector, Func<T, T1> upperSelector, Func<T1, T1, T> factory)
    {
        //Validate parameters

        // Can order by start date, no overlaps, no collisions
        source = source.OrderBy(lowerSelector);

        // Get enumerator
        using(var enumerator = source.GetEnumerator())
        {
            // Move to the first item, if nothing, break.
            if (!enumerator.MoveNext()) yield break;

            // Set the previous range.
            var previous = enumerator.Current;

            // Cycle while there are more items
            while(enumerator.MoveNext())
            {
                // Get the current item.
                var current = enumerator.Current;

                // If the start date is equal to the end date
                // then merge with the previoud and continue
                if (lowerSelector(current).Equals(upperSelector(previous)))
                {
                    // Merge
                    previous = factory(lowerSelector(previous), upperSelector(current));

                    // Continue
                    continue;
                }

                // Yield the previous item.
                yield return previous;

                // The previous item is the current item.
                previous = current;
            }

            // Yield the previous item.
            yield return previous;
        }
    }

回复收藏 0 原文

相权↑美人 2024-12-22 05:27:15

您可以使用 Aggregate() 方法和 lambda 将它们组合在一起。正如您所说，假设没有重复或重叠：

// build your set of continuous ranges for results
List<Range> continuousRanges = new List<Range>();

ranges.Aggregate(continuousRanges, (con, next) => {
{
    // pull last item (or default if none) - O(1) for List<T>
    var last = continuousRanges.FirstOrDefault(r => r.End == next.Start);

    if (last != null)
        last.End = next.End;
    else
        con.Add(next);

    return con;
});

现在，如果您知道范围是有序的，则可以始终将下一个与我们处理的最后一个进行比较，如下所示：

// build your set of continuous ranges for results
List<Range> continuousRanges = new List<Range>();

ranges.Aggregate(continuousRanges, (con, next) => {
{
    // pull last item (or default if none) - O(1) for List<T>
    var last = continuousRanges.LastOrDefault();

    if (last != null && last.End == next.Start)
        last.End = next.End;
    else
        con.Add(next);

    return con;
});

You could use the Aggregate() method and a lambda to group them together. This is, as you say, assuming no duplicates or overlaps:

// build your set of continuous ranges for results
List<Range> continuousRanges = new List<Range>();

ranges.Aggregate(continuousRanges, (con, next) => {
{
    // pull last item (or default if none) - O(1) for List<T>
    var last = continuousRanges.FirstOrDefault(r => r.End == next.Start);

    if (last != null)
        last.End = next.End;
    else
        con.Add(next);

    return con;
});

Now, if you know the ranges are ordered, you could get away with always comparing the next against the last one we processed, like so:

// build your set of continuous ranges for results
List<Range> continuousRanges = new List<Range>();

ranges.Aggregate(continuousRanges, (con, next) => {
{
    // pull last item (or default if none) - O(1) for List<T>
    var last = continuousRanges.LastOrDefault();

    if (last != null && last.End == next.Start)
        last.End = next.End;
    else
        con.Add(next);

    return con;
});

回复收藏 0 原文

情仇皆在手 2024-12-22 05:27:15

这是另一个 LINQ 解决方案。它使用一个查询找到每个连续范围的开头，使用另一个查询找到每个连续范围的结尾，然后遍历这些对来构造新的范围。

var starts = ranges.Where((r, i) => i == 0 || r.Start != ranges[i - 1].End);
var ends = ranges.Where((r, i) => i == ranges.Count - 1 || r.End != ranges[i + 1].Start);
var result = starts.Zip(ends, (s, e) => new Range(s.Start, e.End));

它可以重写为一行代码，但单独的版本更清晰且更易于维护：

var result = ranges.Where((r, i) => i == 0 || r.Start != ranges[i - 1].End).Zip(ranges.Where((r, i) => i == ranges.Count - 1 || r.End != ranges[i + 1].Start), (s, e) => new Range(s.Start, e.End));

Here is another LINQ solution. It finds the start of each continuous range with one query, the end of each continuous range with another, and then goes through the pairs to construct new ranges.

var starts = ranges.Where((r, i) => i == 0 || r.Start != ranges[i - 1].End);
var ends = ranges.Where((r, i) => i == ranges.Count - 1 || r.End != ranges[i + 1].Start);
var result = starts.Zip(ends, (s, e) => new Range(s.Start, e.End));

It could be rewritten into a one-liner, but the separate version is clearer and easier to maintain:

var result = ranges.Where((r, i) => i == 0 || r.Start != ranges[i - 1].End).Zip(ranges.Where((r, i) => i == ranges.Count - 1 || r.End != ranges[i + 1].Start), (s, e) => new Range(s.Start, e.End));

回复收藏 0 原文

猫腻 2024-12-22 05:27:15

以下内容有效，但实际上是对 LINQ 的滥用：

// Dummy at the end to get the last continues range
ranges.Add(new Range(default(DateTime), default(DateTime)));
// The previous element in the list
Range previous = ranges.FirstOrDefault();
// The start element of the current continuous range
Range start = previous;
ranges.Skip(1).Select(x => {var result = new {current = x, previous = previous};
                            previous = x; return result;})
              .Where(x => x.previous.End != x.current.Start)
              .Select(x => { var result = new Range(start.Start, x.previous.End);
                             start = x.current; return result; });

该代码执行以下操作：

首先选择：

将当前值和当前前一个值保存在新匿名类型的实例中
将前一个值设置为下一次迭代的当前值

其中：仅选择那些标记新连续范围开始的元素

第二次选择：

使用保存的开始值的开始日期和上一项的结束值创建一个 Range 对象。
将起始值设置为当前项目，因为它标志着新的连续范围的开始。
返回在(1)中创建的范围

请注意：
我会坚持使用迭代解决方案，因为上面的代码不可读，不可维护，并且它比仅仅输入一个循环花费了我更多的时间一个如果...

The following works but it really is an abuse of LINQ:

// Dummy at the end to get the last continues range
ranges.Add(new Range(default(DateTime), default(DateTime)));
// The previous element in the list
Range previous = ranges.FirstOrDefault();
// The start element of the current continuous range
Range start = previous;
ranges.Skip(1).Select(x => {var result = new {current = x, previous = previous};
                            previous = x; return result;})
              .Where(x => x.previous.End != x.current.Start)
              .Select(x => { var result = new Range(start.Start, x.previous.End);
                             start = x.current; return result; });

The code does the following:

First select:

Save the current value and the current previous value in an instance of a new anonymous type
Set the previous value to the current value for the next iteration

Where: Only select those elements that mark the start of a new continuous range

Second select:

Create a Range object with the start date of the saved start value and the end value of the previous item.
Set the start value to the current item as it marks the start of a new continuous range.
Return the range created in (1)

Please note:
I would stick with the iterative solution, because the above code is unreadable, unmaintainable and it took me significantly more time than just typing a loop and an if...

回复收藏 0 原文

命比纸薄 2024-12-22 05:27:15

下面的代码以查询理解语法执行此操作。

public static List<Range> Group(List<Range> dates){
    if(dates.Any()){
        var previous = dates.FirstOrDefault();
        previous = new Range(previous.Start,previous.Start);
        var last = dates.Last();
        var result = (from dt in dates
                     let isDone = dt.Start != previous.End
                     let prev = isDone || last == dt ? 
                                                previous :  
                                               (previous = new Range(previous.Start,dt.End))
                     where isDone || last == dt
                     let t = (previous = dt)
                     select prev).ToList();
        if(result.Last().End != last.End)
           result.Add(last);
        return result;
    }
    return Enumerable.Empty<Range>().ToList();
}

我认为我实际上不会在生产代码中执行此操作，因为我觉得它违反了最小惊喜规则。 linq 语句通常没有副作用，而这是有效的，因为它有副作用。然而我认为值得发帖来表明它确实可以使用 O(n) 中的查询理解语法来解决

The below code does it in query comprehension syntax.

public static List<Range> Group(List<Range> dates){
    if(dates.Any()){
        var previous = dates.FirstOrDefault();
        previous = new Range(previous.Start,previous.Start);
        var last = dates.Last();
        var result = (from dt in dates
                     let isDone = dt.Start != previous.End
                     let prev = isDone || last == dt ? 
                                                previous :  
                                               (previous = new Range(previous.Start,dt.End))
                     where isDone || last == dt
                     let t = (previous = dt)
                     select prev).ToList();
        if(result.Last().End != last.End)
           result.Add(last);
        return result;
    }
    return Enumerable.Empty<Range>().ToList();
}

I don't think I would actually do this in production code because I feel it violates the rule of least surprise. Where linq statements usually has no side effects, this works because it has side effects. However I thought it worthwhile to post to show that indeed it can be solved using query comprehension syntax in O(n)

回复收藏 0 原文

你是我的挚爱i 2024-12-22 05:27:15

        var ranges = new List<Range>() { dr1, dr2, dr3, dr4 };
        var starts = ranges.Select(p => p.Start);
        var ends = ranges.Select(p => p.End);
        var discreet = starts.Union(ends).Except(starts.Intersect(ends)).OrderBy(p => p).ToList();
        List<Range> result = new List<Range>();
        for (int i = 0; i < discreet.Count; i = i + 2)
        {
            result.Add(new Range(discreet[i], discreet[i + 1]));
        }
        return result;

        var ranges = new List<Range>() { dr1, dr2, dr3, dr4 };
        var starts = ranges.Select(p => p.Start);
        var ends = ranges.Select(p => p.End);
        var discreet = starts.Union(ends).Except(starts.Intersect(ends)).OrderBy(p => p).ToList();
        List<Range> result = new List<Range>();
        for (int i = 0; i < discreet.Count; i = i + 2)
        {
            result.Add(new Range(discreet[i], discreet[i + 1]));
        }
        return result;

回复收藏 0 原文

~没有更多了~