Use LINQ to group a sequence by date with no gaps

Posted on 2024-09-04 12:16:45

I'm trying to select a subgroup of a list where items have contiguous dates, e.g.

ID  StaffID  Title              ActivityDate
--  -------  -----------------  ------------
 1       41  Meeting with John    03/06/2010
 2       41  Meeting with John    08/06/2010
 3       41  Meeting Continues    09/06/2010
 4       41  Meeting Continues    10/06/2010
 5       41  Meeting with Kay     14/06/2010
 6       41  Meeting Continues    15/06/2010

I use a pivot item each time. Taking item 3 as the example pivot, I'd like to get the following contiguous events around the pivot:

ID  StaffID  Title              ActivityDate
--  -------  -----------------  ------------
 2       41  Meeting with John    08/06/2010
 3       41  Meeting Continues    09/06/2010
 4       41  Meeting Continues    10/06/2010

My current implementation is a laborious "walk" into the past, then into the future, to build the list:

var activity = // item number 3: Meeting Continues (09/06/2010)

var orderedEvents = activities.OrderBy(a => a.ActivityDate).ToArray();

// Walk into the past until a gap is found
var precedingEvents = orderedEvents.TakeWhile(a => a.ID != activity.ID);
DateTime dayBefore;
var previousEvent = activity;
while (previousEvent != null)
{
    dayBefore = previousEvent.ActivityDate.AddDays(-1).Date;
    previousEvent = precedingEvents.TakeWhile(a => a.ID != previousEvent.ID).LastOrDefault();
    if (previousEvent != null)
    {
        if (previousEvent.ActivityDate.Date == dayBefore)
            relatedActivities.Insert(0, previousEvent);
        else
            previousEvent = null;
    }
}


// Walk into the future until a gap is found
var followingEvents = orderedEvents.SkipWhile(a => a.ID != activity.ID);
DateTime dayAfter;
var nextEvent = activity;
while (nextEvent != null)
{
    dayAfter = nextEvent.ActivityDate.AddDays(1).Date;
    nextEvent = followingEvents.SkipWhile(a => a.ID != nextEvent.ID).Skip(1).FirstOrDefault();
    if (nextEvent != null)
    {
        if (nextEvent.ActivityDate.Date == dayAfter)
            relatedActivities.Add(nextEvent);
        else
            nextEvent = null;
    }
}

The list relatedActivities should then contain the contiguous events, in order.

Is there a better way (maybe using LINQ) for this?

I had an idea of using .Aggregate() but couldn't think how to get the aggregate to break out when it finds a gap in the sequence.

Comments (3)

小草泠泠 2024-09-11 12:16:45

Here's an implementation:

public static IEnumerable<IGrouping<int, T>> GroupByContiguous<T>(
  this IEnumerable<T> source,
  Func<T, int> keySelector
)
{
   // keyGroup is the key of the group being built; currentGroupValue is the last key seen.
   int keyGroup = Int32.MinValue;
   int currentGroupValue = Int32.MinValue;
   return source
     .Select(t => new { obj = t, key = keySelector(t) })
     .OrderBy(x => x.key)
     .GroupBy(x => {
       // A jump of more than one starts a new group.
       if (currentGroupValue + 1 < x.key)
       {
         keyGroup = x.key;
       }
       currentGroupValue = x.key;
       return keyGroup;
     }, x => x.obj);
}

You can either convert the dates to ints by subtraction (for example, the number of days since a fixed reference date), or write a DateTime version of the same method, which is straightforward.
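As a rough sketch of the "convert the dates to ints by subtraction" route: the block below repeats GroupByContiguous so that it compiles on its own, and maps each date to a day number before grouping. The Activity record, the RunContaining helper and the reference date of 2000-01-01 are assumptions made for illustration; they are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data type mirroring the columns in the question.
public record Activity(int ID, int StaffID, string Title, DateTime ActivityDate);

public static class ContiguousDates
{
    // A new group starts whenever the key jumps by more than one.
    public static IEnumerable<IGrouping<int, T>> GroupByContiguous<T>(
        this IEnumerable<T> source, Func<T, int> keySelector)
    {
        int keyGroup = int.MinValue;
        int currentGroupValue = int.MinValue;
        return source
            .Select(t => new { obj = t, key = keySelector(t) })
            .OrderBy(x => x.key)
            .GroupBy(x =>
            {
                if (currentGroupValue + 1 < x.key)
                    keyGroup = x.key;
                currentGroupValue = x.key;
                return keyGroup;
            }, x => x.obj);
    }

    // Hypothetical helper: returns the contiguous run that contains the pivot.
    public static List<Activity> RunContaining(IEnumerable<Activity> activities, int pivotId)
    {
        // Arbitrary reference date; only differences between day numbers matter.
        DateTime epoch = new DateTime(2000, 1, 1);
        Func<Activity, int> dayNumber = a => (a.ActivityDate.Date - epoch).Days;

        // The grouping lambda carries state, so the query is enumerated exactly once here.
        var run = activities
            .GroupByContiguous(dayNumber)
            .FirstOrDefault(g => g.Any(a => a.ID == pivotId));

        return run?.ToList() ?? new List<Activity>();
    }
}
```

With the six sample rows from the question and a pivot ID of 3, RunContaining returns the items with IDs 2, 3 and 4. Because the key selector inside GroupBy mutates captured variables, the result of GroupByContiguous should be consumed once rather than re-enumerated.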

放肆 2024-09-11 12:16:45

In this case I think that a standard foreach loop is probably more readable than a LINQ query:

var relatedActivities = new List<TActivity>();
bool found = false;

foreach (var item in activities.OrderBy(a => a.ActivityDate))
{
    int count = relatedActivities.Count;
    if ((count > 0) && (relatedActivities[count - 1].ActivityDate.Date.AddDays(1) != item.ActivityDate.Date))
    {
        if (found)
            break;

        relatedActivities.Clear();
    }

    relatedActivities.Add(item);
    if (item.ID == activity.ID)
        found = true;
}

if (!found)
    relatedActivities.Clear();

For what it's worth, here's a roughly equivalent -- and far less readable -- LINQ query:

var relatedActivities = activities
    .OrderBy(x => x.ActivityDate)
    .Aggregate
    (
        new { List = new List<TActivity>(), Found = false, ShortCircuit = false },
        (a, x) =>
        {
            if (a.ShortCircuit)
                return a;

            int count = a.List.Count;
            if ((count > 0) && (a.List[count - 1].ActivityDate.Date.AddDays(1) != x.ActivityDate.Date))
            {
                if (a.Found)
                    return new { a.List, a.Found, ShortCircuit = true };

                a.List.Clear();
            }

            a.List.Add(x);
            return new { a.List, Found = a.Found || (x.ID == activity.ID), a.ShortCircuit };
        },
        a => a.Found ? a.List : new List<TActivity>()
    );
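
One detail of the loop above is that two activities on the same calendar day count as a gap, because the check requires the next date to be exactly one day later. The question doesn't say how same-day activities should be handled, so the following is only a sketch of a variant that assumes they belong to the same run. The Activity record and the ContiguousRun method name are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data type mirroring the columns in the question.
public record Activity(int ID, int StaffID, string Title, DateTime ActivityDate);

public static class ContiguousRunFinder
{
    // Assumption: a difference of 0 or 1 calendar days keeps the run going;
    // anything larger is a gap.
    public static List<Activity> ContiguousRun(IEnumerable<Activity> activities, int pivotId)
    {
        var run = new List<Activity>();
        bool found = false;

        foreach (var item in activities.OrderBy(a => a.ActivityDate))
        {
            if (run.Count > 0)
            {
                var last = run[run.Count - 1];
                int gapInDays = (item.ActivityDate.Date - last.ActivityDate.Date).Days;
                if (gapInDays > 1)
                {
                    if (found)
                        break;      // the run containing the pivot has ended

                    run.Clear();    // this run did not contain the pivot; start over
                }
            }

            run.Add(item);
            if (item.ID == pivotId)
                found = true;
        }

        return found ? run : new List<Activity>();
    }
}
```

On the sample data from the question this returns the same three items (IDs 2, 3 and 4) for pivot 3; it differs from the loop above only when several activities share a date.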

岁吢 2024-09-11 12:16:45

Somehow, I don't think LINQ was truly meant to be used for bidirectional-one-dimensional-depth-first-searches, but I constructed a working LINQ query using Aggregate. For this example I'm going to use a List instead of an array. Also, I'm going to use Activity to refer to whatever class you are storing the data in. Replace it with whatever is appropriate for your code.

Before we even start, we need a small helper function. List<T>.Add(T) returns void, but we want to accumulate into a list and return that list from the aggregate function. So all you need is a simple function like the following.

private List<T> ListWithAdd<T>(List<T> src, T obj)
{
    src.Add(obj);
    return src;
}

First, we get the sorted list of all activities, and then initialize the list of related activities. This initial list will contain the target activity only, to start.

List<Activity> orderedEvents = activities.OrderBy(a => a.ActivityDate).ToList();
List<Activity> relatedActivities = new List<Activity>();
relatedActivities.Add(activity);

We have to break this into two lists, the past and the future, just like you currently do.

We'll start with the past; the construction should look mostly familiar. Then we'll aggregate all of it into relatedActivities. This uses the ListWithAdd function we wrote earlier. You could condense it into one line and skip declaring previousEvents as its own variable, but I kept it separate for this example.

var previousEvents = orderedEvents.TakeWhile(a => a.ID != activity.ID).Reverse();
relatedActivities = previousEvents.Aggregate<Activity, List<Activity>>(
    relatedActivities,
    (items, prevItem) =>
        items.OrderBy(a => a.ActivityDate).First().ActivityDate
            .Subtract(prevItem.ActivityDate).Days.Equals(1)
            ? ListWithAdd(items, prevItem)
            : items).ToList();

Next, we'll build the following events in a similar fashion and aggregate them likewise.

var nextEvents = orderedEvents.SkipWhile(a => a.ID != activity.ID);
relatedActivities = nextEvents.Aggregate<Activity, List<Activity>>(
    relatedActivities,
    (items, nextItem) =>
        nextItem.ActivityDate
            .Subtract(items.OrderBy(a => a.ActivityDate).Last().ActivityDate).Days.Equals(1)
            ? ListWithAdd(items, nextItem)
            : items).ToList();

You can sort the result properly afterwards, as relatedActivities should now contain all activities with no gaps. The aggregation won't stop immediately when it hits the first gap, but I don't think you can literally break out of an Aggregate call. Instead, it simply ignores anything it finds past a gap.
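
Aggregate always visits every element, but TakeWhile does stop at the first element that fails its predicate, so a gap can end the walk early. The following is only a sketch of that idea and not part of the answer above: it counts how many steps can be taken backwards and forwards from the pivot's index in the sorted list. The Activity record and the Around and IsNextDay names are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data type mirroring the columns in the question.
public record Activity(int ID, int StaffID, string Title, DateTime ActivityDate);

public static class PivotRun
{
    // True when 'later' falls on the calendar day right after 'earlier'.
    private static bool IsNextDay(Activity earlier, Activity later) =>
        earlier.ActivityDate.Date.AddDays(1) == later.ActivityDate.Date;

    public static List<Activity> Around(IEnumerable<Activity> activities, int pivotId)
    {
        var ordered = activities.OrderBy(a => a.ActivityDate).ToList();
        int pivot = ordered.FindIndex(a => a.ID == pivotId);
        if (pivot < 0)
            return new List<Activity>();

        // Steps into the past: stops at the first pair that is not one day apart.
        int back = Enumerable.Range(1, pivot)
            .TakeWhile(i => IsNextDay(ordered[pivot - i], ordered[pivot - i + 1]))
            .Count();

        // Steps into the future, with the same stopping rule.
        int forward = Enumerable.Range(1, ordered.Count - pivot - 1)
            .TakeWhile(i => IsNextDay(ordered[pivot + i - 1], ordered[pivot + i]))
            .Count();

        return ordered.GetRange(pivot - back, back + forward + 1);
    }
}
```

For the sample data in the question and pivot 3, Around returns the items with IDs 2, 3 and 4, already in date order, so no sort is needed afterwards.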

Note that this example code only operates on the actual difference in time. Your example output seems to imply that you need some other comparison factors, but this should be enough to get you started. Just add the necessary logic to the date subtraction comparison in both entries.
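
To illustrate what "the actual difference in time" means in practice: TimeSpan.Days reports whole elapsed 24-hour periods, which is not the same as the number of calendar days between two dates when the times of day differ. The sketch below contrasts the two; the DayDifference class and its method names are made up for this example, and whether calendar days are the right comparison depends on your data.

```csharp
using System;

public static class DayDifference
{
    // Whole 24-hour periods between the two instants (what the Aggregate code above compares).
    public static int ElapsedDays(DateTime earlier, DateTime later) =>
        later.Subtract(earlier).Days;

    // Calendar days between the two dates, ignoring the time of day.
    public static int CalendarDays(DateTime earlier, DateTime later) =>
        (later.Date - earlier.Date).Days;
}
```

For example, from 08/06/2010 16:00 to 09/06/2010 09:00 only 17 hours elapse, so ElapsedDays gives 0 while CalendarDays gives 1. If your ActivityDate values carry a time component, comparing on .Date as in CalendarDays may be closer to what the sample output in the question expects.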
