Recursive loop on a List<> causes a stack overflow

Posted on 2024-09-15 12:26:27

I have a List<> of objects, each containing two strings and a DateTime. I want to build another list of the same objects that holds only the most recent item for each unique pair of strings, using the two strings as the key and the latest DateTime value to pick the item. In SQL terms, think of the following:

SELECT col1, col2, MAX(datetime) FROM table GROUP BY col1, col2

This gives the unique list of col1, col2 and the last datetime. So I'm trying to do this in code with two lists: one with duplicates in it, which I parse to grab only the last unique items, and a second list that those items populate.

The data sets I have are huge, so just going through the duplicate list, checking whether each item is already in the unique list, adding it if it isn't and comparing the dates if it is, is pretty slow. So I thought I could recursively go through the duplicate list, grab the unique items, find their max datetime and delete the non-max ones as I loop through, making my duplicate list smaller and smaller and thus speeding things up. (I hope you're still following me.)

So anyway. I wrote a recursive loop with two lists, but when I loop through I get a System.StackOverflowException on about the 3000th iteration.

Here's my code. Imagine that ListWithDuplicates is full of data. The actual listDataItem has more properties that I've left out. But my main question is: why can't I loop through the public list in this manner without causing the StackOverflowException?

using System;
using System.Net;
using System.IO;
using System.Collections.Generic;
using System.Linq;

public class RecursionTest
{
    public List<listDataItem> ListWithDuplicates { get; set; }
    public List<listDataItem> ListWithUniques { get; set; }

    public RecursionTest()
    {
        Process();
    }

    public void Process()
    {
        int rowcount = 0;
        int duplicates = 0;
        int total = 0;
        RecursiveLoopForUnique(ref rowcount, ref duplicates, ref total, "", "");
    }

    private void RecursiveLoopForUnique(ref int rowcount, ref int duplicates, ref int total, string col1, string col2)
    {
        if (rowcount > 0)
            duplicates += ListWithDuplicates.RemoveAll(z => z.COL1 == col1 && z.COL2 == col2);
        if (ListWithDuplicates.Count > 0)
        {
            foreach (listDataItem item in ListWithDuplicates)
            {
                rowcount++;
                if (ListWithUniques.FindAll(z => z.COL1 == item.COL1 && z.COL2 == item.COL2).Count < 1)
                {
                    ListWithUniques.Add(ListWithDuplicates.FindAll(z => z.COL1 == item.COL1 && z.COL2 == item.COL2).OrderByDescending(z => z.DATETIME).First());
                    col1 = item.COL1;
                    col2 = item.COL2;
                    break;
                }
            }
            RecursiveLoopForUnique(ref rowcount, ref duplicates, ref total, col1, col2);
        }
        else
            return;
    }

    public class listDataItem
    {
        public string COL1 { get; set; }
        public string COL2 { get; set; }
        public DateTime DATETIME { get; set; }            

        public listDataItem(string col1, string col2, DateTime datetime)
        {
            COL1 = col1;
            COL2 = col2;
            DATETIME = datetime;
        }
    }
}

Comments (5)

饮湿 2024-09-22 12:26:27

How about this:

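// Keep the newest item for each (col1, col2) pair.
Dictionary<string, item> destDict = new Dictionary<string, item>();

foreach (item curr in items)
{
    // The separator keeps ("ab", "c") and ("a", "bc") from sharing a key.
    string key = curr.col1 + "\0" + curr.col2;
    item existing;
    if (!destDict.TryGetValue(key, out existing) || existing.date < curr.date)
    {
        // Store the newer item itself instead of mutating the stored one.
        destDict[key] = curr;
    }
}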

I tested this on a list containing 1000 each of 2 unique col1/col2 pairs. Worked fine and was faster than a LINQ groupby/select.

面如桃花 2024-09-22 12:26:27

LINQ, yay.

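// Max(item => item.DATETIME) would return a DateTime, not the item,
// so order by date and take the last element instead.
listDataItem latestListDataItem =
    ListWithDuplicates.Where(item => item.COL1 == yourCol1Param && item.COL2 == yourCol2Param)
                      .OrderBy(item => item.DATETIME)
                      .Last();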

MSDN notes on..

Where: http://msdn.microsoft.com/en-us/library/bb534803.aspx

Max: http://msdn.microsoft.com/en-us/library/bb347632.aspx

OrderBy: http://msdn.microsoft.com/en-us/library/bb534966.aspx

Last: http://msdn.microsoft.com/en-us/library/bb358775.aspx

臻嫒无言 2024-09-22 12:26:27

I'm not sure about the syntax, but it should be close.

from d in DupsList
group d.DATETIME by new { d.COL1, d.COL2 } into grp
select new listDataItem(grp.Key.COL1, grp.Key.COL2, grp.Max());
情何以堪。 2024-09-22 12:26:27

Well, if you have more than a few thousand unique pairs of C1, C2, then you'll encounter this, since you're recursing once for each unique group.
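
To make that concrete: each call to RecursiveLoopForUnique handles one (COL1, COL2) group and then calls itself for the next one, so the call depth grows with the number of unique groups. Below is a minimal sketch of the same group-at-a-time idea written as a plain loop, which keeps the stack depth constant. The Item class and the IterativeDedup.LatestPerGroup helper are hypothetical names used for illustration and are not from the original post. The sketch only avoids the exception; it still rescans the list for every group, so it is not a fix for the speed problem.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's listDataItem.
public class Item
{
    public string Col1 { get; set; }
    public string Col2 { get; set; }
    public DateTime Date { get; set; }
}

public static class IterativeDedup
{
    // Same idea as the recursive version, but a while loop replaces the
    // self-call, so the stack does not grow with the number of groups.
    public static List<Item> LatestPerGroup(List<Item> source)
    {
        var remaining = new List<Item>(source); // work on a copy
        var uniques = new List<Item>();

        while (remaining.Count > 0)
        {
            Item first = remaining[0];

            // Find the newest item among those sharing the first item's key.
            Item newest = first;
            foreach (Item candidate in remaining)
            {
                if (candidate.Col1 == first.Col1
                    && candidate.Col2 == first.Col2
                    && candidate.Date > newest.Date)
                {
                    newest = candidate;
                }
            }
            uniques.Add(newest);

            // Drop the whole group so the next pass sees a shorter list.
            remaining.RemoveAll(z => z.Col1 == first.Col1 && z.Col2 == first.Col2);
        }

        return uniques;
    }
}
```

Calling IterativeDedup.LatestPerGroup(list) returns one item per key pair, in the order in which each pair first appears in the input.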

There are a lot of ways you could fix this up; one that would wind up much clearer and faster would be to sort the list by C1 and C2, and then go down it exactly once to find the most recent date in each group. If you aren't wedded to reimplementing it yourself, the best way is this:

ListWithUniques = ListWithDuplicates
    .GroupBy(x => new { x.COL1, x.COL2 })
    .Select(g => g.OrderByDescending(x => x.DATETIME).First())
    .ToList();
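
For anyone who does want to implement it by hand, the sort-then-scan alternative described above might look roughly like the sketch below. The Row class and the SortAndScan.LatestPerGroup helper are hypothetical names, not from the original post, and ordinal string comparison is an assumption about how the keys should be compared.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's listDataItem.
public class Row
{
    public string Col1 { get; set; }
    public string Col2 { get; set; }
    public DateTime Date { get; set; }
}

public static class SortAndScan
{
    public static List<Row> LatestPerGroup(List<Row> source)
    {
        var sorted = new List<Row>(source); // leave the caller's list untouched

        // Order by Col1, then Col2, then Date, so that the last row in each
        // run of equal keys is that group's most recent row.
        sorted.Sort((a, b) =>
        {
            int c = string.CompareOrdinal(a.Col1, b.Col1);
            if (c != 0) return c;
            c = string.CompareOrdinal(a.Col2, b.Col2);
            if (c != 0) return c;
            return a.Date.CompareTo(b.Date);
        });

        // Single pass: keep a row only when the next row starts a new group.
        var result = new List<Row>();
        for (int i = 0; i < sorted.Count; i++)
        {
            bool lastOfGroup = i == sorted.Count - 1
                || sorted[i].Col1 != sorted[i + 1].Col1
                || sorted[i].Col2 != sorted[i + 1].Col2;
            if (lastOfGroup)
            {
                result.Add(sorted[i]);
            }
        }

        return result;
    }
}
```

The sort dominates the cost, and the scan touches each row once, so no recursion or repeated list searches are involved.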
空‖城人不在 2024-09-22 12:26:27
SELECT col1, col2, MAX(datetime) FROM table GROUP BY col1, col2

In LINQ:

var query = from row in table
            group row by new { row.Col1, row.Col2 } into g
            select new
            {
                Col1 = g.Key.Col1,
                Col2 = g.Key.Col2,
                Date = g.Max(b => b.Date)
            };

And in a potentially more useful form:

var dict = query.ToDictionary(a => new { a.Col1, a.Col2 }, a => a.Date);

Then you can reference it like so:

DateTime specificMaxDate = dict[new { Col1 = 2, Col2 = 3 }];
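
For string columns like the ones in the question, the same lookup can be sketched as a self-contained helper. This is only an illustration under a few assumptions: the Entry class and the MaxDateLookup.Build helper are hypothetical names, and the key is a value tuple rather than an anonymous type, which requires C# 7.1 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's listDataItem.
public class Entry
{
    public string Col1 { get; set; }
    public string Col2 { get; set; }
    public DateTime Date { get; set; }
}

public static class MaxDateLookup
{
    // Groups by (Col1, Col2) and maps each pair to its latest Date.
    // Value tuples compare by value, so they work as dictionary keys.
    public static Dictionary<(string, string), DateTime> Build(IEnumerable<Entry> table)
    {
        return table
            .GroupBy(row => (row.Col1, row.Col2))
            .ToDictionary(g => g.Key, g => g.Max(row => row.Date));
    }
}
```

With a dictionary built this way, a lookup such as lookup[("a", "x")] returns the latest date recorded for that pair, and TryGetValue can be used when the pair might be absent.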