LINQ 到 SortedList

发布于 2024-09-02 22:19:30 字数 768 浏览 2 评论 0原文

我是一个完整的 LINQ 新手,所以我不知道我的 LINQ 是否不适合我需要做的事情,或者我对性能的期望是否太高。

我有一个对象的 SortedList,以 int 为键; SortedList 与 SortedDictionary 相反,因为我将使用预先排序的数据填充集合。我的任务是找到确切的密钥,或者如果没有确切的密钥,则找到具有下一个更高值的密钥。如果搜索对于列表来说太高(例如最高键是 100,但搜索 105),则返回 null。

// The structure of this class is unimportant.  Just using
// it as an illustration.
public class CX
{
    public int KEY;
    public DateTime DT;
}

static CX getItem(int i, SortedList<int, CX> list)
{
    var items =
    (from kv in list
     where kv.Key >= i
     select kv.Key);

    if (items.Any())
    {
        return list[items.Min()];
    }

    return null;
}

给定一个包含 50,000 条记录的列表,调用 getItem 500 次大约需要一秒半的时间。调用 50,000 次需要 2 分钟以上。这个性能看起来很差。我的 LINQ 不好吗?我是不是期待太多了?我应该推出自己的二分搜索功能吗?

I'm a complete LINQ newbie, so I don't know if my LINQ is incorrect for what I need to do or if my expectations of performance are too high.

I've got a SortedList of objects, keyed by int; SortedList as opposed to SortedDictionary because I'll be populating the collection with pre-sorted data. My task is to find either the exact key or, if there is no exact key, the one with the next higher value. If the search is too high for the list (e.g. highest key is 100, but search for 105), return null.

// The structure of this class is unimportant.  Just using
// it as an illustration.
public class CX
{
    public int KEY;
    public DateTime DT;
}

static CX getItem(int i, SortedList<int, CX> list)
{
    var items =
    (from kv in list
     where kv.Key >= i
     select kv.Key);

    if (items.Any())
    {
        return list[items.Min()];
    }

    return null;
}

Given a list of 50,000 records, calling getItem 500 times takes about a second and a half. Calling it 50,000 times takes over 2 minutes. This performance seems very poor. Is my LINQ bad? Am I expecting too much? Should I be rolling my own binary search function?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(6

弄潮 2024-09-09 22:19:30

首先,您的查询将被评估两次(一次针对 Any,一次针对 Min)。其次,Min 要求它迭代整个列表,即使它已排序的事实意味着第一项将是最小值。您应该能够更改此设置:

if (items.Any())
{
    return list[items.Min()];
}

改为:

var default = 
    (from kv in list
     where kv.Key >= i
     select (int?)kv.Key).FirstOrDefault();

if(default != null) return list[default.Value];

return null;

UPDATE

因为您正在选择值类型,所以 FirstOrDefault 不会返回可为 null 的对象。我已更改您的查询,将所选值转换为 int?,从而允许检查结果值是否为 null。我建议使用 ContainsKey,因为如果您的列表包含 0 的值,则会返回 true。例如,假设您有以下值

0 2 4 6 8

如果您传入小于或等于 8 的任何值,那么您将获得正确的值。但是,如果您传入 9,您将得到 0 (default(int)),它列表中,但不在 有效的结果。

First, your query is being evaluated twice (once for Any, and once for Min). Second, Min requires that it iterate over the entire list, even though the fact that it's sorted means that the first item will be the minimum. You should be able to change this:

if (items.Any())
{
    return list[items.Min()];
}

To this:

var default = 
    (from kv in list
     where kv.Key >= i
     select (int?)kv.Key).FirstOrDefault();

if(default != null) return list[default.Value];

return null;

UPDATE

Because you're selecting a value type, FirstOrDefault doesn't return a nullable object. I have altered your query to cast the selected value to an int? instead, allowing the resulting value to be checked for null. I would advocate this over using ContainsKey, as that would return true if your list contained a value for 0. For example, say you have the following values

0 2 4 6 8

If you were to pass in anything less than or equal to 8, then you would get the correct value. However, if you were to pass in 9, you would get 0 (default(int)), which is in the list but isn't a valid result.

你怎么这么可爱啊 2024-09-09 22:19:30

自己编写二分搜索可能很困难。

幸运的是,微软已经编写了一个非常强大的:Array.BinarySearch这实际上是SortedList.IndexOfKey内部使用的方法。唯一的问题是,它需要一个 T[] 参数,而不是任何 IList (如 SortedList.Keys )。

但你知道吗?有一个名为 Reflector 的出色工具,可让您查看 .NET 源代码。 代码

看看:IList 上的通用 BinarySearch 扩展方法,直接取自 Microsoft 的 Array.BinarySearch 的反射 > 实施。

public static int BinarySearch<T>(this IList<T> list, int index, int length, T value, IComparer<T> comparer) {
    if (list == null)
        throw new ArgumentNullException("list");
    else if (index < 0 || length < 0)
        throw new ArgumentOutOfRangeException((index < 0) ? "index" : "length");
    else if (list.Count - index < length)
        throw new ArgumentException();

    int lower = index;
    int upper = (index + length) - 1;

    while (lower <= upper) {
        int adjustedIndex = lower + ((upper - lower) >> 1);
        int comparison = comparer.Compare(list[adjustedIndex], value);
        if (comparison == 0)
            return adjustedIndex;
        else if (comparison < 0)
            lower = adjustedIndex + 1;
        else
            upper = adjustedIndex - 1;
    }

    return ~lower;
}

public static int BinarySearch<T>(this IList<T> list, T value, IComparer<T> comparer) {
    return list.BinarySearch(0, list.Count, value, comparer);
}

public static int BinarySearch<T>(this IList<T> list, T value) where T : IComparable<T> {
    return list.BinarySearch(value, Comparer<T>.Default);
}

这将让您调用 list.Keys.BinarySearch 并获取所需索引的负位按位补码,以防找不到所需的键(以下内容基本上直接取自 tzaman 的答案):

int index = list.Keys.BinarySearch(i);
if (index < 0)
    index = ~index;
var item = index < list.Count ? list[list.Keys[index]] : null;
return item;

Writing a binary search on your own can be tough.

Fortunately, Microsoft already wrote a pretty robust one: Array.BinarySearch<T>. This is, in fact, the method that SortedList<TKey, TValue>.IndexOfKey uses internally. Only problem is, it takes a T[] argument, instead of any IList<T> (like SortedList<TKey, TValue>.Keys).

You know what, though? There's this great tool called Reflector that lets you look at .NET source code...

Check it out: a generic BinarySearch extension method on IList<T>, taken straight from the reflected code of Microsoft's Array.BinarySearch<T> implementation.

public static int BinarySearch<T>(this IList<T> list, int index, int length, T value, IComparer<T> comparer) {
    if (list == null)
        throw new ArgumentNullException("list");
    else if (index < 0 || length < 0)
        throw new ArgumentOutOfRangeException((index < 0) ? "index" : "length");
    else if (list.Count - index < length)
        throw new ArgumentException();

    int lower = index;
    int upper = (index + length) - 1;

    while (lower <= upper) {
        int adjustedIndex = lower + ((upper - lower) >> 1);
        int comparison = comparer.Compare(list[adjustedIndex], value);
        if (comparison == 0)
            return adjustedIndex;
        else if (comparison < 0)
            lower = adjustedIndex + 1;
        else
            upper = adjustedIndex - 1;
    }

    return ~lower;
}

public static int BinarySearch<T>(this IList<T> list, T value, IComparer<T> comparer) {
    return list.BinarySearch(0, list.Count, value, comparer);
}

public static int BinarySearch<T>(this IList<T> list, T value) where T : IComparable<T> {
    return list.BinarySearch(value, Comparer<T>.Default);
}

This will let you call list.Keys.BinarySearch and get the negative bitwise complement of the index you want in case the desired key isn't found (the below is taken basically straight from tzaman's answer):

int index = list.Keys.BinarySearch(i);
if (index < 0)
    index = ~index;
var item = index < list.Count ? list[list.Keys[index]] : null;
return item;
倾城花音 2024-09-09 22:19:30

SortedList 上使用 LINQ 不会给您带来排序的好处。

为了获得最佳性能,您应该编写自己的二分搜索。

Using LINQ on a SortedList will not give you the benefit of the sort.

For optimal performance, you should write your own binary search.

翻了热茶 2024-09-09 22:19:30

好的,只是为了让这一点更加可见 - 这是 Adam Robinson 答案的更简洁版本:

return list.FirstOrDefault(kv => kv.Key >= i).Value; 

FirstOrDefault 函数有一个接受谓词的重载,它选择满足条件的第一个元素 - 您可以使用它可以直接获取所需的元素,如果不存在则返回 null。

OK, just to give this a little more visibility - here's a more concise version of Adam Robinson's answer:

return list.FirstOrDefault(kv => kv.Key >= i).Value; 

The FirstOrDefault function has an overload that accepts a predicate, which selects the first element satisfying a condition - you can use that to directly get the element you want, or null if it doesn't exist.

屋檐 2024-09-09 22:19:30

为什么不使用 List 类中内置的 BinarySearch 呢?

var keys = list.Keys.ToList();
int index = keys.BinarySearch(i);
if (index < 0)
    index = ~index;
var item = index < keys.Count ? list[keys[index]] : null;
return item;

如果搜索目标不在列表中,则 BinarySearch 返回下一个较高项的按位补码;如果结果为负,我们可以使用它来重新补充结果,从而直接得到您想要的结果。如果它等于Count,则您的搜索键比列表中的任何内容都大。

这应该比执行 LINQ where 快得多,因为它已经排序了...
正如评论所指出的,ToList 调用将强制对整个列表进行评估,因此只有当您在不更改底层 SortedList 的情况下进行多次搜索时,这才有用,并且您单独保留 keys 列表。

Why not use the BinarySearch that's built into the List class?

var keys = list.Keys.ToList();
int index = keys.BinarySearch(i);
if (index < 0)
    index = ~index;
var item = index < keys.Count ? list[keys[index]] : null;
return item;

If the search target isn't in the list, BinarySearch returns the bit-wise complement of the next-higher item; we can use that to directly get you what you want by re-complementing the result if it's negative. If it becomes equal to the Count, your search key was bigger than anything in the list.

This should be much faster than doing a LINQ where, since it's already sorted...
As comments have pointed out, the ToList call will force an evaluation of the whole list, so this is only beneficial if you do multiple searches without altering the underlying SortedList, and you keep the keys list around separately.

你好,陌生人 2024-09-09 22:19:30

在 PowerCollections 中使用 OrderedDictionary,您可以获得一个枚举器,该枚举器从您要查找的关键位置开始...如果它不存在,您将获得下一个最近的节点,然后可以从 O(log N 中的节点向前/向后导航) 每次导航呼叫的时间。

这样做的优点是您不必编写自己的搜索,甚至不必在 SortedList 之上管理自己的搜索。

Using OrderedDictionary in PowerCollections you can get an enumerator that starts where they key you are looking for should be... if it's not there, you'll get the next closest node and can then navigate forwards/backwards from that in O(log N) time per nav call.

This has the advantage of you not having to write your own search or even manage your own searches on top of a SortedList.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文