什么 .NET 字典支持“查找最近的键”?手术?

发布于 2024-08-10 12:49:42 字数 334 浏览 9 评论 0原文

我正在将一些 C++ 代码转换为 C#,它调用 std::map::lower_bound(k) 来查找映射中键等于或大于 k 的条目。然而,我没有看到任何方法可以用 .NET 的 SortedDictionary 做同样的事情。我怀疑我可以使用 SortedList 实现解决方法,但不幸的是 SortedList 太慢(插入和删除键的 O(n) )。我能做些什么?

注意:我找到了一种解决方法,可以利用我的特定场景...具体来说,我的键是从 0 开始的密集整数,所以我使用了 List作为我的字典,列表索引作为键,并且搜索等于或大于 k 的键只需几次循环迭代即可完成。但很高兴看到最初的问题得到解答。

I'm converting some C++ code to C# and it calls std::map::lower_bound(k) to find an entry in the map whose key is equal to or greater than k. However, I don't see any way to do the same thing with .NET's SortedDictionary. I suspect I could implement a workaround using SortedList, but unfortunately SortedList is too slow (O(n) for inserting and deleting keys). What can I do?

Note: I found a workaround using that takes advantage of my particular scenario... Specifically, my keys are a dense population of integers starting at just over 0, so I used a List<TValue> as my dictionary with the list index serving as the key, and searching for a key equal or greater than k can be done in only a few loop iterations. But it would still be nice to see the original question answered.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(8

神回复 2024-08-17 12:49:42

我开发了几个支持“查找下一个更高的键”和“查找下一个更低的键”操作的集合类。

首先,我制作了一组 Compact Patricia Trie 集合。这些是旨在最小化内存使用的集合/字典;它们对于大量 URL 和某些其他类型的数据特别有效。它们只能部分解决问题,因为仅支持某些类型的键,即 byte[]string 和所有原始整数类型 (Int8..UInt64)。此外,字符串排序区分大小写。 NuGet 包:Loyc.Utilities

发布此答案后,我制作了更多排序的数据结构来一般解决此问题: BList, BDictionary, <代码>BMultiMapSparseAList。详情请看我的第二个回答。

I have developed several collection classes that support "find next higher key" and "find next lower key" operations.

First I made a set of Compact Patricia Trie collections. These are sets/dictionaries designed to minimize memory usage; they are especially efficient for large collections of URLs and certain other kinds of data. They only partly solve the problem, because only certain kinds of keys are supported, namely byte[], string, and all primitive integer types (Int8..UInt64). Also, string sorting is case-sensitive. NuGet package: Loyc.Utilities

After publishing this answer, I made more sorted data structures that solve this problem generally: BList<T>, BDictionary<K,V>, BMultiMap<K,V> and SparseAList<T>. See my second answer for details.

小瓶盖 2024-08-17 12:49:42

问题在于,字典/哈希表被设计为根据输入值到达唯一的内存位置,因此您需要一个数据结构,该数据结构旨在容纳与您存储的每个值相关的范围,同时时间正确更新每个间隔

我认为跳过列表(或平衡二叉树)可以帮助你。尽管它们不能以 O(n) 的速度执行查找,但它们可以以对数方式执行查找,并且仍然比树更快。

我知道这不是一个正确的答案,因为我不能说 .NET BCL 已经包含这样一个类,不幸的是您必须自己实现一个类,或者找到一个为您支持它的第 3 方程序集。不过,CodeProject 这里似乎有一个很好的例子。

The problem is that a dictionary/hash table is designed to arrive at a unique memory location based on an input value, so you'll need a data structure that is designed to accommodate a range related to each value you store, and at the same time update each interval correctly

I think skip lists (or balanced binary trees) can help you. Although they cannot perform lookups in O(n), they can do logarithmically and still faster than trees.

I know this is not a proper answer since I cannot say that the .NET BCL already contains such a class, you'll unfortunately have to implement one yourself, or find a 3rd party assembly that supports it for you. There seems to be a nice example over at The CodeProject here, though.

烙印 2024-08-17 12:49:42

你可以试试我下面写的代码。它使用二分搜索,因此假设列表/数组是预先排序的。

public static class ListExtensions
{
    public static int GetAtMostIndex<TItem, TValue>(/*this*/ IList<TItem> list, TValue value, Func<TItem, TValue, int> comparer)
    {
        return GetAtMostIndex(list, value, comparer, 0, list.Count);
    }

    public static int GetAtLeastIndex<TItem, TValue>(/*this*/ IList<TItem> list, TValue value, Func<TItem, TValue, int> comparer)
    {
        return GetAtLeastIndex(list, value, comparer, 0, list.Count);
    }

    public static int GetAtMostIndex<TItem, TValue>(/*this*/ IList<TItem> list, TValue value, Func<TItem, TValue, int> comparer, int index, int count)
    {
        if (count == 0)
        {
            return -1;
        }

        int startIndex = index;
        int endIndex = index + count - 1;
        int middleIndex = 0;
        int compareResult = -1;

        while (startIndex < endIndex)
        {
            middleIndex = (startIndex + endIndex) >> 1; //  / 2
            compareResult = comparer.Invoke(list[middleIndex], value);

            if (compareResult > 0)
            {
                endIndex = middleIndex - 1;
            }
            else if (compareResult < 0)
            {
                startIndex = middleIndex + 1;
            }
            else
            {
                return middleIndex;
            }
        }

        if (startIndex == endIndex)
        {
            compareResult = comparer.Invoke(list[startIndex], value);

            if (compareResult <= 0)
            {
                return startIndex;
            }
            else
            {
                int returnIndex = startIndex - 1;

                if (returnIndex < index)
                {
                    return -1;
                }
                else
                {
                    return returnIndex;
                }
            }
        }
        else
        {
            //todo: test
            return startIndex - 1;
        }
    }

    public static int GetAtLeastIndex<TItem, TValue>(/*this*/ IList<TItem> list, TValue value, Func<TItem, TValue, int> comparer, int index, int count)
    {
        if (count == 0)
        {
            return -1;
        }

        int startIndex = index;
        int endIndex = index + count - 1;
        int middleIndex = 0;
        int compareResult = -1;

        while (startIndex < endIndex)
        {
            middleIndex = (startIndex + endIndex) >> 1; //  / 2
            compareResult = comparer.Invoke(list[middleIndex], value);

            if (compareResult > 0)
            {
                endIndex = middleIndex - 1;
            }
            else if (compareResult < 0)
            {
                startIndex = middleIndex + 1;
            }
            else
            {
                return middleIndex;
            }
        }

        if (startIndex == endIndex)
        {
            compareResult = comparer.Invoke(list[startIndex], value);

            if (compareResult >= 0)
            {
                return startIndex;
            }
            else
            {
                int returnIndex = startIndex + 1;

                if (returnIndex >= index + count)
                {
                    return -1;
                }
                else
                {
                    return returnIndex;
                }
            }
        }
        else
        {
            return endIndex + 1;
        }
    }
}

You can try the code i wrote below. it using binary search, therefore assuming the list/array is pre-sorted.

public static class ListExtensions
{
    public static int GetAtMostIndex<TItem, TValue>(/*this*/ IList<TItem> list, TValue value, Func<TItem, TValue, int> comparer)
    {
        return GetAtMostIndex(list, value, comparer, 0, list.Count);
    }

    public static int GetAtLeastIndex<TItem, TValue>(/*this*/ IList<TItem> list, TValue value, Func<TItem, TValue, int> comparer)
    {
        return GetAtLeastIndex(list, value, comparer, 0, list.Count);
    }

    public static int GetAtMostIndex<TItem, TValue>(/*this*/ IList<TItem> list, TValue value, Func<TItem, TValue, int> comparer, int index, int count)
    {
        if (count == 0)
        {
            return -1;
        }

        int startIndex = index;
        int endIndex = index + count - 1;
        int middleIndex = 0;
        int compareResult = -1;

        while (startIndex < endIndex)
        {
            middleIndex = (startIndex + endIndex) >> 1; //  / 2
            compareResult = comparer.Invoke(list[middleIndex], value);

            if (compareResult > 0)
            {
                endIndex = middleIndex - 1;
            }
            else if (compareResult < 0)
            {
                startIndex = middleIndex + 1;
            }
            else
            {
                return middleIndex;
            }
        }

        if (startIndex == endIndex)
        {
            compareResult = comparer.Invoke(list[startIndex], value);

            if (compareResult <= 0)
            {
                return startIndex;
            }
            else
            {
                int returnIndex = startIndex - 1;

                if (returnIndex < index)
                {
                    return -1;
                }
                else
                {
                    return returnIndex;
                }
            }
        }
        else
        {
            //todo: test
            return startIndex - 1;
        }
    }

    public static int GetAtLeastIndex<TItem, TValue>(/*this*/ IList<TItem> list, TValue value, Func<TItem, TValue, int> comparer, int index, int count)
    {
        if (count == 0)
        {
            return -1;
        }

        int startIndex = index;
        int endIndex = index + count - 1;
        int middleIndex = 0;
        int compareResult = -1;

        while (startIndex < endIndex)
        {
            middleIndex = (startIndex + endIndex) >> 1; //  / 2
            compareResult = comparer.Invoke(list[middleIndex], value);

            if (compareResult > 0)
            {
                endIndex = middleIndex - 1;
            }
            else if (compareResult < 0)
            {
                startIndex = middleIndex + 1;
            }
            else
            {
                return middleIndex;
            }
        }

        if (startIndex == endIndex)
        {
            compareResult = comparer.Invoke(list[startIndex], value);

            if (compareResult >= 0)
            {
                return startIndex;
            }
            else
            {
                int returnIndex = startIndex + 1;

                if (returnIndex >= index + count)
                {
                    return -1;
                }
                else
                {
                    return returnIndex;
                }
            }
        }
        else
        {
            return endIndex + 1;
        }
    }
}
谁许谁一生繁华 2024-08-17 12:49:42

我创建了几个为任何数据类型提供此功能的数据结构:BList(排序列表)、BDictionary(一个字典,其项目为按键排序)和 BMultiMap (一种字典,其中多个值可以与一个键关联)。有关详细信息,请参阅本文。这些数据结构中的每一个都提供 FindLowerBound()FindUpperBound() 方法,其工作方式类似于 C++ 的 lower_boundupper_bound。在内部,这些集合类似于 B+ 树,因此它们具有良好的性能和较低的内存用法; BDictionary<,> 通常比标准 SortedDictionary<,> 使用的内存少约 44%(而标准 SortedDictionary<,> 使用的内存平均比 略少) Dictionary<,>),假设 64 位键和 64 位值。

我还制作了一个“稀疏”集合 SparseAList,它与 BDictionary 类似,只是您可以在任何地方插入和删除“空白空间”在集合中(空空间不消耗任何内存)。有关详细信息,请参阅本文

所有这些集合都位于 Loyc.Collections NuGet 包中。

I created several data structures that provide this functionality for any data type: BList<T> (a sorted list), BDictionary<K,V> (a dictionary whose items are sorted by key), and BMultiMap<K,V> (a dictionary in which more than one value can be associated with a key). See this article for details. Each of these data structures provide FindLowerBound() and FindUpperBound() methods that work like C++'s lower_bound and upper_bound. Internally, these collections are similar to B+ trees, so they have good performance and low memory usage; BDictionary<,> typically uses about 44% less memory than a standard SortedDictionary<,> (which in turn uses, on average, slightly less memory than Dictionary<,>), assuming 64-bit keys and 64-bit values.

I also made a "sparse" collection, SparseAList<T>, which is similar to BDictionary<int,T> except that you can insert and remove "empty space" anywhere in the collection (empty space does not consume any memory). See this article for details.

All of these collections are in the Loyc.Collections NuGet package.

你的心境我的脸 2024-08-17 12:49:42

找到最接近 K:

dict.Keys.Where(i => i >= K).OrderBy(i => i).First();

或更快:

public int? GetNearestKey(dict, K) 
{
    int? lowerK = null;
    foreach (int key in dict.Keys)
    {
        if (key == K) 
        {
            lowerK = K;
            break; 
        }
        else if (key >= K && (!lowerK.HasValue || key < lowerK))
        {
            lowerK = key;
        }
    }
    return lowerK;
}

find nearest to K:

dict.Keys.Where(i => i >= K).OrderBy(i => i).First();

or much faster:

public int? GetNearestKey(dict, K) 
{
    int? lowerK = null;
    foreach (int key in dict.Keys)
    {
        if (key == K) 
        {
            lowerK = K;
            break; 
        }
        else if (key >= K && (!lowerK.HasValue || key < lowerK))
        {
            lowerK = key;
        }
    }
    return lowerK;
}
傾城如夢未必闌珊 2024-08-17 12:49:42

基本框架中没有二叉搜索树集合实现,因此您必须构建一个或找到一个实现。正如您所指出的,SortedList 在搜索方面最接近,但在插入/删除方面速度较慢(由于其底层数组实现)。

There isn't a binary search tree collection implementation in the base framework, so you'll either have to build one or find an implementation. As you noted, SortedList is closest in terms of searching but is slower (due to its underlying array implementation) for insertion/deletion.

相守太难 2024-08-17 12:49:42

我认为有关 SortedList< 的问题存在错误/em> 复杂性。

SortedList插入新项目。如果您事先知道容量,那么在最坏的情况下可以在 O(Log(n)) 内完成。

I think there's a mistake in the question about SortedList complexity.

SortedList has O(log(n)) amortized complexity for inserting new item. If you know in advance the capacity it can be done in O(Log(n)) in the worst case.

┾廆蒐ゝ 2024-08-17 12:49:42

您可以使用以下扩展方法对 SortedSet 执行此操作:

public static class SortedSetExtensions
{
    public static bool FindLowerOrEqualThan<T>(this SortedSet<T> set, T value, out T first)
    {
        if(set.Count == 0)
        {
            first = default(T);
            return false;
        }

        var minimum = set.Min;

        if(set.Comparer.Compare(minimum, value) > 0)
        {
            first = default(T);
            return false;
        }

        first = set.GetViewBetween(minimum, value).Max;
        return true;
    }

    public static bool FindGreaterOrEqualThan<T>(this SortedSet<T> set, T value, out T first)
    {
        if (set.Count == 0)
        {
            first = default(T);
            return false;
        }

        var maximum = set.Max;

        if (set.Comparer.Compare(maximum, value) < 0)
        {
            first = default(T);
            return false;
        }

        first = set.GetViewBetween(value, maximum).Min;
        return true;
    }
}

You can do this for SortedSet<T> with following extension methods:

public static class SortedSetExtensions
{
    public static bool FindLowerOrEqualThan<T>(this SortedSet<T> set, T value, out T first)
    {
        if(set.Count == 0)
        {
            first = default(T);
            return false;
        }

        var minimum = set.Min;

        if(set.Comparer.Compare(minimum, value) > 0)
        {
            first = default(T);
            return false;
        }

        first = set.GetViewBetween(minimum, value).Max;
        return true;
    }

    public static bool FindGreaterOrEqualThan<T>(this SortedSet<T> set, T value, out T first)
    {
        if (set.Count == 0)
        {
            first = default(T);
            return false;
        }

        var maximum = set.Max;

        if (set.Comparer.Compare(maximum, value) < 0)
        {
            first = default(T);
            return false;
        }

        first = set.GetViewBetween(value, maximum).Min;
        return true;
    }
}
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文