快速字符串查找的最佳集合

发布于 2024-10-28 21:46:17 字数 275 浏览 2 评论 0原文

我需要一个字符串列表以及一种快速确定该列表中是否包含字符串的方法。

为了提高查找速度,我考虑了 SortedListDictionary;然而,当我只需要一个字符串时,两者都可以使用KeyValuePair

我知道我可以使用 KeyValuePair 并简单地忽略 Value 部分。但我确实更喜欢高效,只是想知道是否有更适合我要求的集合。

I need a list of strings and a way to quickly determine if a string is contained within that list.

To enhance lookup speed, I considered SortedList and Dictionary; however, both work with KeyValuePairs when all I need is a single string.

I know I could use a KeyValuePair and simply ignore the Value portion. But I do prefer to be efficient and am just wondering if there is a collection better suited to my requirements.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(7

独留℉清风醉 2024-11-04 21:46:17

如果您使用的是 .NET 3.5 或更高版本,请使用 HashSet

如果做不到这一点,Dictionary(或者您想要的 TValue 类型参数的任何类型)将比 SortedList 更快,如果你有很多条目 - 后者将使用二分搜索,因此它将是 O(log n) 查找,而不是 O(1)。

If you're on .NET 3.5 or higher, use HashSet<String>.

Failing that, a Dictionary<string, byte> (or whatever type you want for the TValue type parameter) would be faster than a SortedList if you have a lot of entries - the latter will use a binary search, so it'll be O(log n) lookup, instead of O(1).

朦胧时间 2024-11-04 21:46:17

如果您只想知道字符串是否在集合中,请使用 HashSet

If you just want to know if a string is in the set use HashSet<string>

硪扪都還晓 2024-11-04 21:46:17

这听起来像是 Per MSDN 的工作

 var keys = new HashSet<string>();

:Contains 函数有 O( 1)复杂性。

但您应该注意,添加时不会出现重复错误。

This sounds like a job for

 var keys = new HashSet<string>();

Per MSDN: The Contains function has O(1) complexity.

But you should be aware that it does not give an error for duplicates when adding.

看轻我的陪伴 2024-11-04 21:46:17

HashSet就像一个 < code>Dictionary,但只有键。

HashSet<string> is like a Dictionary, but with only keys.

我纯我任性 2024-11-04 21:46:17

如果您想滚动自己的数据结构,请使用 Trie。
http://en.wikipedia.org/wiki/Trie

最坏的情况是字符串是存在:O(字符串长度)

If you feel like rolling your own data structure, use a Trie.
http://en.wikipedia.org/wiki/Trie

worst-case is if the string is present: O(length of string)

西瑶 2024-11-04 21:46:17

我知道这个答案对本次聚会来说有点晚了,但我遇到了我们的系统运行缓慢的问题。经过分析后,我们发现我们的数据结构结构化方式发生了大量字符串查找。

因此我们做了一些研究,遇到了这些基准测试,做了我们自己的测试,现在已经改用 SortedList。

if (sortedlist.ContainsKey(thekey))
{   
//found it.
}

尽管事实证明字典速度更快,但我们需要重构的代码更少,而且性能的提高对我们来说已经足够好了。

无论如何,想分享该网站,以防其他人遇到类似的问题。它们在数据结构之间进行比较,其中您要查找的字符串是“键”(如哈希表、字典等)或“值”(列表、数组或字典等),这就是我们的字符串所在的位置存储。

I know this answer is a bit late to this party, but I was running into an issue where our systems were running slow. After profiling we found out there was a LOT of string lookups happening with the way we had our data structures structured.

So we did some research, came across these benchmarks, did our own tests, and have switched over to using SortedList now.

if (sortedlist.ContainsKey(thekey))
{   
//found it.
}

Even though a Dictionary proved to be faster, it was less code we had to refactor, and the performance increase was good enough for us.

Anyway, wanted to share the website in case other people are running into similar issues. They do comparisons between data structures where the string you're looking for is a "key" (like HashTable, Dictionary, etc) or in a "value" (List, Array, or in a Dictionary, etc) which is where ours are stored.

挽清梦 2024-11-04 21:46:17

我知道这个问题已经很老了,但我只需要解决同样的问题,只是针对一小部分字符串(2 到 4 之间)。

就我而言,我实际上对字符串数组进行了手动查找,结果比 HashSet(我对其进行了基准测试)要快得多。

for (int i = 0; i < this.propertiesToIgnore.Length; i++)
{
    if (this.propertiesToIgnore[i].Equals(propertyName))
    {
        return true;
    }
}

请注意,它比仅适用于小型数组的哈希集更好!

编辑:仅适用于手动 for 循环,不使用 LINQ,注释中详细信息

I know the question is old as hell, but I just had to solve the same problem, only for a very small set of strings(between 2 and 4).

In my case, I actually used manual lookup over an array of strings which turned up to be much faster than HashSet<string>(I benchmarked it).

for (int i = 0; i < this.propertiesToIgnore.Length; i++)
{
    if (this.propertiesToIgnore[i].Equals(propertyName))
    {
        return true;
    }
}

Note, that it is better than hash set for only for tiny arrays!

EDIT: works only with a manual for loop, do not use LINQ, details in comments

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文