Which is faster to find an item in: a hashtable or a sorted list?

Posted 2024-07-22 06:09:19 · 26 words · 6 views · 0 comments

Which is faster: finding an item in a hashtable or in a sorted list?

Comments (7)

暗恋未遂 2024-07-29 06:09:19

Algorithmic complexity is a good thing to know: hashtable lookups are known to be O(1) on average, while a sorted vector (in your case I guess a sorted array is a better choice than a list) provides O(log n) access time.

But keep in mind that complexity notation describes how access time behaves as N goes to infinity. So if you know that your data will keep growing, complexity notation gives you a hint about which algorithm to choose.

When you know that your data will stay rather small, for instance only a few entries in your array/hashtable, you have to take out your stopwatch and measure. So run a test.

Sorting an array is a comparable case: for a few entries, bubble sort, although O(N^2), may be quicker than quicksort, even though quicksort is O(n log n) on average.

Also, as other answers point out, and depending on your items, you must try to find the best hash function for your hashtable instance. Otherwise lookups in your hashtable may perform dramatically worse (as pointed out in Hank Gay's answer).
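To make the hash-function point concrete, here is a minimal Python sketch (not taken from any of the answers). The classes `Key` and `CollidingKey` and the helper `comparisons_for_lookup` are hypothetical names invented for this illustration. It counts how many equality comparisons a single lookup needs when the hash spreads keys out versus when every key hashes to the same value.

```python
class Key:
    """Hypothetical key type whose hash spreads values across the table."""
    eq_calls = 0  # class-wide counter of equality comparisons

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        Key.eq_calls += 1
        return isinstance(other, Key) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


class CollidingKey(Key):
    """Same equality as Key, but a deliberately bad hash: always 42."""

    def __hash__(self):
        return 42


def comparisons_for_lookup(key_type, n):
    """Build a dict of n keys, then count comparisons for one lookup."""
    table = {key_type(i): i for i in range(n)}
    Key.eq_calls = 0  # ignore comparisons made while building the table
    found = table[key_type(n - 1)]
    assert found == n - 1
    return Key.eq_calls


n = 200
good = comparisons_for_lookup(Key, n)
bad = comparisons_for_lookup(CollidingKey, n)
print(f"good hash: {good} comparison(s), constant hash: {bad} comparison(s)")
```

With the constant hash, the lookup has to compare against the colliding entries one by one, so the table behaves more like an unsorted list than like an O(1) structure.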

Edit: Have a look at this article to understand the meaning of Big O notation.

独夜无伴 2024-07-29 06:09:19

I'm assuming that by 'sorted list' you mean 'random-accessible, sorted collection'. A list in the strict sense can only be traversed element by element, which results in O(N) complexity.

The fastest way to find an element in a sorted, indexable collection is an N-ary search (binary search being the usual case), which is O(log N), while a hashtable without collisions has a find complexity of O(1).
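As a rough illustration of the difference between element-by-element traversal and a binary search over an indexable collection, here is a small Python sketch. The function names and the step counters are made up for this example; they only exist to show how the amount of work grows.

```python
def linear_search(items, target):
    """Scan element by element; returns (index or -1, steps taken)."""
    steps = 0
    for index, item in enumerate(items):
        steps += 1
        if item == target:
            return index, steps
    return -1, steps


def binary_search(sorted_items, target):
    """Halve the search range each step; needs O(1) indexing and sorted input."""
    lo, hi = 0, len(sorted_items) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid, steps
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, steps


data = list(range(0, 2000, 2))  # 1000 sorted even numbers
target = data[-1]               # worst case for the linear scan

print("linear:", linear_search(data, target))
print("binary:", binary_search(data, target))
```

For 1000 elements the linear scan may touch every element, while the binary search needs at most about log2(1000), that is roughly 10, probes.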

情绪失控 2024-07-29 06:09:19

Unless the hashing algorithm is extremely slow (and/or bad), the hashtable will be faster.

UPDATE: As commenters have pointed out, you could also be getting degraded performance from too many collisions, not because your hash algorithm is bad but simply because the hashtable isn't big enough. Most library implementations (at least in high-level languages) will automatically grow your hashtable behind the scenes, which makes the insert that triggers the growth slower than expected, but if you're rolling your own, it's definitely something to consider.
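The effect of a table that is too small can be sketched with a toy separate-chaining hashtable in Python. `ChainedTable`, its `max_load` parameter, and the growth policy (doubling once the load factor is exceeded) are assumptions made for this illustration and do not describe how any particular library does it.

```python
class ChainedTable:
    """Toy separate-chaining hashtable (hypothetical, for illustration only)."""

    def __init__(self, buckets=8, max_load=None):
        self.buckets = [[] for _ in range(buckets)]
        self.size = 0
        self.max_load = max_load  # None means "never grow"

    def _chain(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        chain = self._chain(key)
        for i, (existing, _) in enumerate(chain):
            if existing == key:
                chain[i] = (key, value)  # overwrite an existing entry
                return
        chain.append((key, value))
        self.size += 1
        if self.max_load is not None and self.size > self.max_load * len(self.buckets):
            self._grow()  # this particular insert pays for the rehash

    def _grow(self):
        pairs = [pair for chain in self.buckets for pair in chain]
        self.buckets = [[] for _ in range(2 * len(self.buckets))]
        for key, value in pairs:
            self._chain(key).append((key, value))

    def get(self, key):
        for existing, value in self._chain(key):
            if existing == key:
                return value
        raise KeyError(key)

    def longest_chain(self):
        return max(len(chain) for chain in self.buckets)


fixed = ChainedTable(buckets=8)                  # never grows
growing = ChainedTable(buckets=8, max_load=0.75)  # doubles when too full
for i in range(1000):
    fixed.put(i, str(i))
    growing.put(i, str(i))

print("fixed table, longest chain:  ", fixed.longest_chain())
print("growing table, longest chain:", growing.longest_chain())
```

In the fixed-size table every lookup has to walk a long chain even though the hash function itself is fine; the growing table keeps chains short at the cost of an occasional expensive insert.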

网名女生简单气质 2024-07-29 06:09:19

The get operation on a SortedList is O(log n), while the same operation on a HashTable is O(1). So, normally, the HashTable would be much faster. But this depends on a number of factors:

  • The size of the list
  • Performance of the hashing algorithm
  • Number of collisions / quality of the hashing algorithm

貪欢 2024-07-29 06:09:19

It depends entirely on the amount of data you have stored.

Assuming you have enough memory to throw at it (so the hash table is big enough), the hash table will locate the target data in a fixed amount of time, but the need to calculate the hash will add some (also fixed) overhead.

Searching a sorted list won't have that hashing overhead, but the time required to do the work of actually locating the target data will increase as the list grows.

So, in general, a sorted list will be faster for small data sets. (For extremely small data sets which are frequently changed and/or infrequently searched, an unsorted list may be even faster, since it avoids the overhead of doing the sort.) As the data set becomes large, the growth of the list's search time overshadows the fixed overhead of hashing, and the hash table becomes faster.

Where that breakpoint is will vary depending on your specific hash table and sorted-list-search implementations. Run tests and benchmark performance on a number of typically-sized data sets to see which will actually perform better in your particular case. (Or, if the code already runs "fast enough", don't. Just use whichever you're more comfortable with and don't worry about optimizing something which doesn't need to be optimized.)
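One way to look for that breakpoint is a small benchmark sweep over several sizes. The following Python sketch uses the standard `timeit` and `bisect` modules; the helper `time_lookups`, the key format, and the chosen sizes are assumptions for illustration, and the numbers it prints will differ from machine to machine.

```python
import bisect
import timeit


def time_lookups(n, repeats=2000):
    """Return (dict_seconds, bisect_seconds) for `repeats` lookups at size n."""
    keys = [f"key{i:08d}" for i in range(n)]  # zero-padded, so already sorted
    table = {key: i for i, key in enumerate(keys)}
    target = keys[n // 2]

    def via_dict():
        return table[target]

    def via_bisect():
        i = bisect.bisect_left(keys, target)
        return i if i < len(keys) and keys[i] == target else -1

    # Both strategies must agree before timing them.
    assert via_dict() == via_bisect() == n // 2
    return (timeit.timeit(via_dict, number=repeats),
            timeit.timeit(via_bisect, number=repeats))


for n in (4, 16, 256, 4096, 65536):
    dict_s, bisect_s = time_lookups(n)
    print(f"n={n:>6}  dict={dict_s:.6f}s  sorted+bisect={bisect_s:.6f}s")
```

Note that CPython caches the hash of a string object, so this sketch understates the hashing overhead discussed above; with key types whose hash is recomputed on every lookup the picture may look different, which is one more reason to benchmark with your real data.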

白况 2024-07-29 06:09:19

In some cases, it depends on the size of the collection (and to a lesser degree, implementation details). If your list is very small, 5-10 items maybe, I'd guess the list would be faster. Otherwise xtofl has it right.

网名女生简单气质 2024-07-29 06:09:19

A HashTable would be more efficient for a list containing more than 10 items. If the list has fewer than 10 items, the overhead of the hashing algorithm outweighs the benefit.

If you need a fast dictionary but also need to keep the items in an ordered fashion, use the OrderedDictionary (.NET 2.0 onwards). Note that OrderedDictionary keeps items in insertion order rather than sorting them by key.
