What is the most efficient way to sort a collection of items into buckets?
I have an array of arbitrary hashes, each containing an integer element (call it 'id'). I want to sort these hashes into a number of buckets (the number is constant for the array), where each bucket is an arbitrary range of ids (e.g. 1-10, 15-20, 20-30). What is the best sorting strategy for this? Is it possible to do it without a nested loop?
Comments (3)
If the number of buckets is small, you are probably better off with the nested loops: the outer loop over the hashes, and the inner loop over the buckets. That is O(n*m).
If the number of hashes and the number of buckets are both large, you can sort both and then make a single pass over them. This basically loops through the hashes, adding them to the current bucket and advancing to the next bucket when needed. That is O(n*log(n) + m*log(m)).
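A rough sketch of both approaches in Python, since the question doesn't name a language. The function names `bucket_nested` and `bucket_sweep` are made up for illustration, ranges are treated as inclusive `(low, high)` pairs, and the sweep assumes the ranges do not overlap (so the question's 15-20 / 20-30 example would need adjusting to something like 15-20 / 21-30). Ids that fall in a gap between ranges are simply dropped here, which is an assumption, not something the question specifies.

```python
def bucket_nested(items, ranges):
    """O(n*m): test every item against every range."""
    buckets = [[] for _ in ranges]
    for item in items:
        for i, (low, high) in enumerate(ranges):
            if low <= item["id"] <= high:
                buckets[i].append(item)
                break  # first matching range wins
    return buckets


def bucket_sweep(items, ranges):
    """O(n*log(n) + m*log(m)): sort both, then walk them together.

    Assumes the ranges do not overlap.
    """
    buckets = [[] for _ in ranges]
    # Visit ranges in ascending order but remember their original positions.
    order = sorted(range(len(ranges)), key=lambda i: ranges[i][0])
    pos = 0
    for item in sorted(items, key=lambda h: h["id"]):
        # Advance past every range that ends below this id.
        while pos < len(order) and ranges[order[pos]][1] < item["id"]:
            pos += 1
        if pos == len(order):
            break  # all remaining ids are above the last range
        low, _high = ranges[order[pos]]
        if low <= item["id"]:
            buckets[order[pos]].append(item)
        # Otherwise the id sits in a gap between ranges and is skipped.
    return buckets


items = [{"id": n} for n in [25, 3, 17, 12, 10, 21, 42]]
ranges = [(1, 10), (15, 20), (21, 30)]
print(bucket_sweep(items, ranges))
```

Note that the sweep leaves each bucket ordered by id as a side effect of the initial sort, while the nested version keeps the input order within each bucket.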
If the hashes are good quality, they will exhibit an even distribution, so you can use evenly-distributed buckets to partition the collection in a single pass.
If you also want the hashes sorted within the buckets, use a normal sorting algorithm after everything is in buckets. This would be an unusual use of hashes, however. (If you aren't trying to sort within buckets, then the word "sort" is a misnomer. What you really wanted was partitioning.)
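For what it's worth, a single-pass partition into evenly spaced buckets could look something like the Python sketch below. The name `partition_even` and its parameters are hypothetical, the `id` field name is borrowed from the question, and the sketch assumes you know the inclusive `low`/`high` bounds of the values up front; anything outside those bounds is skipped.

```python
def partition_even(items, low, high, bucket_count):
    """Partition items into bucket_count equal-width buckets in one pass.

    Covers ids in the inclusive range low..high; ids outside it are skipped.
    """
    span = high - low + 1
    buckets = [[] for _ in range(bucket_count)]
    for item in items:
        offset = item["id"] - low
        if 0 <= offset < span:
            # Integer arithmetic maps offset 0..span-1 onto 0..bucket_count-1.
            buckets[offset * bucket_count // span].append(item)
    return buckets


items = [{"id": n} for n in [25, 3, 17, 12, 10, 21, 42]]
buckets = partition_even(items, low=1, high=30, bucket_count=3)

# Optional second step: sort inside each bucket once everything is placed.
for bucket in buckets:
    bucket.sort(key=lambda h: h["id"])
print(buckets)
```

The partition itself is O(n) with no inner loop over buckets, because the bucket index is computed directly from the value. That only works when the buckets are equal width, which is not the case for the arbitrary ranges in the question.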
You don't mention a language/platform, but for efficiency in terms of keystrokes (C#):