Can you sort n integers in O(n) amortized complexity?

Posted 2024-11-09 17:54:37

Is it theoretically possible to sort an array of n integers in amortized O(n) time?

What about an algorithm whose worst case is O(n)?

Most sorting algorithms in use today are O(n log n) on average and O(n^2) in the worst case.
Some, at the cost of more memory, are O(n log n) in the worst case.

Can you create such an algorithm if there is no limit on memory usage?
What if your memory is limited? How would that hurt your algorithm?

Comments (4)

丢了幸福的猪 2024-11-16 17:54:37

Any page on the intertubes that deals with comparison-based sorts will tell you that you cannot sort faster than O(n lg n) in the worst case with a comparison sort. That is, if your sorting algorithm decides the order by comparing pairs of elements against each other, you cannot do better than that. Examples of comparison sorts include quicksort, bubble sort, and merge sort.

Some algorithms, like counting sort, bucket sort, or radix sort, do not use comparisons. Instead, they rely on properties of the data itself, like the range of values in the data or the size of each value.

Those algorithms can have better complexities. Here is an example scenario:

You are sorting 10^6 integers, and each integer is between 0 and 10. Then you can just count the number of zeros, ones, twos, etc. and spit them back out in sorted order. That is how counting sort works, in O(n + m), where m is the number of distinct values a datum can take (in this case, m = 11).
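
To make the counting idea concrete, here is a minimal Python sketch. The function name `counting_sort` and the `max_value` parameter are my own choices for illustration, and it assumes every input is an integer in the range 0..max_value.

```python
def counting_sort(values, max_value):
    """Sort integers in the range 0..max_value in O(n + m) time, m = max_value + 1."""
    counts = [0] * (max_value + 1)          # one counter per possible value
    for v in values:                        # O(n): tally each value
        counts[v] += 1
    result = []
    for value, count in enumerate(counts):  # O(n + m): emit values in increasing order
        result.extend([value] * count)
    return result


data = [3, 10, 0, 7, 3, 0, 10, 5]
print(counting_sort(data, 10))  # [0, 0, 3, 3, 5, 7, 10, 10]
```

Note that no two elements are ever compared with each other, which is why the comparison-sort bound does not apply here.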

Another:

You are sorting 10^6 binary strings that are all at most 5 characters in length. You can use radix sort for that: first split them into 2 buckets depending on their first character, then radix-sort each bucket on the second character, then the third, fourth, and fifth. (Alternatively, you can make one pass per character over the whole list, from the last character to the first; that works as long as each pass is a stable sort.) Either way you end up with a perfectly sorted list in O(nm), where m is the number of digits or bits in a datum (in this case, m = 5).
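
A rough Python sketch of the radix idea follows. It is not exactly the scheme described above: it uses the least-significant-bit-first variant, where every pass is a stable split over the whole list, and it assumes the inputs are non-negative integers below 2**num_bits rather than strings. The name `radix_sort_bits` is made up for this example.

```python
def radix_sort_bits(values, num_bits):
    """LSD radix sort for non-negative integers below 2**num_bits, in O(n * num_bits)."""
    for bit in range(num_bits):      # least significant bit first
        zeros, ones = [], []
        for v in values:             # appending in arrival order keeps each pass stable
            if (v >> bit) & 1:
                ones.append(v)
            else:
                zeros.append(v)
        values = zeros + ones        # everything with a 0 in this bit goes first
    return values


data = [0b10110, 0b00001, 0b11111, 0b00000, 0b01010]
print(radix_sort_bits(data, 5))  # [0, 1, 10, 22, 31]
```

Stability is what makes this work: after the pass on bit i, the list is sorted by its lowest i + 1 bits, and a later pass never reorders two values that agree on the bit it is looking at.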

But in the general case, you cannot reliably sort faster than O(n lg n) using a comparison sort.

玩套路吗 2024-11-16 17:54:37

I'm not quite happy with the accepted answer so far, so I'm attempting an answer of my own:

Is it theoretically possible to sort an array of n integers in an amortized complexity of O(n)?

The answer to this question depends on the machine that would execute the sorting algorithm. If you have a random access machine that can operate on exactly 1 bit at a time, you can do a radix sort for integers with at most k bits, as was already suggested. So you end up with complexity O(kn).
But if you are operating on a fixed-size word machine with a word size of at least k bits (which all consumer computers are), the best you can achieve is O(n log n). This is because either log n < k, or you could do a counting sort first and then sort with an O(n log n) algorithm, which would also yield the first case.

What about trying to create a worst case of O(n) complexity?

That is not possible. A link was already given. The idea of the proof is that in order to sort by comparisons, you have to decide for every element to be sorted whether it is larger or smaller than any other element to be sorted. By using transitivity this can be represented as a decision tree, which must have at least n! leaves (one for each possible ordering of the input) and therefore has depth at least log2(n!), which is Θ(n log n). So if you want performance better than Ω(n log n), that means removing edges from that decision tree. But if the decision tree is not complete, then how can you make sure that you have made a correct decision about some elements a and b?
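
For reference, the usual counting argument behind the comparison-sort lower bound can be written out as below. This is the standard textbook derivation, not something specific to this answer; d denotes the depth of the decision tree, i.e. the worst-case number of comparisons.

```latex
% A binary decision tree of depth d has at most 2^d leaves,
% and a correct comparison sort needs at least one leaf per permutation of n elements.
\begin{align*}
2^{d} \ge n!
\;\Longrightarrow\;
d &\ge \log_2 n! = \sum_{i=1}^{n} \log_2 i \\
  &\ge \sum_{i=\lceil n/2 \rceil}^{n} \log_2 i \\
  &\ge \frac{n}{2} \log_2 \frac{n}{2} = \Omega(n \log n).
\end{align*}
```

The last step uses the fact that the sum has at least n/2 terms, each of which is at least log2(n/2).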

Can you with no limitation on memory usage create such an algorithm?

So, as argued above, that is not possible, and the remaining questions are therefore of no relevance.

坚持沉默 2024-11-16 17:54:37

If the integers are in a limited range, then an O(n) "sort" of them would involve a bit vector of "n" bits, where n is the size of the range: loop over the integers in question and, for each integer x, set bit x % 8 of the byte at offset x // 8 in that byte array to true. That is an "O(n)" operation. Another loop over that bit array to list/enumerate/return/print all the set bits is, likewise, an O(n) operation. (Naturally O(2n) reduces to O(n).)
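
Something like the following Python sketch shows the bit-vector approach. The names `bitvector_sort` and `limit` are invented for the example, and it assumes the inputs are distinct integers in range(limit); a single bit per value cannot remember duplicates.

```python
def bitvector_sort(values, limit):
    """Return the distinct integers from range(limit) found in values, in sorted order."""
    bits = bytearray((limit + 7) // 8)   # one bit per possible value, rounded up to whole bytes
    for x in values:                     # first pass: mark each value that occurs
        bits[x // 8] |= 1 << (x % 8)
    # second pass: walk the range in order and report every marked value
    return [x for x in range(limit) if bits[x // 8] & (1 << (x % 8))]


print(bitvector_sort([9, 2, 15, 0, 7], 16))  # [0, 2, 7, 9, 15]
```

Both passes are linear, the first in the number of inputs and the second in the size of the range, so the whole thing is linear as long as the range is not much larger than the input.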

This is a special case where n is small enough to fit within memory or in a file (with seek() operations). It is not a general solution, but it is described in Bentley's "Programming Pearls" and was allegedly a practical solution to a real-world problem (involving something like a "freelist" of telephone numbers; something like: find the first available phone number that could be issued to a new subscriber).

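(Note: log2(10**10) is about 33.2, so a bit vector with one bit for every possible integer of up to 10 digits needs about 2**33.2 bits, roughly 1.25 GB. That exceeds 2**31 bits (256 MiB) but fits within 2**31 bytes (2 GiB), so whether it fits in a typical Unix/Linux maximum-sized memory mapping depends on which of those two limits applies.)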

情栀口红 2024-11-16 17:54:37

I believe you are looking for radix sort.
