Can I find the max/min value in an unsorted array in sub-linear time?
Is it possible? If not, given an array of size n, how do I know whether it's better to just sort the array?
Comments (4)
With just the unsorted array, there is no way to do this in sub-linear time. Since you don't know which element is the largest and which is the smallest, you have to look at them all, hence linear time.
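As an illustration of that linear scan, here is a minimal Python sketch. The function name `min_max_linear` is hypothetical and not part of the original answer; it simply visits every element once.

```python
def min_max_linear(values):
    """Return (smallest, largest) after looking at every element once."""
    if not values:
        raise ValueError("an empty sequence has no min or max")
    smallest = largest = values[0]
    for v in values[1:]:
        if v < smallest:
            smallest = v
        elif v > largest:
            largest = v
    return smallest, largest
```

Every element is inspected exactly once, so the running time grows linearly with the length of the input.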
The best sort you'll find will be worse than that, probably proportional to n log n, so it will be "better" to do the linear scan.
There are other ways to speed up the process if you're allowed to store more information. You can store the minimum and maximum using the following rules:
By doing it that way, probably the vast majority of min/max retrievals are constant time. Only when you've removed a value that was the min or max does the next retrieval require linear time, and only for that one retrieval.
The next retrieval after that will again be constant time since you've calculated and stored them, assuming you don't remove the min/max value in the interim again.
Pseudo-code for just the maximum could be as simple as:
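The pseudo-code itself is not reproduced in this copy of the answer. As a stand-in, here is a minimal Python sketch, assuming the three values are kept together in a plain dict under the keys `list`, `maxval` and `maxcount`. The function name `init_list` and the use of `None` for "no maximum yet" are illustrative assumptions.

```python
def init_list():
    """Create an empty list together with its cached maximum and count."""
    return {
        "list": [],      # the values themselves, in insertion order
        "maxval": None,  # cached maximum; only trustworthy while maxcount > 0
        "maxcount": 0,   # copies of maxval currently in the list (0 = unknown)
    }
```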
In that initialisation code above, we simply create the list and a maximum value and count. It would be easy to add the minimum value and count as well.
To add to the list, we follow the rules above:
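A sketch of the add step in Python, assuming the same dict layout as before (keys `list`, `maxval`, `maxcount`, all illustrative names). This is one possible reading of the rules, not the answer's original pseudo-code: when the maximum is currently unknown, the sketch leaves it unknown rather than guessing.

```python
def add_to_list(state, val):
    """Append val and keep the cached maximum and its count up to date."""
    state["list"].append(val)
    if len(state["list"]) == 1:
        # Adding to an empty list: val is the maximum by definition.
        state["maxval"], state["maxcount"] = val, 1
    elif state["maxcount"] == 0:
        # Maximum is currently unknown; leave it unknown until recomputed.
        return
    elif val > state["maxval"]:
        # A new strict maximum replaces the cached one.
        state["maxval"], state["maxcount"] = val, 1
    elif val == state["maxval"]:
        # Another copy of the existing maximum.
        state["maxcount"] += 1
```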
Deleting is quite simple. Just delete the value. If it was the maximum value, decrement the count of those maximum values. Note that this only makes sense if you know the current maximum - if not, you were already in the state where you were going to have to calculate it so just stay in that state.
The count becoming zero will indicate the maximum is now unknown (you've deleted them all):
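The delete step could be sketched in Python as follows, again assuming a dict with the illustrative keys `list`, `maxval` and `maxcount`. The count is only decremented when the maximum is currently known, which matches the description above.

```python
def del_from_list(state, val):
    """Remove one occurrence of val; note when a copy of the maximum is gone."""
    state["list"].remove(val)  # raises ValueError if val is not present
    if state["maxcount"] > 0 and val == state["maxval"]:
        # One fewer copy of the maximum; reaching zero means "unknown".
        state["maxcount"] -= 1
```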
Getting the maximum is then a matter of knowing when it needs to be calculated (when maxcount is zero). If it doesn't need to be calculated, just return it.
All that pseudo-code uses seemingly global variables, list, maxval and maxcount. In a properly engineered system, they would of course be instance variables so that you can run multiple lists side-by-side.
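To show the instance-variable version, here is a minimal Python sketch that bundles the add, delete and get-maximum steps into one class. The class name `MaxTrackingList` and its method names are hypothetical, and the recomputation uses the built-in `max` and `list.count`, which is two linear passes rather than one.

```python
class MaxTrackingList:
    """A list that caches its maximum and recomputes it only when needed."""

    def __init__(self):
        self.items = []
        self.maxval = None   # only meaningful while maxcount > 0
        self.maxcount = 0    # 0 means the maximum is unknown

    def add(self, val):
        self.items.append(val)
        if len(self.items) == 1:
            self.maxval, self.maxcount = val, 1
        elif self.maxcount == 0:
            return  # unknown stays unknown until get_max() recomputes
        elif val > self.maxval:
            self.maxval, self.maxcount = val, 1
        elif val == self.maxval:
            self.maxcount += 1

    def delete(self, val):
        self.items.remove(val)  # raises ValueError if val is absent
        if self.maxcount > 0 and val == self.maxval:
            self.maxcount -= 1

    def get_max(self):
        if not self.items:
            raise ValueError("list is empty")
        if self.maxcount == 0:
            # Linear-time recomputation, needed only after the max was deleted.
            self.maxval = max(self.items)
            self.maxcount = self.items.count(self.maxval)
        return self.maxval
```

Because the state lives on each instance, two such lists can be used side by side without interfering with each other.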
Given the generic question, I can't imagine any mechanism that would make this happen.
However, if you keep a reference to the min and max value and update the values on every insert / append / replace operation, the amortized cost of min / max lookups can be very cheap.
Sorting the array is very expensive compared to a simple linear scan to find the min and max, so only sort if there is some other benefit. (Of course, insertion sort can provide very similar properties to updating the min and max values on every insert / append / replace operation, so it might be acceptable enough.)
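As a rough illustration of the insertion-sort idea, the sketch below keeps a Python list sorted on every insert using the standard `bisect.insort`, so the minimum and maximum are always at the two ends. The helper name `insert_sorted` is hypothetical. Note that each insert into a Python list still costs linear time for shifting elements, so this only pays off if the sorted order is useful for something else too.

```python
import bisect


def insert_sorted(sorted_values, val):
    """Insert val into an already sorted list, keeping it sorted."""
    bisect.insort(sorted_values, val)


values = []
for v in [7, 2, 9, 4]:
    insert_sorted(values, v)

smallest = values[0]   # minimum is always the first element
largest = values[-1]   # maximum is always the last element
```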
For an unsorted array, finding the min/max is O(N); there is no way to outperform that. For a sorted array it is O(1), but the sort itself is O(N log N), so if you need the min/max only once or close to it, sorting is not useful. But if you perform this operation many times, look at search structures such as a red-black tree or a heap to reorganize the data and avoid linear time in the search.
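For the heap option, a small Python sketch using the standard `heapq` module (a binary min-heap stored in a plain list). The sample data is made up for illustration. Building the heap is linear, reading the minimum is constant time, and each push or pop is logarithmic.

```python
import heapq

data = [5, 1, 8, 3]
heapq.heapify(data)           # O(n): rearrange the list into a min-heap in place

smallest = data[0]            # O(1): the minimum is always at index 0
heapq.heappush(data, 0)       # O(log n): insert while keeping the heap property
new_smallest = data[0]
popped = heapq.heappop(data)  # O(log n): remove and return the minimum
```

A min-heap only gives fast access to the minimum; tracking the maximum the same way would need a second heap or a different structure.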
In this complete answer (with C++ code) that I found, "What is the best way to get the minimum or maximum value from an Array of numbers", it is clearly shown that the total number of comparisons is 3n/2 - 2 if n is even (for odd n the constant factor is also 3/2).
So after ignoring the two constants (the factor 3/2 and the -2), which do not matter for large enough n, it obviously belongs to O(n): it is linear in terms of complexity, but in terms of efficiency (if I can say so) it needs only about 1.5n comparisons to find both the minimum and the maximum, which is excellent.
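The linked C++ code is not reproduced here. As a stand-in, this is a Python sketch of the usual pairwise technique behind that comparison count: elements are processed two at a time, the pair is compared once, and then only the smaller one is checked against the minimum and the larger one against the maximum. The function name `min_max_pairwise` and the returned comparison counter are illustrative additions.

```python
def min_max_pairwise(values):
    """Return (smallest, largest, comparisons) using about 3n/2 comparisons."""
    n = len(values)
    if n == 0:
        raise ValueError("an empty sequence has no min or max")
    comparisons = 0
    if n % 2 == 0:
        # Even length: seed min and max from the first pair (1 comparison).
        comparisons += 1
        if values[0] < values[1]:
            lo, hi = values[0], values[1]
        else:
            lo, hi = values[1], values[0]
        start = 2
    else:
        # Odd length: seed both from the first element (0 comparisons).
        lo = hi = values[0]
        start = 1
    # The remaining count is even, so values[i + 1] always exists.
    for i in range(start, n, 2):
        a, b = values[i], values[i + 1]
        comparisons += 1          # compare the pair with each other
        if a > b:
            a, b = b, a
        comparisons += 2          # smaller vs. min, larger vs. max
        if a < lo:
            lo = a
        if b > hi:
            hi = b
    return lo, hi, comparisons
```

For an even n this performs 1 + 3(n - 2)/2 = 3n/2 - 2 comparisons, and for an odd n it performs 3(n - 1)/2.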