Seeking the best algorithm (asymptotically) and ignoring other details
There are cases where a brute-force approach to a problem has a complexity that is not good enough performance-wise.
Take, for example, Theta(n^2).
Using a recursive approach, it can be improved to Theta(n log n).
Obviously, asymptotically one would opt for the recursive approach, since for larger and larger input N the order of growth is lower, i.e. the performance is better.
My question is the following:
If asymptotically, as N becomes larger and larger, the recursive (i.e. divide-and-conquer) approach performs better, isn't it unrealistic to ignore the fact that, as we recurse on huge inputs of N, we could eventually run out of stack?
As a result, for huge inputs we would never actually get a result.
So how can we be sure that we choose the best algorithm for a specific problem if we ignore these details?
2 Answers
It is possible to implement recursive algorithms without using the hardware stack. If the possibility of a stack overflow is a serious concern, you can implement your own stack or simply limit the depth of the recursion.
However, if the algorithm is dividing a set of data (of size n) recursively, then the number of recursions will be proportional to log n. This means that to exhaust a stack of size s you need data of size on the order of 2^s, which grows extremely quickly with s. People tend to not appreciate how quickly the exponential function grows. My point is that with this type of algorithm (splitting a set of data in each recursive step) it is quite unlikely that the input data can be large enough to need so many recursions as to exhaust the stack.

EDIT: But then I'm not a professional programmer, so I'm lacking on the practical experience front.
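As a concrete illustration of the implement-your-own-stack option, here is a minimal Java sketch (the class and method names are invented for the example): it drives a halving divide-and-conquer over an index range from a java.util.ArrayDeque on the heap, so the hardware call stack stays flat no matter how large the input is.

    import java.util.ArrayDeque;
    import java.util.Deque;

    public class ExplicitStackDemo {
        // Divide-and-conquer over the range [lo, hi), driven by an explicit
        // heap-allocated stack instead of the hardware call stack.
        static void process(int lo, int hi) {
            Deque<int[]> stack = new ArrayDeque<>();
            stack.push(new int[] { lo, hi });
            while (!stack.isEmpty()) {
                int[] range = stack.pop();
                int a = range[0], b = range[1];
                if (b - a <= 1) {
                    continue; // base case: single element; real work goes here
                }
                int mid = a + (b - a) / 2;
                stack.push(new int[] { a, mid }); // "recurse" on the left half
                stack.push(new int[] { mid, b }); // "recurse" on the right half
            }
        }

        public static void main(String[] args) {
            // Because the range is halved at each split, the deque never
            // holds more than roughly log2(n) pending ranges at once.
            process(0, 1 << 20);
        }
    }

This also makes the depth argument above tangible: the pending work held at any moment is proportional to the recursion depth, about log2(n), so even the explicit stack stays tiny.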
Here is a useful trick that I first saw done in quicksort:
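A minimal Java sketch of the idea (my reconstruction, assuming a Lomuto-style partition with a random pivot; the class and helper names are illustrative, not the original answer's code):

    import java.util.Arrays;
    import java.util.concurrent.ThreadLocalRandom;

    public class SmallHalfQuicksort {
        // Sorts a[lo..hi] inclusive. Recurse only on the smaller partition
        // and handle the larger one by looping, so the recursion depth is
        // at most log2(n) even when the splits are badly unbalanced.
        static void quicksort(int[] a, int lo, int hi) {
            while (lo < hi) {
                int p = partition(a, lo, hi);
                if (p - lo < hi - p) {
                    quicksort(a, lo, p - 1); // smaller half: recurse
                    lo = p + 1;              // larger half: iterate
                } else {
                    quicksort(a, p + 1, hi); // smaller half: recurse
                    hi = p - 1;              // larger half: iterate
                }
            }
        }

        // One common partition scheme (Lomuto) with a random pivot.
        static int partition(int[] a, int lo, int hi) {
            int r = ThreadLocalRandom.current().nextInt(lo, hi + 1);
            swap(a, r, hi);
            int pivot = a[hi], i = lo;
            for (int j = lo; j < hi; j++) {
                if (a[j] < pivot) swap(a, i++, j);
            }
            swap(a, i, hi);
            return i;
        }

        static void swap(int[] a, int i, int j) {
            int t = a[i]; a[i] = a[j]; a[j] = t;
        }

        public static void main(String[] args) {
            int[] data = { 5, 3, 8, 1, 9, 2 };
            quicksort(data, 0, data.length - 1);
            System.out.println(Arrays.toString(data)); // [1, 2, 3, 5, 8, 9]
        }
    }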
The point of this is that each recursive call is on the little half of the problem - typically less than half the size of the original problem - and the larger half of the problem is done via iteration. If the problem is of size 2^n, the stack need be of size only n, even if - because of very uneven splits - you end up taking time close to n^2 instead of n log n.
As far as the "real cost" goes of algorithms that use n log n time but loads of stack, note that an algorithm running in n log n time does not have the time to use more than n log n memory, so loads and loads of stack can't actually amount to a huge quantity of memory, although it might be enough to make e.g. a Java interpreter crash your program under the impression that you have runaway recursion.
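For a rough sense of scale: with n = 2^20 (about a million items), n log2 n is about 2 x 10^7, so even if every step touched the stack it could not amount to more than on the order of 2 x 10^7 frames; and with the small-half trick above, the depth never exceeds log2(2^20) = 20 frames.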