Do we need to know/find/analyze every case {best, average, and worst... all} of an algorithm?
In books about data structures and algorithms, we often see that they do not analyze every case of every algorithm.
Some algorithms are discussed with only the average case, some with the average and worst cases, and others with all of the best, average, and worst cases.
Why do they tend to do that?
Why don't we need to know all cases for all algorithms?
Best case is generally useless unless you control the inputs. (i.e. best case is usually an anomalous case). Unless it's easy to compute, it's not worth wasting your time.
Average case: it's what you can expect in general. Assuming you work with a large range of inputs, this is usually the most useful thing to consider.
Worst case: a decent tie breaker for two algorithms with the same average case, if you deal with arbitrary inputs (especially if they're untrusted - i.e. you're accepting inputs from people on the web). Also something to consider in design in general - this will come up occasionally. In general, if you have two algorithms that are O(n) average case, but one is O(n lg n) worst case and the other is O(n^2) - it may influence your decision on which to go with. Or it may influence your algorithm design.
Example: quicksort vs. merge sort. Both are O(n lg n) on average. Quicksort's worst case is O(n^2); merge sort's is (IIRC) still O(n lg n) - but in general, quicksort tends to be faster if the data fits in memory. Even though it has a costlier worst case, since we KNOW it has a costlier worst case, we can try to mitigate it (median-of-three instead of just a random partition, etc.) and take advantage of the fact that it is usually faster than mergesort.
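To make the mitigation concrete, here is a sketch of quicksort with a median-of-three pivot. A fixed first-element pivot degrades to O(n^2) on already-sorted input; picking the median of the first, middle, and last elements keeps the partitions balanced in that common case. This is an illustrative sketch, not a library-grade sort.

```python
def quicksort(arr):
    """In-place quicksort with a median-of-three pivot, a common mitigation
    for the O(n^2) worst case on sorted or nearly-sorted input."""
    def sort(lo, hi):
        if lo >= hi:
            return
        mid = (lo + hi) // 2
        # Order arr[lo], arr[mid], arr[hi] so the median sits at mid.
        if arr[mid] < arr[lo]:
            arr[lo], arr[mid] = arr[mid], arr[lo]
        if arr[hi] < arr[lo]:
            arr[lo], arr[hi] = arr[hi], arr[lo]
        if arr[hi] < arr[mid]:
            arr[mid], arr[hi] = arr[hi], arr[mid]
        arr[mid], arr[hi] = arr[hi], arr[mid]  # move pivot to the end
        pivot = arr[hi]
        i = lo
        for j in range(lo, hi):       # Lomuto partition around the pivot
            if arr[j] < pivot:
                arr[i], arr[j] = arr[j], arr[i]
                i += 1
        arr[i], arr[hi] = arr[hi], arr[i]  # pivot into its final position
        sort(lo, i - 1)
        sort(i + 1, hi)
    sort(0, len(arr) - 1)
    return arr
```

On sorted input the median-of-three pivot splits each range roughly in half, so the recursion depth stays around lg n instead of n.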
Worst case analysis is usually the most useful way to spend a moderate amount of effort analysing an algorithm. Average case is more complicated, because the average case usually depends on how likely different inputs are, so you have to state the average case in terms of the probabilities of the different inputs. Best case is not very useful, because a hope that my program might complete in 1 second does not let me plan an activity that will take too long if my program actually takes 3 hours. Knowing that, no matter what, it will complete within five minutes is much more useful.
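A small worked example of how the average case depends on an input distribution, using linear search and assuming (my assumption, for illustration) the target is equally likely to be at any position:

```python
def linear_search_comparisons(n):
    """Comparison counts for linear search over n items.

    Worst case: the target is last, costing n comparisons.
    Average case: under a uniform distribution over positions, the
    expected cost is (1 + 2 + ... + n) / n = (n + 1) / 2.
    A different distribution (e.g. the target is usually near the
    front) would give a different average, which is exactly why the
    average case must be stated relative to input probabilities.
    """
    worst = n
    average = sum(i for i in range(1, n + 1)) / n
    return average, worst

average, worst = linear_search_comparisons(100)  # 50.5 and 100
```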
Best case also has a problem with programs that store a small number of pre-prepared inputs and outputs, then check the input they get against the pre-prepared inputs. If they get a match, then they respond with the pre-prepared output without doing anything else and so get great - but meaningless - best case behaviour.
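The canned-answer trick above can be sketched in a few lines. The memorised input/output pair here is made-up data purely for illustration:

```python
# A single pre-prepared input/output pair (made-up data for illustration).
_CANNED = {(3, 1, 2): (1, 2, 3)}

def sort_with_cheat(items):
    """Illustrates why a great best case can be meaningless: if the input
    matches a memorised one, answer in O(1) without doing any real work;
    every other input pays the normal O(n lg n) sorting cost."""
    key = tuple(items)
    if key in _CANNED:       # best case: constant time, no computation
        return list(_CANNED[key])
    return sorted(items)     # typical case: an actual sort
```

The best case fires only on the single memorised input, so quoting it as "this algorithm sorts in O(1) in the best case" is technically true and completely uninformative.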
There are some cases where worst case analysis is not what you want. Somebody designing an encryption algorithm might want a guarantee that nobody can break it in less than 10 years (unfortunately, typically such guarantees don't exist).