How do you identify the fundamental operation when calculating run-time complexity?
I am trying to determine the worst-case run-time complexity of a couple of algorithms I have created. However, I keep running into a problem: I tend to select the wrong fundamental operation, or the wrong number of them, for an algorithm.
To me, the selection of the fundamental operation appears to be more of an art than a science. After googling and reading my textbooks, I still have not found a good definition. So far I have defined it as "an operation that always occurs within an algorithm's execution", such as a comparison or an array manipulation.
But algorithms often have many comparisons that are always executed, so which operation do you pick?
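To make the ambiguity concrete, here is a minimal sketch (my own illustration, not code from the question) of a linear search in which two different comparisons execute on every iteration, either of which could be picked as the basic operation; both counts grow linearly, so either choice yields O(n).

```python
# Hypothetical example: a linear search with two comparisons per iteration.
# Which one is "the" fundamental operation?
def linear_search(items, target):
    counts = {"index_checks": 0, "element_comparisons": 0}
    i = 0
    while True:
        counts["index_checks"] += 1          # loop-bound check: i < len(items)
        if i >= len(items):
            break
        counts["element_comparisons"] += 1   # data comparison: items[i] == target
        if items[i] == target:
            return i, counts
        i += 1
    return -1, counts

# Both counters are proportional to len(items), so either gives the same order.
print(linear_search([3, 1, 4, 1, 5], 9))     # (-1, {'index_checks': 6, 'element_comparisons': 5})
```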
4 Answers
I agree that to some degree it's an art, so you should always clarify your choice when writing documentation, etc. But usually it's a "visit" to the underlying data structure. So, like you said, for an array it's a comparison or a swap, for a hash map it may be a manual examination of a key, for a graph it's a visit to a vertex or edge, and so on.
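A rough sketch of that "visit" idea for a graph, assuming we take each dequeued vertex and each scanned edge as the basic operations; the code below is only my illustration of the answer's point, not anything prescribed by it.

```python
from collections import deque

def bfs_visit_counts(adjacency, start):
    """Breadth-first search that tallies vertex visits and edge scans."""
    visited = {start}
    queue = deque([start])
    vertex_visits = 0
    edge_scans = 0
    while queue:
        node = queue.popleft()
        vertex_visits += 1                   # one "visit" per dequeued vertex
        for neighbor in adjacency[node]:
            edge_scans += 1                  # one "visit" per edge considered
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return vertex_visits, edge_scans

graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(bfs_visit_counts(graph, "a"))          # (4, 4) -> the familiar O(V + E) bound
```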
Even practicing complexity theorists have disagreements about this sort of thing, so what follows may be a bit subjective: http://blog.computationalcomplexity.org/2009/05/shaving-logs-with-unit-cost.html
The purpose of big-O notation is to summarize the efficiency of an algorithm for the reader. In practical contexts, I am most concerned with how many clock cycles an algorithm takes, assuming that the big-O constant is neither extremely small nor extremely large (and ignoring the effects of the memory hierarchy); this is the "unit-cost" model alluded to in the linked post.
The reason to count comparisons for sorting algorithms is that the cost of a comparison depends on the type of the input data. You could say that a sorting algorithm takes O(c n log n) cycles where c is the expense of a comparison, but it's simpler in this case to count comparisons instead because the other work performed by the algorithm is O(n log n). There's a sorting algorithm that sorts the concatenation of n sorted arrays of length n in n^2 log n steps and n^2 comparisons; here, I would expect that the number of comparisons and the computational overhead be stated separately, because neither necessarily dominates the other.
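One way to report the two quantities separately is to instrument the algorithm with two counters, one for comparisons and one for the other bookkeeping work. The merge sort below is my own sketch (not the n^2-comparison algorithm mentioned above); both counters come out O(n log n), which is why counting only comparisons is a fair summary in that case.

```python
def merge_sort_with_counts(items):
    """Sorts a copy of items, returning (sorted, comparisons, element_moves)."""
    comparisons = 0
    moves = 0

    def sort(a):
        nonlocal comparisons, moves
        if len(a) <= 1:
            return a
        mid = len(a) // 2
        left, right = sort(a[:mid]), sort(a[mid:])
        merged = []
        i = j = 0
        while i < len(left) and j < len(right):
            comparisons += 1                 # the comparison we usually count
            if left[i] <= right[j]:
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j]); j += 1
            moves += 1                       # the other work the algorithm does
        for x in left[i:] + right[j:]:
            merged.append(x)
            moves += 1
        return merged

    return sort(list(items)), comparisons, moves

print(merge_sort_with_counts([5, 2, 4, 6, 1, 3]))
```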
This only works once you have actually implemented the algorithm, but you could simply use a profiler to see which operation is the bottleneck. That's a practical point of view. In theory, some assume that everything that is not the fundamental operation runs in zero time.
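As a concrete version of that practical approach, here is a small sketch using Python's standard cProfile module; the sort_key helper is a made-up stand-in for whatever operation your algorithm might spend its time in.

```python
import cProfile
import random

def sort_key(x):
    # Stand-in for a potentially expensive comparison key (hypothetical).
    return x

def sort_random_list(n):
    data = [random.random() for _ in range(n)]
    return sorted(data, key=sort_key)

# The profiler's call counts and cumulative times show which operation
# dominates in practice, e.g. how many times sort_key was evaluated.
cProfile.run("sort_random_list(10000)")
```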
The somewhat simple definition I have heard is:
For example, in a sorting algorithm, these tend to be comparisons rather than assignments, as you almost always have to visit and 'check' an element before you re-order it, but the check may not result in a re-ordering. So there will always be at least as many comparisons as assignments.
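A small sketch of that observation, using an instrumented insertion sort of my own: every shift of an element inside the inner loop is guarded by a comparison, so the comparison count can never be smaller than the shift count.

```python
def insertion_sort_with_counts(items):
    """Sorts a copy of items, returning (sorted, comparisons, shifts)."""
    a = list(items)
    comparisons = 0
    shifts = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1       # check the element first...
            if a[j] <= key:
                break
            a[j + 1] = a[j]        # ...and only shift it if the check said so
            shifts += 1
            j -= 1
        a[j + 1] = key             # final placement of the key being inserted
    return a, comparisons, shifts

print(insertion_sort_with_counts([4, 3, 1, 2]))   # ([1, 2, 3, 4], 6, 5)
```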