How do I add/combine multiple Big Os into one?

Published 2024-08-11 12:44:45


If I have an algorithm that is composed of (let's say) three sub-algorithms, all with different O() characteristics, e.g.:

  • algorithm A: O(n)
  • algorithm B: O(log(n))
  • algorithm C: O(n log(n))

How do I theoretically estimate the O() for the entire algorithm? I.e. not clocking it or performing any other practical measurement.

Is there a well known formula or procedure?


Comments (4)

情徒 2024-08-18 12:44:45


Is there a well known formula or procedure?

Since that’s basic maths, yes. But it’s even simpler in your case: since all of these give an upper bound, you can simply take the maximum of all the bounds to get the total upper bound.

– Provided that all of these algorithms are chained (rather than nested). If the algorithms are nested, you need to multiply their upper bounds (in the simplest case).

To illustrate, let’s say that you have a container data structure which has a cost of O(log n) for lookup of a single element. You also have an algorithm that needs such a lookup, but has running time O(n) itself, assuming constant costs for lookup, and using a single loop over the input, with a constant number of lookups per loop.

Now, if you want to use the previously mentioned container data structure with this algorithm, its lookup runtime obviously replaces the (assumed) constant-time lookup. So we’ve got the same loop, but each of its iterations now takes O(log n) instead of constant-time O(1), so the overall runtime becomes O(n log n).
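Both compositions described above can be sketched in Python (a hypothetical illustration; the container with O(log n) lookup is stood in for by a sorted list searched with the standard `bisect` module):

```python
import bisect

# Chained (sequential) sub-algorithms: the bounds add, so the total
# is the maximum of them. Here: O(n) + O(n log n) + O(log n) = O(n log n).
def chained(data, target):
    """Run three independent steps one after another."""
    total = sum(data)                          # step A: O(n) scan
    ordered = sorted(data)                     # step C: O(n log n) sort
    idx = bisect.bisect_left(ordered, target)  # step B: O(log n) search
    return total, idx

# Nested sub-algorithms: the bounds multiply. An O(n) loop performing
# an O(log n) lookup per iteration becomes O(n log n) overall.
def nested(data, queries):
    """Count how many queries appear in data, one O(log n) lookup per item."""
    ordered = sorted(data)
    hits = 0
    for q in queries:                          # O(n) iterations
        i = bisect.bisect_left(ordered, q)     # O(log n) lookup each time
        if i < len(ordered) and ordered[i] == q:
            hits += 1
    return hits
```

The function names are made up for the sketch; the point is only the shape of the composition — sequence versus loop body.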

抱猫软卧 2024-08-18 12:44:45


The "total" complexity is the worst case among all the sub-algorithms. In your example: O(n*log(n))

The definition of O(f(n)) is that, starting from some number n0, there is a constant Const such that the total number of actions on an input of size n is less than Const*f(n).

Therefore, if you have a group of sub-algorithms run one after another, the total work will always be less than the sum of all the constants (Const1 + Const2 + ...) multiplied by the worst complexity function (e.g., out of n, n*log n and n^2 it would be n^2). By the definition above, that is simply the worst complexity in the group.

Example:

  1. Let's assume we're sorting an array (n*log n),
  2. then finding the item with a key higher than x (log n).

Assume Const1 for the sort is 5 (meaning we can sort a list of n items in fewer than 5*n*log n actions) and Const2 is 3 (meaning we can find the element in fewer than 3*log n actions).

In this case, it's clear that the total number of actions of both algorithms is less than (5+3)*n*log n, and therefore the total is O(n*log n).
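The two-step example can be sketched in Python (a hypothetical illustration; the standard `sorted` built-in gives the O(n log n) sort and `bisect` the O(log n) search, so the sequential total is dominated by the sort):

```python
import bisect

def first_greater(items, x):
    """Sort, then find the first item with key greater than x.
    Step 1 is O(n log n), step 2 is O(log n); run in sequence the
    total is O(n log n) + O(log n) = O(n log n)."""
    ordered = sorted(items)                 # step 1: O(n log n)
    i = bisect.bisect_right(ordered, x)     # step 2: O(log n)
    return ordered[i] if i < len(ordered) else None
```

The function name is invented for the sketch; the constants 5 and 3 from the answer would simply add in the bound, leaving the same O(n log n).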

只是一片海 2024-08-18 12:44:45


One of the assumptions behind O(n)-style efficiency estimates is that our actual running time approaches the theoretical value, so we shouldn't get too wrapped around the axle trying to account for small variances. A simple iterative search through an unsorted dataset may finish after about n/2 steps on average (we'll find the value by the halfway point), for instance, but it is still an O(n) algorithm.

If you have a function that has to complete three subprocesses on a dataset before it can exit, then the slowest subprocess's running time on that dataset will be the longest pole in the tent, so to speak. In the specific example given, the function ('algorithm') runs in O(n*log(n)) time, and we don't worry about carrying the full sum O(n + n*log(n) + log(n)); that sum differs from n*log(n) by at most a small constant factor. What we care about is the relative magnitude, i.e., whether running time is log(n), or n, or n^2 or n^3, ad infinitum. What we care about is a factor of 10, or 100, or 1000.
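The halfway-point observation can be checked with a small sketch (hypothetical helper names; a plain linear scan with a comparison counter, averaged over every possible target position):

```python
def linear_search(data, target):
    """Scan for target; return (index, comparisons made)."""
    comparisons = 0
    for i, v in enumerate(data):
        comparisons += 1
        if v == target:
            return i, comparisons
    return -1, comparisons

def average_comparisons(n):
    """Average comparisons over all n possible target positions.
    Comes out to about n/2 -- half the worst case, but still Theta(n)."""
    data = list(range(n))
    total = sum(linear_search(data, t)[1] for t in data)
    return total / n
```

For n = 100 the average is 50.5: the constant halves, the growth rate does not.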

撩发小公举 2024-08-18 12:44:45


The complexity of a problem is determined under the condition of n tending towards infinity. That limit is fundamentally why all lower-order terms are dropped; Big O notation even drops polynomial coefficients, which change on different hardware. That said, if you have the actual running times of the functions, you can reasonably add those total times together. This can be meaningful data in a time-dependent environment, such as handling large sets of data where the called functions run more than once. It can also indicate the scalability of a solution and the performance gain from an upgrade on, say, a server. That, however, is a single-machine analysis, and in it the coefficients would not be dropped.

All machines will have different running coefficients for any function, depending on the architecture and how the compiler generates the binary code. This means that if you are designing programs for multiple users on different computers, then such machine-specific calculations are not only extraneous but inaccurate as well.

In the case where the calculations are not extraneous or inaccurate:

The trick with separate structures is that the time function of one is not the time function of the others:

T(n) = x(n) + y(n) + z(n)
x(n) = (t1) * (n^2)
y(n) = (t2) * (log n)
z(n) = (t3) * (n log n)

...

The times t1, t2 and t3 are the measured coefficients of each function on a particular machine.
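As a sketch, that machine-specific total could be computed directly (hypothetical function name; t1, t2, t3 stand for measured per-machine coefficients, with the logarithm taken base 2 for concreteness):

```python
import math

def total_time(n, t1, t2, t3):
    """Single-machine total running time: coefficients are kept, not dropped.
    x, y, z are the separate time functions from the answer above."""
    x = t1 * n ** 2                 # x(n) = t1 * n^2
    y = t2 * math.log2(n)           # y(n) = t2 * log n
    z = t3 * n * math.log2(n)       # z(n) = t3 * n log n
    return x + y + z
```

For example, total_time(8, 1, 1, 1) sums 64 + 3 + 24 = 91 units; the n^2 term already dominates, which is exactly what the asymptotic analysis predicts while the coefficients still matter at this scale.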
