Big O, how do you calculate/approximate it?

Posted 2024-10-15 00:26:19

Most people with a degree in CS will certainly know what Big O stands for.
It helps us to measure how well an algorithm scales.

But I'm curious, how do you calculate or approximate the complexity of your algorithms?

尸血腥色 2024-10-22 00:26:20

Don't forget to also allow for space complexity, which can also be a cause for concern if memory resources are limited. For example, you may hear someone ask for a constant-space algorithm, which is basically a way of saying that the amount of space taken by the algorithm doesn't depend on any factors inside the code.

Sometimes the complexity can come from how many times something is called, how often a loop is executed, how often memory is allocated, and so on; that's another part of answering this question.

Lastly, big O can be used for worst-case, best-case, and amortized cases, where generally it is the worst case that is used to describe how bad an algorithm may be.
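
A minimal sketch of the constant-space idea (my own illustration, not part of the answer above): reversing an array in place uses O(1) extra space, while reversing into a fresh copy uses O(n) extra space.

#include <stdlib.h>

/* O(1) extra space: swaps elements in place; the extra memory used
   (two indices and one temporary) does not grow with n. */
void reverse_in_place(int *a, int n) {
    for (int i = 0, j = n - 1; i < j; i++, j--) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}

/* O(n) extra space: allocates a second buffer proportional to the input. */
int *reverse_copy(const int *a, int n) {
    int *b = malloc(n * sizeof *b);   /* caller must free; NULL check omitted */
    for (int i = 0; i < n; i++)
        b[i] = a[n - 1 - i];
    return b;
}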

好多鱼好多余 2024-10-22 00:26:20

What often gets overlooked is the expected behavior of your algorithms. It doesn't change the Big-O of your algorithm, but it does relate to the statement "premature optimization...".

Expected behavior of your algorithm is -- very dumbed down -- how fast you can expect your algorithm to work on data you're most likely to see.

For instance, if you're searching for a value in a list, it's O(n), but if you know that most lists you see have your value up front, typical behavior of your algorithm is faster.

To really nail it down, you need to be able to describe the probability distribution of your "input space" (if you need to sort a list, how often is that list already going to be sorted? how often is it totally reversed? how often is it mostly sorted?). It's not always feasible to know that, but sometimes you do.
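
A small sketch of this idea (my own, under an assumed input distribution): if linear search finds the key at index i with probability p_i, the expected number of comparisons is the sum of (i+1)*p_i. With a front-loaded distribution the expected cost stays tiny even though the worst case is still O(n).

#include <stdio.h>

int main(void) {
    /* Assumed distribution: the key sits at index i with probability
       (1/2)^(i+1), i.e. "most lists have the value up front". */
    int n = 1024;
    double p = 0.5, expected = 0.0;
    for (int i = 0; i < n; i++) {
        expected += (i + 1) * p;   /* i+1 comparisons if found at index i */
        p *= 0.5;
    }
    printf("expected comparisons ~ %.4f, worst case = %d\n", expected, n);
    return 0;
}

The expected value converges to about 2 comparisons, which is the point: typical behavior can be far better than the worst-case bound.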

吖咩 2024-10-22 00:26:20

Great question!

Disclaimer: this answer contains false statements see the comments below.

If you're using Big O, you're talking about the worst case (more on what that means later). Additionally, there is capital theta for the average case and big omega for the best case.

Check out this site for a lovely formal definition of Big O: https://xlinux.nist.gov/dads/HTML/bigOnotation.html

f(n) = O(g(n)) means there are positive constants c and k, such that 0 ≤ f(n) ≤ cg(n) for all n ≥ k. The values of c and k must be fixed for the function f and must not depend on n.


Ok, so now what do we mean by "best-case" and "worst-case" complexities?

This is probably most clearly illustrated through examples. For example if we are using linear search to find a number in a sorted array then the worst case is when we decide to search for the last element of the array as this would take as many steps as there are items in the array. The best case would be when we search for the first element since we would be done after the first check.

The point of all these adjective-case complexities is that we're looking for a way to graph the amount of time a hypothetical program runs to completion in terms of the size of particular variables. However, for many algorithms you can argue that there is not a single time for a particular size of input. Notice that this contradicts the fundamental requirement of a function: any input should have no more than one output. So we come up with multiple functions to describe an algorithm's complexity. Now, even though searching an array of size n may take varying amounts of time depending on what you're looking for in the array and proportionally to n, we can create an informative description of the algorithm using the best-case, average-case, and worst-case classes.

Sorry this is so poorly written and lacks much technical information. But hopefully it'll make time complexity classes easier to think about. Once you become comfortable with these, it becomes a simple matter of parsing through your program, looking for things like for-loops that depend on array sizes, and reasoning, based on your data structures, about what kind of input would result in trivial cases and what input would result in worst cases.
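
A minimal sketch of the linear-search example above (my own code, not from this answer): the comments mark which inputs trigger the best and worst cases.

/* Linear search: returns the index of key in a[0..n-1], or -1 if absent.
   Best case:  key == a[0]              -> 1 comparison.
   Worst case: key == a[n-1] or absent  -> n comparisons, hence O(n). */
int linear_search(const int *a, int n, int key) {
    for (int i = 0; i < n; i++)
        if (a[i] == key)
            return i;
    return -1;
}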

恏ㄋ傷疤忘ㄋ疼 2024-10-22 00:26:20

I don't know how to solve this programmatically, but the first thing people do is sample the algorithm for certain patterns in the number of operations done, say 4n^2 + 2n + 1. We have 2 rules:

  1. If we have a sum of terms, the term with the largest growth rate is kept, with other terms omitted.
  2. If we have a product of several factors, constant factors are omitted.

If we simplify f(x), where f(x) is the formula for the number of operations done (4n^2 + 2n + 1 above), we obtain the big-O value, O(n^2) in this case. But this would have to account for Lagrange interpolation in the program, which may be hard to implement. And what if the real big-O value were O(2^n), or we had something like O(x^n)? This algorithm probably wouldn't be programmable. But if someone proves me wrong, give me the code...
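
One crude empirical sketch in that spirit (my own, assuming you can count operations): compare the operation count at n and 2n; for a polynomial cost n^k the ratio approaches 2^k, so log2 of the ratio estimates the exponent. As noted above, this tells you nothing reliable about exponential cases.

#include <math.h>
#include <stdio.h>

/* Hypothetical instrumented cost function matching the example above:
   ops(n) = 4n^2 + 2n + 1. */
long long ops(long long n) {
    return 4 * n * n + 2 * n + 1;
}

int main(void) {
    for (long long n = 1024; n <= 1048576; n *= 4) {
        double ratio = (double)ops(2 * n) / (double)ops(n);
        /* log2(ratio) tends to 2, the exponent of the dominant n^2 term */
        printf("n=%-8lld ratio=%.4f estimated exponent=%.4f\n",
               n, ratio, log2(ratio));
    }
    return 0;
}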

平安喜乐 2024-10-22 00:26:20

For code A, the outer loop will execute n+1 times; the extra '1' accounts for the final check of whether i still meets the loop condition. The inner loop runs n times, n-2 times, and so on. Thus the total is n + (n-2) + ... + 2 + 0 = n(n+2)/4 = O(n²).

For code B, though the inner loop never steps in to execute foo(), its loop condition is still evaluated once per iteration of the outer loop, i.e. n times, which is O(n).

月牙弯弯 2024-10-22 00:26:19

I'll do my best to explain it here in simple terms, but be warned that this topic takes my students a couple of months to finally grasp. You can find more information in Chapter 2 of the Data Structures and Algorithms in Java book.


There is no mechanical procedure that can be used to get the BigOh.

As a "cookbook", to obtain the BigOh from a piece of code you first need to realize that you are creating a math formula to count how many steps of computations get executed given an input of some size.

The purpose is simple: to compare algorithms from a theoretical point of view, without the need to execute the code. The lesser the number of steps, the faster the algorithm.

For example, let's say you have this piece of code:

int sum(int* data, int N) {
    int result = 0;               // 1

    for (int i = 0; i < N; i++) { // 2
        result += data[i];        // 3
    }

    return result;                // 4
}

This function returns the sum of all the elements of the array, and we want to create a formula to count the computational complexity of that function:

Number_Of_Steps = f(N)

So we have f(N), a function to count the number of computational steps. The input of the function is the size of the structure to process. It means that this function is called such as:

Number_Of_Steps = f(data.length)

The parameter N takes the data.length value. Now we need the actual definition of the function f(). This is done from the source code, in which each interesting line is numbered from 1 to 4.

There are many ways to calculate the BigOh. From this point forward we are going to assume that every sentence that doesn't depend on the size of the input data takes a constant number C of computational steps.

We are going to add the individual number of steps of the function, and neither the local variable declaration nor the return statement depends on the size of the data array.

That means that lines 1 and 4 take C steps each, and the function is somewhat like this:

f(N) = C + ??? + C

The next part is to define the value of the for statement. Remember that we are counting the number of computational steps, meaning that the body of the for statement gets executed N times. That's the same as adding C, N times:

f(N) = C + (C + C + ... + C) + C = C + N * C + C

There is no mechanical rule to count how many times the body of the for gets executed; you need to count it by looking at what the code does. To simplify the calculations, we are ignoring the variable initialization, condition, and increment parts of the for statement.

To get the actual BigOh we need the asymptotic analysis of the function. This is roughly done like this:

  1. Take away all the constants C.
  2. From f() get the polynomial in its standard form.
  3. Divide the terms of the polynomial and sort them by the rate of growth.
  4. Keep the one that grows bigger when N approaches infinity.

Our f() has two terms:

f(N) = 2 * C * N ^ 0 + 1 * C * N ^ 1

Taking away all the C constants and redundant parts:

f(N) = 1 + N ^ 1

Since the last term is the one which grows bigger as N approaches infinity (think on limits) this is the BigOh argument, and the sum() function has a BigOh of:

O(N)

There are a few tricks to solve some tricky ones: use summations whenever you can.

As an example, this code can be easily solved using summations:

for (i = 0; i < 2*n; i += 2) {  // 1
    for (j=n; j > i; j--) {     // 2
        foo();                  // 3
    }
}

The first thing you need to ask is the order of execution of foo(). While it is usually O(1), you need to ask your professors about it. O(1) means (almost, mostly) constant C, independent of the size N.

The for statement on sentence number one is tricky. While the index ends at 2 * N, the increment is done by two. That means that the first for gets executed only N times, and we need to divide the count by two.

f(N) = Summation(i from 1 to 2 * N / 2)( ... ) = 
     = Summation(i from 1 to N)( ... )

Sentence number two is even trickier since it depends on the value of i. Take a look: the index i takes the values 0, 2, 4, 6, 8, ..., 2 * N, and the second for gets executed N times the first time, N - 2 the second, N - 4 the third... up to the N / 2 stage, after which the second for never gets executed.

On formula, that means:

f(N) = Summation(i from 1 to N)( Summation(j = ???)(  ) )

Again, we are counting the number of steps. And by definition, every summation should always start at one, and end at a number greater than or equal to one.

f(N) = Summation(i from 1 to N)( Summation(j from 1 to N - (i - 1) * 2)( C ) )

(We are assuming that foo() is O(1) and takes C steps.)

We have a problem here: when i takes the value N / 2 + 1 upwards, the inner summation ends at a negative number! That's impossible and wrong. We need to split the summation in two, with the pivotal point being the moment i reaches N / 2 + 1.

f(N) = Summation(i from 1 to N / 2)( Summation(j from 1 to N - (i - 1) * 2)( C ) ) + Summation(i from N / 2 + 1 to N)( C )

From the pivotal moment on (i > N / 2), the inner for won't get executed, and we charge a constant C per such outer iteration for the work that still happens (the loop test on its body).

Now the summations can be simplified using some identity rules:

  1. Summation(w from 1 to N)( C ) = N * C
  2. Summation(w from 1 to N)( A (+/-) B ) = Summation(w from 1 to N)( A ) (+/-) Summation(w from 1 to N)( B )
  3. Summation(w from 1 to N)( w * C ) = C * Summation(w from 1 to N)( w ) (C is a constant, independent of w)
  4. Summation(w from 1 to N)( w ) = (N * (N + 1)) / 2

Applying some algebra:

f(N) = Summation(i from 1 to N / 2)( (N - (i - 1) * 2) * ( C ) ) + (N / 2)( C )

f(N) = C * Summation(i from 1 to N / 2)( (N - (i - 1) * 2)) + (N / 2)( C )

f(N) = C * (Summation(i from 1 to N / 2)( N ) - Summation(i from 1 to N / 2)( (i - 1) * 2)) + (N / 2)( C )

f(N) = C * (( N ^ 2 / 2 ) - 2 * Summation(i from 1 to N / 2)( i - 1 )) + (N / 2)( C )

=> Summation(i from 1 to N / 2)( i - 1 ) = Summation(i from 1 to N / 2 - 1)( i )

f(N) = C * (( N ^ 2 / 2 ) - 2 * Summation(i from 1 to N / 2 - 1)( i )) + (N / 2)( C )

f(N) = C * (( N ^ 2 / 2 ) - 2 * ( (N / 2 - 1) * (N / 2 - 1 + 1) / 2) ) + (N / 2)( C )

=> (N / 2 - 1) * (N / 2 - 1 + 1) / 2 = 

   (N / 2 - 1) * (N / 2) / 2 = 

   ((N ^ 2 / 4) - (N / 2)) / 2 = 

   (N ^ 2 / 8) - (N / 4)

f(N) = C * (( N ^ 2 / 2 ) - 2 * ( (N ^ 2 / 8) - (N / 4) )) + (N / 2)( C )

f(N) = C * (( N ^ 2 / 2 ) - ( (N ^ 2 / 4) - (N / 2) )) + (N / 2)( C )

f(N) = C * (( N ^ 2 / 2 ) - (N ^ 2 / 4) + (N / 2)) + (N / 2)( C )

f(N) = C * ( N ^ 2 / 4 ) + C * (N / 2) + C * (N / 2)

f(N) = C * ( N ^ 2 / 4 ) + 2 * C * (N / 2)

f(N) = C * ( N ^ 2 / 4 ) + C * N

f(N) = C * 1/4 * N ^ 2 + C * N

And the BigOh is:

O(N²)
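
As a quick empirical cross-check (my own sketch, not part of the original answer): counting how many times the inner body actually runs for a few values of n should track the leading N^2/4 term of the derived formula.

#include <stdio.h>

static long long calls = 0;
void foo(void) { calls++; }   /* stand-in for foo(), assumed O(1) as above */

int main(void) {
    for (int n = 100; n <= 1600; n *= 2) {
        calls = 0;
        for (int i = 0; i < 2 * n; i += 2) {
            for (int j = n; j > i; j--) {
                foo();
            }
        }
        /* The derived leading term is N^2/4; the ratio should approach 1. */
        printf("n=%-5d calls=%-10lld calls/(n*n/4)=%.3f\n",
               n, calls, calls / (n * (double)n / 4.0));
    }
    return 0;
}
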
绝不放开 2024-10-22 00:26:19

Big O gives an upper bound for the time complexity of an algorithm. It is usually used in conjunction with processing data sets (lists) but can be used elsewhere.

A few examples of how it's used in C code.

Say we have an array of n elements

int array[n];

If we wanted to access the first element of the array this would be O(1) since it doesn't matter how big the array is, it always takes the same constant time to get the first item.

x = array[0];

If we wanted to find a number in the list:

for(int i = 0; i < n; i++){
    if(array[i] == numToFind){ return i; }
}

This would be O(n) since at most we would have to look through the entire list to find our number. The Big-O is still O(n) even though we might find our number the first try and run through the loop once because Big-O describes the upper bound for an algorithm (omega is for lower bound and theta is for tight bound).

When we get to nested loops:

for(int i = 0; i < n; i++){
    for(int j = i; j < n; j++){
        array[j] += 2;
    }
}

This is O(n^2) since for each pass of the outer loop ( O(n) ) we have to go through (up to) the entire list again, so the n's multiply, leaving us with n squared.

This is barely scratching the surface but when you get to analyzing more complex algorithms complex math involving proofs comes into play. Hope this familiarizes you with the basics at least though.

失而复得 2024-10-22 00:26:19

While knowing how to figure out the Big O time for your particular problem is useful, knowing some general cases can go a long way in helping you make decisions in your algorithm.

Here are some of the most common cases, lifted from http://en.wikipedia.org/wiki/Big_O_notation#Orders_of_common_functions:

O(1) - Determining if a number is even or odd; using a constant-size lookup table or hash table

O(log n) - Finding an item in a sorted array with a binary search

O(n) - Finding an item in an unsorted list; adding two n-digit numbers

O(n^2) - Multiplying two n-digit numbers by a simple algorithm; adding two n×n matrices; bubble sort or insertion sort

O(n^3) - Multiplying two n×n matrices by a simple algorithm

O(c^n) - Finding the (exact) solution to the traveling salesman problem using dynamic programming; determining if two logical statements are equivalent using brute force

O(n!) - Solving the traveling salesman problem via brute-force search

O(n^n) - Often used instead of O(n!) to derive simpler formulas for asymptotic complexity
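
As a concrete sketch of the O(log n) entry (my own code, not from the list's source): binary search halves the live range every iteration, so the loop runs at most about log2(n) + 1 times.

/* Binary search in a sorted array -> O(log n): the range [lo, hi]
   halves on every iteration. Returns the index of key, or -1. */
int binary_search(const int *a, int n, int key) {
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   /* avoids overflow of (lo + hi) / 2 */
        if (a[mid] == key) return mid;
        if (a[mid] < key)  lo = mid + 1;
        else               hi = mid - 1;
    }
    return -1;
}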

清晨说晚安 2024-10-22 00:26:19

Small reminder: the big O notation is used to denote asymptotic complexity (that is, when the size of the problem grows to infinity), and it hides a constant.

This means that between an algorithm in O(n) and one in O(n^2), the fastest is not always the first one (though there always exists a value of n such that for problems of size > n, the first algorithm is the fastest).

Note that the hidden constant very much depends on the implementation!

Also, in some cases, the runtime is not a deterministic function of the size n of the input. Take sorting using quicksort for example: the time needed to sort an array of n elements is not fixed for a given n but depends on the starting configuration of the array.

There are different time complexities:

  • Worst case (usually the simplest to figure out, though not always very meaningful)
  • Average case (usually much harder to figure out...)

  • ...

A good introduction is An Introduction to the Analysis of Algorithms by R. Sedgewick and P. Flajolet.

As you say, premature optimisation is the root of all evil, and (if possible) profiling really should always be used when optimising code. It can even help you determine the complexity of your algorithms.
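
A small sketch of the hidden-constant point (my own, with made-up constants): an algorithm costing 100*n steps beats one costing n*n steps only past the crossover at n = 100.

#include <stdio.h>

/* Hypothetical step counts: 100*n is an O(n) algorithm with a big hidden
   constant; n*n is an O(n^2) algorithm with constant 1. The asymptotically
   better algorithm only wins once n exceeds the crossover point n = 100. */
int main(void) {
    long ns[] = {10, 100, 1000, 10000};
    for (int k = 0; k < 4; k++) {
        long n = ns[k];
        printf("n=%-6ld  100*n=%-10ld  n*n=%ld\n", n, 100 * n, n * n);
    }
    return 0;
}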

愛上了 2024-10-22 00:26:19

Seeing the answers here I think we can conclude that most of us do indeed approximate the order of the algorithm by looking at it and using common sense, instead of calculating it with, for example, the master method as we were taught at university.
With that said I must add that even the professor encouraged us (later on) to actually think about it instead of just calculating it.

Also I would like to add how it is done for recursive functions:

suppose we have a function like (scheme code):

(define (fac n)
    (if (= n 0)
        1
        (* n (fac (- n 1)))))

which recursively calculates the factorial of the given number.

The first step is to try and determine the performance characteristic of the body of the function only. In this case, nothing special is done in the body, just a multiplication (or the return of the value 1).

So the performance for the body is: O(1) (constant).

Next try and determine this for the number of recursive calls. In this case we have n-1 recursive calls.

So the performance for the recursive calls is: O(n-1) (order is n, as we throw away the insignificant parts).

Then put those two together and you then have the performance for the whole recursive function:

1 * (n-1) = O(n)


Peter, to answer your raised issues; the method I describe here actually handles this quite well. But keep in mind that this is still an approximation and not a full mathematically correct answer. The method described here is also one of the methods we were taught at university, and if I remember correctly was used for far more advanced algorithms than the factorial I used in this example.
Of course it all depends on how well you can estimate the running time of the body of the function and the number of recursive calls, but that is just as true for the other methods.
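
Put as recurrences (my own restatement of the same reasoning, not part of the original answer): the factorial above satisfies T(n) = T(n-1) + c with T(0) = c, which unrolls to T(n) = (n+1)*c = O(n). The same template covers, for example, binary search: T(n) = T(n/2) + c unrolls to T(n) = c*log2(n) + c = O(log n).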

心房敞 2024-10-22 00:26:19

If your cost is a polynomial, just keep the highest-order term, without its multiplier. E.g.:

O((n/2 + 1)*(n/2)) = O(n^2/4 + n/2) = O(n^2/4) = O(n^2)

This doesn't work for infinite series, mind you. There is no single recipe for the general case, though for some common cases, the following inequalities apply:

O(log N) < O(N) < O(N log N) < O(N^2) < O(N^k) < O(e^N) < O(N!)

爱的十字路口 2024-10-22 00:26:19

I think about it in terms of information. Any problem consists of learning a certain number of bits.

Your basic tool is the concept of decision points and their entropy. The entropy of a decision point is the average information it will give you. For example, if a program contains a decision point with two branches, its entropy is the sum of the probability of each branch times the log2 of the inverse probability of that branch. That's how much you learn by executing that decision.

For example, an if statement having two branches, both equally likely, has an entropy of 1/2 * log(2/1) + 1/2 * log(2/1) = 1/2 * 1 + 1/2 * 1 = 1. So its entropy is 1 bit.

Suppose you are searching a table of N items, like N=1024. That is a 10-bit problem because log(1024) = 10 bits. So if you can search it with IF statements that have equally likely outcomes, it should take 10 decisions.

That's what you get with binary search.

Suppose you are doing linear search. You look at the first element and ask if it's the one you want. The probabilities are 1/1024 that it is, and 1023/1024 that it isn't. The entropy of that decision is 1/1024*log(1024/1) + 1023/1024 * log(1024/1023) = 1/1024 * 10 + 1023/1024 * about 0 = about .01 bit. You've learned very little! The second decision isn't much better. That is why linear search is so slow. In fact it's exponential in the number of bits you need to learn.

Suppose you are doing indexing. Suppose the table is pre-sorted into a lot of bins, and you use some or all of the bits in the key to index directly to the table entry. If there are 1024 bins, the entropy is 1/1024 * log(1024) + 1/1024 * log(1024) + ... for all 1024 possible outcomes. This is 1/1024 * 10 times 1024 outcomes, or 10 bits of entropy for that one indexing operation. That is why indexing search is fast.

Now think about sorting. You have N items, and you have a list. For each item, you have to search for where the item goes in the list, and then add it to the list. So sorting takes roughly N times the number of steps of the underlying search.

So sorts based on binary decisions having roughly equally likely outcomes all take about O(N log N) steps. An O(N) sort algorithm is possible if it is based on indexing search.

I've found that nearly all algorithmic performance issues can be looked at in this way.
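
A tiny sketch of the entropy arithmetic above (my own code): it reproduces the roughly 1 bit per balanced decision versus roughly 0.01 bits for the first step of a linear search over 1024 items.

#include <math.h>
#include <stdio.h>

/* Entropy (in bits) of a two-way decision whose first branch has probability p. */
double entropy2(double p) {
    double q = 1.0 - p;
    return p * log2(1.0 / p) + q * log2(1.0 / q);
}

int main(void) {
    printf("balanced if (p = 1/2):            %.4f bits\n", entropy2(0.5));
    printf("linear-search step (p = 1/1024):  %.4f bits\n", entropy2(1.0 / 1024));
    return 0;
}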

御守 2024-10-22 00:26:19

Let's start from the beginning.

First of all, accept the principle that certain simple operations on data can be done in O(1) time, that is, in time that is independent of the size of the input. These primitive operations in C consist of

  1. Arithmetic operations (e.g. + or %).
  2. Logical operations (e.g., &&).
  3. Comparison operations (e.g., <=).
  4. Structure accessing operations (e.g. array-indexing like A[i], or pointer following with the -> operator).
  5. Simple assignment such as copying a value into a variable.
  6. Calls to library functions (e.g., scanf, printf).

The justification for this principle requires a detailed study of the machine instructions (primitive steps) of a typical computer. Each of the described operations can be done with some small number of machine instructions; often only one or two instructions are needed.
As a consequence, several kinds of statements in C can be executed in O(1) time, that is, in some constant amount of time independent of input. These simple statements include

  1. Assignment statements that do not involve function calls in their expressions.
  2. Read statements.
  3. Write statements that do not require function calls to evaluate arguments.
  4. The jump statements break, continue, goto, and return expression, where
    expression does not contain a function call.

In C, many for-loops are formed by initializing an index variable to some value and
incrementing that variable by 1 each time around the loop. The for-loop ends when
the index reaches some limit. For instance, the for-loop

for (i = 0; i < n-1; i++) 
{
    small = i;
    for (j = i+1; j < n; j++)
        if (A[j] < A[small])
            small = j;
    temp = A[small];
    A[small] = A[i];
    A[i] = temp;
}

uses index variable i. It increments i by 1 each time around the loop, and the iterations
stop when i reaches n − 1.

However, for the moment, focus on the simple form of for-loop, where the difference between the final and initial values, divided by the amount by which the index variable is incremented tells us how many times we go around the loop. That count is exact, unless there are ways to exit the loop via a jump statement; it is an upper bound on the number of iterations in any case.

For instance, the for-loop iterates ((n − 1) − 0)/1 = n − 1 times,
since 0 is the initial value of i, n − 1 is the highest value reached by i (i.e., when i
reaches n−1, the loop stops and no iteration occurs with i = n−1), and 1 is added
to i at each iteration of the loop.

In the simplest case, where the time spent in the loop body is the same for each
iteration, we can multiply the big-oh upper bound for the body by the number of
times around the loop. Strictly speaking, we must then add O(1) time to initialize
the loop index and O(1) time for the first comparison of the loop index with the
limit, because we test one more time than we go around the loop. However, unless
it is possible to execute the loop zero times, the time to initialize the loop and test
the limit once is a low-order term that can be dropped by the summation rule.


Now consider this example:

(1) for (j = 0; j < n; j++)
(2)   A[i][j] = 0;

We know that line (1) takes O(1) time. Clearly, we go around the loop n times, as
we can determine by subtracting the lower limit from the upper limit found on line
(1) and then adding 1. Since the body, line (2), takes O(1) time, we can neglect the
time to increment j and the time to compare j with n, both of which are also O(1).
Thus, the running time of lines (1) and (2) is the product of n and O(1), which is O(n).

Similarly, we can bound the running time of the outer loop consisting of lines
(2) through (4), which is

(2) for (i = 0; i < n; i++)
(3)     for (j = 0; j < n; j++)
(4)         A[i][j] = 0;

We have already established that the loop of lines (3) and (4) takes O(n) time.
Thus, we can neglect the O(1) time to increment i and to test whether i < n in
each iteration, concluding that each iteration of the outer loop takes O(n) time.

The initialization i = 0 of the outer loop and the (n + 1)st test of the condition
i < n likewise take O(1) time and can be neglected. Finally, we observe that we go
around the outer loop n times, taking O(n) time for each iteration, giving a total
O(n^2) running time.



安人多梦 2024-10-22 00:26:19

If you want to estimate the order of your code empirically rather than by analyzing the code, you could stick in a series of increasing values of n and time your code. Plot your timings on a log-log scale. If the code is O(x^n), the values should fall on a line of slope n.

This has several advantages over just studying the code. For one thing, you can see whether you're in the range where the run time approaches its asymptotic order. Also, you may find that some code that you thought was order O(x) is really order O(x^2), for example, because of time spent in library calls.
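
A rough timing harness in that spirit (my own sketch; the workload is a stand-in for whatever code you are measuring): print n and the elapsed seconds, then plot log(time) against log(n) and read off the slope.

#include <stdio.h>
#include <time.h>

/* Stand-in workload believed to be quadratic; replace with the code under test. */
long long work(long n) {
    long long acc = 0;
    for (long i = 0; i < n; i++)
        for (long j = 0; j < n; j++)
            acc += i ^ j;
    return acc;
}

int main(void) {
    for (long n = 1000; n <= 16000; n *= 2) {
        clock_t t0 = clock();
        volatile long long r = work(n);   /* volatile keeps the call from being optimized away */
        double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;
        printf("%ld,%.6f\n", n, secs);    /* on log-log axes, slope ~2 suggests O(n^2) */
        (void)r;
    }
    return 0;
}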

怀中猫帐中妖 2024-10-22 00:26:19

Basically the thing that crops up 90% of the time is just analyzing loops. Do you have single, double, triple nested loops? Then you have O(n), O(n^2), O(n^3) running time.

Very rarely (unless you are writing a platform with an extensive base library, like for instance the .NET BCL or C++'s STL) will you encounter anything more difficult than just looking at your loops (for statements, while, goto, etc...).

╭⌒浅淡时光〆 2024-10-22 00:26:19

Less useful generally, I think, but for the sake of completeness there is also a Big Omega Ω, which defines a lower-bound on an algorithm's complexity, and a Big Theta Θ, which defines both an upper and lower bound.
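
For completeness, mirroring the formal definition of Big O quoted earlier (these are the standard definitions, not part of this answer): f(n) = Ω(g(n)) means there are positive constants c and k such that f(n) ≥ c·g(n) ≥ 0 for all n ≥ k, and f(n) = Θ(g(n)) means that both f(n) = O(g(n)) and f(n) = Ω(g(n)) hold.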

澜川若宁 2024-10-22 00:26:19

Big O notation is useful because it's easy to work with and hides unnecessary complications and details (for some definition of unnecessary). One nice way of working out the complexity of divide and conquer algorithms is the tree method. Let's say you have a version of quicksort with the median procedure, so you split the array into perfectly balanced subarrays every time.

Now build a tree corresponding to all the arrays you work with. At the root you have the original array, the root has two children which are the subarrays. Repeat this until you have single element arrays at the bottom.

Since we can find the median in O(n) time and split the array in two parts in O(n) time, the work done at each node is O(k) where k is the size of the array. Each level of the tree contains (at most) the entire array, so the work per level is O(n) (the sizes of the subarrays add up to n, and since we have O(k) per node we can add this up). There are only log(n) levels in the tree since each time we halve the input.

Therefore we can upper bound the amount of work by O(n*log(n)).
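
Written as a recurrence (my restatement of the tree argument above): T(n) = 2*T(n/2) + c*n with T(1) = c. Each level of the tree costs c*n and there are log2(n) levels, so T(n) = O(n log n); this is also a direct case of the master method that other answers mention.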

However, Big O hides some details which we sometimes can't ignore. Consider computing the Fibonacci sequence with

a=0;
b=1;
for (i = 0; i <n; i++) {
    tmp = b;
    b = a + b;
    a = tmp;
}

and let's just assume the a and b are BigIntegers in Java or something that can handle arbitrarily large numbers. Most people would say this is an O(n) algorithm without flinching. The reasoning is that you have n iterations in the for loop and O(1) work inside the loop.

But Fibonacci numbers are large, the n-th Fibonacci number is exponential in n so just storing it will take on the order of n bytes. Performing addition with big integers will take O(n) amount of work. So the total amount of work done in this procedure is

1 + 2 + 3 + ... + n = n(n+1)/2 = O(n^2)

So this algorithm runs in quadratic time!

最佳男配角 2024-10-22 00:26:19

Familiarity with the algorithms/data structures I use and/or quick glance analysis of iteration nesting. The difficulty is when you call a library function, possibly multiple times - you can often be unsure of whether you are calling the function unnecessarily at times or what implementation they are using. Maybe library functions should have a complexity/efficiency measure, whether that be Big O or some other metric, that is available in documentation or even IntelliSense.

悲喜皆因你 2024-10-22 00:26:19

I would like to explain Big-O from a slightly different angle.

Big-O just compares the complexity of programs, meaning how fast they grow when the inputs increase, not the exact time spent doing the action.

IMHO, in big-O formulas you had better not use more complex equations (you might just stick to the ones in the following graph). However, you could still use other, more precise formulas (like 3^n, n^3, ...), but more than that can sometimes be misleading! So better to keep it as simple as possible.

[image: growth-rate chart of common complexity classes]

I would like to emphasize once again that here we don't want to get an exact formula for our algorithm. We only want to show how it grows when the inputs are growing and compare with the other algorithms in that sense. Otherwise you would be better off using different methods, like benchmarking.

嘿哥们儿 2024-10-22 00:26:19

First of all, the accepted answer is trying to explain nice fancy stuff,
but I think intentionally complicating Big-Oh is not the solution
that programmers (or at least, people like me) search for.

Big Oh (in short)

function f(text) {
  var n = text.length;
  for (var i = 0; i < n; i++) {
    f(text.slice(0, n-1))
  }
  // ... other JS logic here, which we can ignore ...
}

Big Oh of the above is f(n) = O(n!), where n represents the number of items in the input set,
and f represents the operations done per item.


Big-Oh notation is the asymptotic upper-bound of the complexity of an algorithm.
In programming: The assumed worst-case time taken,
or assumed maximum repeat count of logic, for size of the input.

Calculation

Keep in mind (from the above meaning) that we just need the worst-case time and/or the maximum repeat count affected by N (the size of the input).
Then take another look at the (accepted answer's) example:

for (i = 0; i < 2*n; i += 2) {  // line 123
    for (j=n; j > i; j--) {     // line 124
        foo();                  // line 125
    }
}
  1. Begin with this search-pattern:

    • Find first line that N caused repeat behavior,
    • Or caused increase of logic executed,
    • But constant or not, ignore anything before that line.
  2. Seems line 123 is what we are searching for ;-)

    • On first sight, line seems to have 2*n max-looping.
    • But looking again, we see i += 2 (and that half is skipped).
    • So, max repeat is simply n, write it down, like f(n) = O( n but don't close parenthesis yet.
  3. Repeat search till method's end, and find next line matching our search-pattern, here that's line 124

    • Which is tricky, because strange condition, and reverse looping.
    • But after remembering that we just need to consider maximum repeat count (or worst-case time taken).
    • It's as easy as saying "Reverse-Loop j starts with j=n, am I right? yes, n seems to be maximum possible repeat count", so:
      • Add n to previous write down's end,
      • but like "( n" instead of "+ n" (as this is inside previous loop),
      • and close parenthesis only if we find something outside of previous loop.

Search done! Why? Because line 125 (or any other line after) does not match our search-pattern.
We can now close any parenthesis (left-open in our write down), resulting in below:

f(n) = O( n( n ) )

Try to further shorten "n( n )" part, like:

  • n( n ) = n * n
  • = n²
  • Finally, just wrap it with Big Oh notation, like O(n²), or O(n^2) without formatting.
铃予 2024-10-22 00:26:19

Break down the algorithm into pieces you know the big O notation for, and combine through big O operators. That's the only way I know of.

For more information, check the Wikipedia page on the subject.
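
For instance (my own illustration of those operators): an O(n) pass followed by an O(n log n) sort combines by the sum rule to O(n + n log n) = O(n log n), while a loop of n iterations doing O(log n) work per iteration combines by the product rule to O(n log n).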

葵雨 2024-10-22 00:26:19

As to "how do you calculate" Big O, this is part of Computational complexity theory. For some (many) special cases you may be able to come with some simple heuristics (like multiplying loop counts for nested loops), esp. when all you want is any upper bound estimation, and you do not mind if it is too pessimistic - which I guess is probably what your question is about.

If you really want to answer your question for any algorithm, the best you can do is to apply the theory. Besides the simplistic "worst case" analysis, I have found amortized analysis very useful in practice.
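
A standard illustration of amortized analysis (my own addition, not from this answer): appending n items to a dynamic array that doubles its capacity costs at most n + (n/2 + n/4 + ... + 1) < 2n element copies in total, so each append is O(1) amortized even though a single append can cost O(n).

#include <stdlib.h>

/* Minimal dynamic array: capacity doubles when full, so n pushes perform
   fewer than 2n element copies in total -> O(1) amortized per push.
   Start from a zero-initialized Vec, e.g. Vec v = {0}; NULL checks omitted. */
typedef struct { int *data; int len, cap; } Vec;

void vec_push(Vec *v, int x) {
    if (v->len == v->cap) {
        v->cap = v->cap ? 2 * v->cap : 1;                  /* double capacity */
        v->data = realloc(v->data, v->cap * sizeof *v->data);
    }
    v->data[v->len++] = x;
}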

蓝眼睛不忧郁 2024-10-22 00:26:19

For the 1st case, the inner loop is executed n-i times, so the total number of executions is the sum of n-i for i going from 0 to n-1 (because the loop condition is lower-than, not lower-than-or-equal). You finally get n*(n + 1) / 2, so O(n²/2) = O(n²).

For the 2nd loop, i is between 0 and n inclusive for the outer loop; then the inner loop is executed when j is strictly greater than n, which is impossible.

难如初 2024-10-22 00:26:19

In addition to using the master method (or one of its specializations), I test my algorithms experimentally. This can't prove that any particular complexity class is achieved, but it can provide reassurance that the mathematical analysis is appropriate. To help with this reassurance, I use code coverage tools in conjunction with my experiments, to ensure that I'm exercising all the cases.

As a very simple example say you wanted to do a sanity check on the speed of the .NET framework's list sort. You could write something like the following, then analyze the results in Excel to make sure they did not exceed an n*log(n) curve.

In this example I measure the number of comparisons, but it's also prudent to examine the actual time required for each sample size. However then you must be even more careful that you are just measuring the algorithm and not including artifacts from your test infrastructure.

int nCmp = 0;
System.Random rnd = new System.Random();

// measure the number of comparisons required to sort a list of n integers
void DoTest(int n)
{
   List<int> lst = new List<int>(n);
   for( int i=0; i<n; i++ )
      lst.Add( rnd.Next(0,1000) );   // populate with random values

   // as we sort, keep track of the number of comparisons performed!
   nCmp = 0;
   lst.Sort( delegate( int a, int b ) { nCmp++; return (a<b) ? -1 : ((a>b) ? 1 : 0); } );

   System.Console.WriteLine( "{0},{1}", n, nCmp );
}


// Perform measurement for a variety of sample sizes.
// It would be prudent to check multiple random samples of each size, but this is OK for a quick sanity check
for( int n = 0; n<1000; n++ )
   DoTest(n);