“Big O”的简单英语解释是什么? 符号?
我更喜欢尽可能少的正式定义和简单的数学。
I'd prefer as little formal definition as possible and simple mathematics.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
我更喜欢尽可能少的正式定义和简单的数学。
I'd prefer as little formal definition as possible and simple mathematics.
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
接受
或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
发布评论
评论(30)
如果您的头脑中有合适的无穷概念,那么这里有一个非常简短的描述:
而且此外
如果您升级到可以以两倍速度运行算法的计算机,大 O 符号将不会注意到这一点。 常数因子的改进太小,甚至在大 O 表示法所使用的规模中都无法被注意到。 请注意,这是大 O 表示法设计的有意部分。
然而,尽管任何比常数因子“更大”的东西都可以被检测到。
当有兴趣进行大小“大”到足以被视为近似无穷大的计算时,那么大 O 表示法大约是解决问题的成本。
如果以上内容没有意义,那么您的头脑中就没有兼容的直观的无穷概念,您可能应该忽略以上所有内容; 我知道,要使这些想法变得严谨,或者在直观上没有用的情况下解释它们,唯一的方法就是首先教你大 O 表示法或类似的东西。 (不过,一旦你将来很好地理解了大O表示法,重新审视这些想法可能是值得的)
If you have a suitable notion of infinity in your head, then there is a very brief description:
And furthermore
If you upgrade to a computer that can run your algorithm twice as fast, big O notation won't notice that. Constant factor improvements are too small to even be noticed in the scale that big O notation works with. Note that this is an intentional part of the design of big O notation.
Although anything "larger" than a constant factor can be detected, however.
When interested in doing computations whose size is "large" enough to be considered as approximately infinity, then big O notation is approximately the cost of solving your problem.
If the above doesn't make sense, then you don't have a compatible intuitive notion of infinity in your head, and you should probably disregard all of the above; the only way I know to make these ideas rigorous, or to explain them if they aren't already intuitively useful, is to first teach you big O notation or something similar. (although, once you well understand big O notation in the future, it may be worthwhile to revisit these ideas)
假设您从亚马逊订购了《哈利·波特:完整 8 部电影集》[蓝光],并同时在线下载了相同的电影集。 您想测试哪种方法更快。 交付大约需要一天时间才能到达,并且下载提前大约 30 分钟完成。 伟大的! 所以这是一场激烈的比赛。
如果我订购《指环王》、《暮光之城》、《黑暗骑士三部曲》等多部蓝光电影并同时在线下载所有电影会怎么样? 这次,交付仍然需要一天才能完成,但在线下载需要三天才能完成。
对于网上购物,购买商品(输入)的数量不会影响交货时间。 输出是恒定的。 我们称之为O(1)。
对于在线下载,下载时间与电影文件大小(输入)成正比。 我们称之为O(n)。
从实验中,我们知道在线购物的规模比在线下载更好。 理解大 O 表示法非常重要,因为它可以帮助您分析算法的可扩展性和效率。
注意:大O表示法代表算法的最坏情况。 我们假设 O(1) 和 O(n) 是上例中最坏的情况。
参考:http://carlcheo.com/compsci
Say you order Harry Potter: Complete 8-Film Collection [Blu-ray] from Amazon and download the same film collection online at the same time. You want to test which method is faster. The delivery takes almost a day to arrive and the download completed about 30 minutes earlier. Great! So it’s a tight race.
What if I order several Blu-ray movies like The Lord of the Rings, Twilight, The Dark Knight Trilogy, etc. and download all the movies online at the same time? This time, the delivery still take a day to complete, but the online download takes 3 days to finish.
For online shopping, the number of purchased item (input) doesn’t affect the delivery time. The output is constant. We call this O(1).
For online downloading, the download time is directly proportional to the movie file sizes (input). We call this O(n).
From the experiments, we know that online shopping scales better than online downloading. It is very important to understand big O notation because it helps you to analyze the scalability and efficiency of algorithms.
Note: Big O notation represents the worst-case scenario of an algorithm. Let’s assume that O(1) and O(n) are the worst-case scenarios of the example above.
Reference : http://carlcheo.com/compsci
定义:- Big O 表示法是一种表示法,表示如果数据输入增加,算法性能将如何表现。
当我们谈论算法时,有 3 个重要支柱:输入、输出和算法处理。 Big O 是一种符号表示法,表示如果数据输入以何种速率增加,算法处理的性能会发生变化。
我鼓励您观看此 YouTube 视频,其中通过代码示例深入解释了大 O 表示法 。
例如,假设算法需要 5 条记录,处理这些记录所需的时间为 27 秒。 现在,如果我们将记录增加到 10 条,算法需要 105 秒。
简而言之,所花费的时间是记录数量的平方。 我们可以用O(n ^ 2)来表示。 这种符号表示法称为大 O 表示法。
现在请注意,输入中的单位可以是任何单位,可以是字节、位数、记录数,性能可以用任何单位进行测量,如秒、分钟、天等。 所以它不是确切的单位,而是关系。
例如,看看下面的函数“Function1”,它接受一个集合并对第一条记录进行处理。 现在,对于此函数,无论您放置 1000 条、10000 条还是 100000 条记录,性能都将相同。 所以我们可以用O(1)来表示它。
现在看下面的函数“Function2()”。 在这种情况下,处理时间将随着记录数量的增加而增加。 我们可以使用O(n)来表示该算法的性能。
当我们看到任何算法的 Big O 表示法时,我们可以将它们分为三类性能:-
因此,通过查看 Big O 表示法,我们可以对算法的好区和坏区进行分类。
我建议您观看这个 10 分钟的视频,其中通过示例代码讨论了 Big O
https://www.youtube.com/watch?v=k6kxtzICG_g
Definition :- Big O notation is a notation which says how a algorithm performance will perform if the data input increases.
When we talk about algorithms there are 3 important pillars Input , Output and Processing of algorithm. Big O is symbolic notation which says if the data input is increased in what rate will the performance vary of the algorithm processing.
I would encourage you to see this youtube video which explains Big O Notation in depth with code examples.
So for example assume that a algorithm takes 5 records and the time required for processing the same is 27 seconds. Now if we increase the records to 10 the algorithm takes 105 seconds.
In simple words the time taken is square of the number of records. We can denote this by O(n ^ 2). This symbolic representation is termed as Big O notation.
Now please note the units can be anything in inputs it can be bytes , bits number of records , the performance can be measured in any unit like second , minutes , days and so on. So its not the exact unit but rather the relationship.
For example look at the below function "Function1" which takes a collection and does processing on the first record. Now for this function the performance will be same irrespective you put 1000 , 10000 or 100000 records. So we can denote it by O(1).
Now see the below function "Function2()". In this case the processing time will increase with number of records. We can denote this algorithm performance using O(n).
When we see a Big O notation for any algorithm we can classify them in to three categories of performance :-
So by looking at Big O notation we categorize good and bad zones for algorithms.
I would recommend you to watch this 10 minutes video which discusses Big O with sample code
https://www.youtube.com/watch?v=k6kxtzICG_g
最简单的看待它的方法(用简单的英语)
我们正在尝试了解输入参数的数量如何影响算法的运行时间。 如果应用程序的运行时间与输入参数的数量成正比,则称为 Big O of n。
上面的说法是一个好的开始,但并不完全正确。
更准确的解释(数学)
假设
n=输入参数的数量
T(n)= 将算法运行时间表示为 n 的函数的实际函数
c= 常数
f(n) = 一个近似函数,将算法的运行时间表示为 n 的函数
那么就 Big O 而言,只要满足以下条件,近似值 f(n) 就被认为足够好。
等式读作
当 n 接近无穷大时,n 的 T 小于或等于 n 的 c 乘以 f。
在大 O 表示法中,这写为
This is read as T of n is in big O of n。
回到英语
根据上面的数学定义,如果你说你的算法是 n 的 Big O,则意味着它是 n(输入参数数量)的函数或更快。 如果你的算法是 Big O of n,那么它也自动是 Big O of n 平方。
Big O of n 意味着我的算法运行速度至少与此一样快。 你不能看着你的算法的 Big O 表示法并说它很慢。 只能说速度很快。
查看此,获取有关来自加州大学伯克利分校的大O。 这实际上是一个简单的概念。 如果你听到Shewchuck教授(又名神级老师)解释,你会说“哦,就是这样!”。
Simplest way to look at it (in plain English)
We are trying to see how the number of input parameters, affects the running time of an algorithm. If the running time of your application is proportional to the number of input parameters, then it is said to be in Big O of n.
The above statement is a good start but not completely true.
A more accurate explanation (mathematical)
Suppose
n=number of input parameters
T(n)= The actual function that expresses the running time of the algorithm as a function of n
c= a constant
f(n)= An approximate function that expresses the running time of the algorithm as a function of n
Then as far as Big O is concerned, the approximation f(n) is considered good enough as long as the below condition is true.
The equation is read as
As n approaches infinity, T of n, is less than or equal to c times f of n.
In big O notation this is written as
This is read as T of n is in big O of n.
Back to English
Based on the mathematical definition above, if you say your algorithm is a Big O of n, it means it is a function of n (number of input parameters) or faster. If your algorithm is Big O of n, then it is also automatically the Big O of n square.
Big O of n means my algorithm runs at least as fast as this. You cannot look at Big O notation of your algorithm and say its slow. You can only say its fast.
Check this out for a video tutorial on Big O from UC Berkley. It is actually a simple concept. If you hear professor Shewchuck (aka God level teacher) explaining it, you will say "Oh that's all it is!".
前言
算法:解决问题的过程/公式
如何分析算法以及我们如何相互比较算法?
示例:你和一个朋友被要求创建一个函数来对从 0 到 N 的数字求和。您得出 f(x),您的朋友得出 g(x)。 两个函数具有相同的结果,但算法不同。 为了客观地比较算法的效率,我们使用Big-O 表示法。
Big-O 表示法: 描述了当输入变得任意大时,运行时相对于输入的增长速度。
3 个关键要点:
空间复杂度: 除了时间复杂度之外,我们还关心空间复杂度(算法使用多少内存/空间)。 我们不检查操作时间,而是检查内存分配的大小。
Preface
algorithm: procedure/formula for solving a problem
How do analyze algorithms and how can we compare algorithms against each other?
example: you and a friend are asked to create a function to sum the numbers from 0 to N. You come up with f(x) and your friend comes up with g(x). Both functions have the same result, but a different algorithm. In order to objectively compare the efficiency of the algorithms we use Big-O notation.
Big-O notation: describes how quickly runtime will grow relative to the input as the input get arbitrarily large.
3 key takeaways:
Space complexity: aside from time complexity, we also care about space complexity (how much memory/space an algorithm uses). Instead of checking the time of operations, we check the size of the allocation of memory.
我发现了一个关于大 O 表示法的非常好的解释,特别是对于一个不太喜欢数学的人。
https://rob-bell.net /2009/06/a-beginners-guide-to-big-o-notation/
I found a really great explanation about big O notation especially for a someone who's not much into mathematics.
https://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/
快速说明,我的回答几乎肯定会混淆 Big Oh 表示法(这是一个上限)与 Big Theta 表示法“θ”(两侧边界)。 但根据我的经验,这实际上是非学术环境中讨论的典型情况。 对于造成的任何混乱,我们深表歉意。
BigOh 复杂性可以通过此图可视化:
我能为 Big Oh 表示法给出的最简单的定义是:
Big Oh 表示法是算法复杂性的相对表示。
该句子中有一些重要且有意选择的单词:
读完其余部分后,请回来重读上面的内容。
我能想到的 BigOh 最好的例子就是做算术。 取两个数字(123456 和 789012)。 我们在学校学到的基本算术运算是:
其中每一个都是一个操作或一个问题。 解决这些问题的方法称为算法。
添加是最简单的。 您将数字排列起来(向右),并将数字添加到一列中,并在结果中写入该加法的最后一个数字。 该数字的“十”部分将转到下一列。
我们假设这些数字的加法是该算法中最昂贵的操作。 按理说,要将这两个数字加在一起,我们必须将 6 位数字加在一起(并且可能带有第 7 位数字)。 如果我们将两个 100 位数字相加,我们必须进行 100 次加法。 如果我们将两个 10,000 位数字相加,我们就必须进行 10,000 次加法。
看到图案了吗? 复杂性(即运算次数)与较大数字中的位数n成正比。 我们称之为O(n)或线性复杂度。
减法类似(除了您可能需要借用而不是进位)。
乘法则不同。 将数字排列起来,取出底部数字中的第一位数字,然后将其依次与顶部数字中的每个数字相乘,依此类推。 因此,要将两个 6 位数字相乘,我们必须进行 36 次乘法。 我们可能还需要进行多达 10 或 11 列的添加才能获得最终结果。
如果我们有两个 100 位数字,我们需要进行 10,000 次乘法和 200 次加法。 对于两个一百万位数的数字,我们需要进行一万亿 (1012) 乘法和两百万次加法。
由于算法以 n 平方缩放,因此为 O(n2) 或二次复杂度。 现在是介绍另一个重要概念的好时机:
我们只关心复杂性中最重要的部分。
精明的人可能已经意识到我们可以将操作数表示为:n2< /sup> + 2n。 但正如您从我们的示例中看到的那样,每个数字都有两个百万位数字,第二项 (2n) 变得微不足道(占该阶段总操作的 0.0002%)。
人们可以注意到,我们在这里假设了最坏的情况。 在进行 6 位数字相乘时,如果其中一个是 4 位,另一个是 6 位,那么我们只有 24 次乘法。 尽管如此,我们还是计算了“n”的最坏情况,即两者都是 6 位数字时。 因此,Big Oh 表示法是关于算法的最坏情况。
电话簿
我能想到的下一个最好的例子是电话簿,通常称为“白页”或类似的名称,但不同国家的情况有所不同。 但我说的是按姓氏列出人员,然后列出名字首字母或名字,可能还列出地址,然后列出电话号码。
现在,如果您指示计算机在包含 1,000,000 个姓名的电话簿中查找“John Smith”的电话号码,您会做什么? 忽略您可以猜测 S 开始的距离这一事实(假设您不能),您会怎么做?
典型的实现可能是向中间开放,取第 500,000th 并将其与“Smith”进行比较。 如果碰巧是“史密斯,约翰”,我们真的很幸运。 更有可能的是“John Smith”出现在该名字之前或之后。 如果是在之后,我们将电话簿的后半部分分成两半并重复。 如果是在之前,那么我们将电话簿的前半部分分成两半并重复。 等等。
这称为二分搜索,无论您是否意识到,它每天都会在编程中使用。
因此,如果您想在包含 100 万个名字的电话簿中查找某个名字,实际上最多执行 20 次就可以找到任何名字。 在比较搜索算法时,我们决定这种比较就是我们的“n”。
这非常好,不是吗?
用 BigOh 术语来说,这是O(log n) 或对数复杂度。 现在所讨论的对数可以是 ln (以 e 为底)、log10、log2 或其他一些底数。 没关系,它仍然是 O(log n),就像 O(2n2) 和 O(100n2) 仍然都是 O(n2< /sup>)。
此时值得解释的是,BigOh 可用于通过算法确定三种情况:
通常我们不关心最好的情况。 我们对预期的和最坏的情况感兴趣。 有时其中一个或另一个会更重要。
回到电话簿。
如果您有电话号码并想查找姓名怎么办? 警方有反向电话簿,但公众无法进行此类查询。 或者他们是吗? 从技术上讲,您可以在普通电话簿中反向查找号码。 如何?
您从名字开始并比较数字。 如果匹配,那就太好了,如果不匹配,你就进入下一个。 您必须这样做,因为电话簿是无序的(无论如何都是按电话号码排序的)。
因此,要根据电话号码查找姓名(反向查找):
旅行商
这是计算机科学中一个相当著名的问题,值得一提。 在这个问题中,你有 N 个城镇。 这些城镇中的每一个都通过一定距离的道路与 1 个或多个其他城镇相连。 旅行商问题是找到访问每个城镇的最短旅行。
听起来很简单? 再想一想。
如果您有 3 个城镇 A、B 和 C,所有对之间都有道路,那么您可以走:
嗯,实际上比这要少,因为其中一些是等价的(A → B → C 和例如,C → B → A 是等价的,因为它们使用相同的道路,只是方向相反)。
事实上,有3种可能性。
这是称为阶乘的数学运算的函数。 基本上:
所以旅行商问题的 BigOh 是 O(n!) 或 阶乘或组合复杂性。
当你到达 200 个城镇时,宇宙中已经没有足够的时间来用传统计算机解决问题了。
需要考虑的事情。
多项式时间
我想快速提及的另一点是,任何复杂度为 O(na) 的算法都被认为具有多项式复杂度< /strong> 或可在多项式时间内求解。
O(n)、O(n2) 等都是多项式时间。 有些问题无法在多项式时间内解决。 因此,某些东西在世界上被使用。 公钥加密就是一个很好的例子。 在计算上很难找到一个非常大的数的两个素因数。 如果不是,我们就无法使用我们使用的公钥系统。
无论如何,这就是我对 BigOh(修订版)的解释(希望是简单的英语)。
Quick note, my answer is almost certainly confusing Big Oh notation (which is an upper bound) with Big Theta notation "Θ" (which is a two-side bound). But in my experience, this is actually typical of discussions in non-academic settings. Apologies for any confusion caused.
BigOh complexity can be visualized with this graph:
The simplest definition I can give for Big Oh notation is this:
Big Oh notation is a relative representation of the complexity of an algorithm.
There are some important and deliberately chosen words in that sentence:
Come back and reread the above when you've read the rest.
The best example of BigOh I can think of is doing arithmetic. Take two numbers (123456 and 789012). The basic arithmetic operations we learned in school were:
Each of these is an operation or a problem. A method of solving these is called an algorithm.
The addition is the simplest. You line the numbers up (to the right) and add the digits in a column writing the last number of that addition in the result. The 'tens' part of that number is carried over to the next column.
Let's assume that the addition of these numbers is the most expensive operation in this algorithm. It stands to reason that to add these two numbers together we have to add together 6 digits (and possibly carry a 7th). If we add two 100 digit numbers together we have to do 100 additions. If we add two 10,000 digit numbers we have to do 10,000 additions.
See the pattern? The complexity (being the number of operations) is directly proportional to the number of digits n in the larger number. We call this O(n) or linear complexity.
Subtraction is similar (except you may need to borrow instead of carry).
Multiplication is different. You line the numbers up, take the first digit in the bottom number and multiply it in turn against each digit in the top number and so on through each digit. So to multiply our two 6 digit numbers we must do 36 multiplications. We may need to do as many as 10 or 11 column adds to get the end result too.
If we have two 100-digit numbers we need to do 10,000 multiplications and 200 adds. For two one million digit numbers we need to do one trillion (1012) multiplications and two million adds.
As the algorithm scales with n-squared, this is O(n2) or quadratic complexity. This is a good time to introduce another important concept:
We only care about the most significant portion of complexity.
The astute may have realized that we could express the number of operations as: n2 + 2n. But as you saw from our example with two numbers of a million digits apiece, the second term (2n) becomes insignificant (accounting for 0.0002% of the total operations by that stage).
One can notice that we've assumed the worst case scenario here. While multiplying 6 digit numbers, if one of them has 4 digits and the other one has 6 digits, then we only have 24 multiplications. Still, we calculate the worst case scenario for that 'n', i.e when both are 6 digit numbers. Hence Big Oh notation is about the Worst-case scenario of an algorithm.
The Telephone Book
The next best example I can think of is the telephone book, normally called the White Pages or similar but it varies from country to country. But I'm talking about the one that lists people by surname and then initials or first name, possibly address and then telephone numbers.
Now if you were instructing a computer to look up the phone number for "John Smith" in a telephone book that contains 1,000,000 names, what would you do? Ignoring the fact that you could guess how far in the S's started (let's assume you can't), what would you do?
A typical implementation might be to open up to the middle, take the 500,000th and compare it to "Smith". If it happens to be "Smith, John", we just got really lucky. Far more likely is that "John Smith" will be before or after that name. If it's after we then divide the last half of the phone book in half and repeat. If it's before then we divide the first half of the phone book in half and repeat. And so on.
This is called a binary search and is used every day in programming whether you realize it or not.
So if you want to find a name in a phone book of a million names you can actually find any name by doing this at most 20 times. In comparing search algorithms we decide that this comparison is our 'n'.
That is staggeringly good, isn't it?
In BigOh terms this is O(log n) or logarithmic complexity. Now the logarithm in question could be ln (base e), log10, log2 or some other base. It doesn't matter it's still O(log n) just like O(2n2) and O(100n2) are still both O(n2).
It's worthwhile at this point to explain that BigOh can be used to determine three cases with an algorithm:
Normally we don't care about the best case. We're interested in the expected and worst case. Sometimes one or the other of these will be more important.
Back to the telephone book.
What if you have a phone number and want to find a name? The police have a reverse phone book but such look-ups are denied to the general public. Or are they? Technically you can reverse look-up a number in an ordinary phone book. How?
You start at the first name and compare the number. If it's a match, great, if not, you move on to the next. You have to do it this way because the phone book is unordered (by phone number anyway).
So to find a name given the phone number (reverse lookup):
The Traveling Salesman
This is quite a famous problem in computer science and deserves a mention. In this problem, you have N towns. Each of those towns is linked to 1 or more other towns by a road of a certain distance. The Traveling Salesman problem is to find the shortest tour that visits every town.
Sounds simple? Think again.
If you have 3 towns A, B, and C with roads between all pairs then you could go:
Well, actually there's less than that because some of these are equivalent (A → B → C and C → B → A are equivalent, for example, because they use the same roads, just in reverse).
In actuality, there are 3 possibilities.
This is a function of a mathematical operation called a factorial. Basically:
So the BigOh of the Traveling Salesman problem is O(n!) or factorial or combinatorial complexity.
By the time you get to 200 towns there isn't enough time left in the universe to solve the problem with traditional computers.
Something to think about.
Polynomial Time
Another point I wanted to make a quick mention of is that any algorithm that has a complexity of O(na) is said to have polynomial complexity or is solvable in polynomial time.
O(n), O(n2) etc. are all polynomial time. Some problems cannot be solved in polynomial time. Certain things are used in the world because of this. Public Key Cryptography is a prime example. It is computationally hard to find two prime factors of a very large number. If it wasn't, we couldn't use the public key systems we use.
Anyway, that's it for my (hopefully plain English) explanation of BigOh (revised).
它显示了算法如何根据输入大小进行缩放。
O(n2):称为二次复杂度
请注意,数字项目数量增加 10 倍,但时间增加 102。 基本上,n=10,因此 O(n2) 为我们提供了缩放因子 n2,即 102。
O(n):称为线性复杂度
这次项数增加了 10 倍,时间也是如此。 n=10,因此 O(n) 的缩放因子为 10。
O(1):称为恒定复杂度
item 的数量仍然以 10 倍的速度增加,但 O(1) 的缩放因子始终为 1。
O(log n):称为对数复杂度
计算次数仅增加输入值的对数。 因此,在这种情况下,假设每次计算需要 1 秒,输入
n
的对数就是所需的时间,因此为log n
。这就是它的要点。 他们减少了数学计算,因此它可能不完全是 n2 或他们所说的任何东西,但这将是缩放中的主导因素。
It shows how an algorithm scales based on input size.
O(n2): known as Quadratic complexity
Notice that the number of items increases by a factor of 10, but the time increases by a factor of 102. Basically, n=10 and so O(n2) gives us the scaling factor n2 which is 102.
O(n): known as Linear complexity
This time the number of items increases by a factor of 10, and so does the time. n=10 and so O(n)'s scaling factor is 10.
O(1): known as Constant complexity
The number of items is still increasing by a factor of 10, but the scaling factor of O(1) is always 1.
O(log n): known as Logarithmic complexity
The number of computations is only increased by a log of the input value. So in this case, assuming each computation takes 1 second, the log of the input
n
is the time required, hencelog n
.That's the gist of it. They reduce the maths down so it might not be exactly n2 or whatever they say it is, but that'll be the dominating factor in the scaling.
Big-O 表示法(也称为“渐近增长”表示法)是当您忽略常量因子和原点附近的东西时函数“看起来像”的。 我们用它来讨论事物如何扩展。
基础知识
对于“足够”大的输入...
f(x) ∈ O(upperbound)
意味着f
"增长速度不快于“upperbound
f(x) ∈ Ɵ(justlikethis)
”意味着f
“增长得与”justlikethis
f(x) ∈ Ω(lowerbound)
表示f
“增长不慢于”lowerbound
大 O 表示法不关心常数因素:据说函数
9x²
的“增长方式与”10x²
完全相同。 big-O 渐近表示法也不关心非渐近的东西(“接近原点的东西”或“当问题规模很小时会发生什么”):函数 < code>10x² 据说“完全像”10x² - x + 2
一样生长。为什么你要忽略方程的较小部分? 因为当你考虑越来越大的尺度时,它们与方程的大部分相比就显得相形见绌了; 他们的贡献变得微不足道且无关紧要。 (请参阅示例部分。)
换句话说,当您走向无穷大时,一切都与比率有关。 如果将实际花费的时间除以
O(...)
,您将得到大输入限制中的一个常数因子。直观上这是有道理的:函数如果您可以将一个相乘得到另一个,则彼此“缩放”。 这就是当我们说......这意味着对于“足够大”的问题大小N(如果我们忽略原点附近的东西),存在一些常数(如2.5,完全编造)使得:
常量有多种选择; 通常,“最佳”选择被称为算法的“常数因子”......但我们经常忽略它,就像我们忽略非最大项一样(请参阅常数因子部分了解为什么它们通常不重要)。 您还可以将上面的等式视为一个界限,即“在最坏的情况下,所花费的时间永远不会比大约
N*log(N)
差,在一个因子 2.5(我们不太关心的常数因子)”。一般来说,
O(...)
是最有用的,因为我们经常关心最坏情况的行为。 如果f(x)
表示“坏”的东西,例如处理器或内存使用情况,则“f(x) ∈ O(upperbound)
”表示“upperbound< /code> 是处理器/内存使用的最坏情况”。
应用
作为一种纯粹的数学构造,大 O 表示法不仅限于谈论处理时间和内存。 您可以使用它来讨论缩放有意义的任何事物的渐近性,例如:
N
个人之间可能握手的次数(Ɵ(N²)
,特别是N(N-1)/2
,但重要的是它的“规模类似于”N²
)示例
对于上面的握手示例,房间中的每个人都与其他人握手。 在该示例中,
#handshakes ∈ Ɵ(N²)
。 为什么?退一步说:握手的次数正好是 n-choose-2 或 N*(N-1)/2(N 个人中的每一个与其他 N-1 个人握手,但是这个双次握手,因此除以 2):
然而,对于大量的人来说,线性项
N
显得相形见绌,实际上对比率的贡献为 0(在图表中:随着参与者数量的增加,对角线上的空框占总框的比例会变小)。 因此,缩放行为是阶N²
,或者握手次数“像N²一样增长”。就好像图表对角线上的空框(N*(N-1)/2 个复选标记)甚至不存在(N2 个复选标记渐近)。
(暂时离题“简单英语”:)如果您想向自己证明这一点,您可以对比率进行一些简单的代数运算,将其分解为多个项(
lim
意思是“在极限中考虑” of”,如果您没有看到过,请忽略它,它只是“并且 N 真的很大”的表示法):tl;dr:握手次数“看起来像”x² 对于大值来说非常多,如果我们写下比率#handshakes/x²,那么我们完全不需要x²握手的事实甚至不会在任意长的时间内以小数形式显示。
建立直觉
这让我们可以做出如下陈述:
[对于数学爱好者,您可以将鼠标悬停在剧透上查看小旁注]
(归功于 https://stackoverflow.com/a/487292/711085)
(从技术上讲,常数因子在一些更深奥的示例中可能很重要,但我已经 这些是程序员和应用计算机科学家用作参考点的
基本增长顺序。 他们一直看到这些。 (因此,虽然从技术上讲,您可能会认为“输入加倍会使 O(√N) 算法慢 1.414 倍”,但最好将其视为“这比对数更糟糕,但比线性更好”。)
常数因子< /strong>
通常,我们不关心具体的常数因子是什么,因为它们不会影响函数增长的方式。 例如,两种算法可能都需要 O(N) 时间才能完成,但其中一种算法的速度可能是另一种算法的两倍。 除非因子非常大,否则我们通常不会太在意,因为优化是一件棘手的事情(优化何时过早?); 此外,仅仅选择具有更好大 O 的算法通常就能将性能提高几个数量级。
一些渐近优越的算法(例如非比较
O(N log(log(N)))
排序)可能具有如此大的常数因子(例如100000*N log(log(N ))
),或者相对较大的开销,例如带有隐藏+ 100*N
的O(N log(log(N)))
,它们即使在“大数据”上也很少值得使用。为什么 O(N) 有时是你能做到的最好的,即为什么我们需要数据结构
O(N)
算法在某种意义上是“最好的”算法,如果你需要读取您的所有数据。 读取一堆数据的行为是一个O(N)
操作。 将其加载到内存中通常是O(N)
(如果有硬件支持,则速度更快,或者如果您已经读取了数据,则根本不需要时间)。 但是,如果您接触甚至查看每条数据(甚至其他所有数据),您的算法将花费O(N)
时间来执行此查找。 无论您的实际算法需要多长时间,它都至少是O(N)
,因为它花费了这些时间来查看所有数据。对于写作行为来说也是如此。 所有打印出 N 个东西的算法都将花费 N 时间,因为输出至少那么长(例如,打印出一组 N 个扑克牌的所有排列(重新排列的方式)是阶乘的:
O(N!)
(这就是为什么在这些情况下,好的程序将确保迭代使用 O(1) 内存并且不会打印或存储每个中间步骤))。这激发了数据结构的使用:数据结构只需要读取数据一次(通常
O(N)
时间),加上一些任意数量的预处理(例如>O(N)
或O(N log(N))
或O(N²)
),我们尽量保持较小的值。 此后,修改数据结构(插入/删除等)和对数据进行查询只需很少的时间,例如O(1)
或O(log(N))< /代码>。 然后您继续进行大量查询! 一般来说,您愿意提前做的工作越多,以后要做的工作就越少。
例如,假设您有数百万个路段的纬度和经度坐标,并且想要找到所有街道交叉口。
O(N)
的朴素方法就不会出现问题,但如果您想执行多次(在本例中)例如,N
次,每个段一次),我们必须执行O(N²)
工作,即 1000000²=1000000000000 次操作。 不好(现代计算机每秒可以执行大约十亿次操作)。O(N²)
变成了可管理的O(N)
,我们所要做的就是支付少量的成本来制作哈希表。这个故事的寓意是:数据结构可以让我们加快操作速度。 更重要的是,高级数据结构可以让您以极其聪明的方式组合、延迟甚至忽略操作。 不同的问题会有不同的类比,但它们都涉及以利用我们关心的某些结构的方式组织数据,或者我们人为地强加给它进行簿记。 我们提前做工作(基本上是计划和组织),现在重复的任务容易多了!
实际示例:编码时可视化增长顺序
渐近表示法的核心是与编程完全分开的。 渐近符号是一种数学框架,用于思考事物如何扩展以及如何在许多不同领域中使用。 这就是说......这就是您应用渐近符号进行编码的方式。
基础知识:每当我们与大小为 A 的集合(例如数组、集合、映射的所有键等)中的每个元素交互,或执行循环的 A 次迭代时,这就是大小为 A 的乘法因子为什么我说“乘法因子”? - 因为循环和函数(几乎根据定义)具有乘法运行时间:迭代次数、循环中完成的工作时间(或对于函数:调用的次数)函数,函数中完成的工作的时间)。 (如果我们不做任何花哨的事情,比如跳过循环或提前退出循环,或者根据参数更改函数中的控制流,这也是很常见的。)以下是可视化技术的一些示例,以及随附的伪代码。
(这里,
x
代表恒定时间的工作单位、处理器指令、解释器操作码等)示例 2:
示例 3:
如果我们做一些稍微复杂的事情,您可能仍然可以直观地想象发生了什么:
在这里,你能画出的最小的可识别轮廓才是最重要的; 三角形是二维形状(0.5 A^2),就像正方形是二维形状(A^2)一样; 这里的常数因子 2 仍然是两者之间的渐近比,但是,我们像所有因子一样忽略它......(这种技术有一些不幸的细微差别,我不会在这里讨论;它可能会误导您。
)当然这并不意味着循环和函数不好; 相反,它们是现代编程语言的构建块,我们喜欢它们。 然而,我们可以看到,我们将循环、函数和条件与数据(控制流等)编织在一起的方式模仿了程序的时间和空间使用! 如果时间和空间的使用成为一个问题,那就是当我们诉诸聪明并找到一个我们没有考虑过的简单算法或数据结构时,以某种方式减少增长的顺序。 然而,这些可视化技术(尽管它们并不总是有效)可以让您对最坏情况的运行时间做出天真的猜测。
这是我们可以直观地识别的另一件事:
我们可以重新排列它并看到它的 O(N):
或者也许您对数据进行 log(N) 次传递,总时间为 O(N*log(N)):
不相关,但是值得再次提及的是:如果我们执行哈希(例如字典/哈希表查找),那就是 O(1) 的因子。 那是相当快的。
如果我们做一些非常复杂的事情,例如使用递归函数或分而治之算法,
您可以使用主定理(通常有效),或者在荒谬的情况下使用 Akra-Bazzi 定理(几乎总是有效),您可以在维基百科上查找算法的运行时间。但是,程序员不会这样想,因为最终,算法直觉会成为第二天性。 您将开始编写一些低效的代码,并立即想到“我正在做一些效率极低的事情吗?”。 如果答案是“是”并且您预见到它实际上很重要,那么您可以退后一步,想出各种技巧来使事情运行得更快(答案几乎总是“使用哈希表”,很少“使用树”,并且很少有更复杂的事情)。
摊销和平均情况复杂性
还有“摊销”和/或“平均情况”的概念(请注意,它们是不同的)。
平均情况:这只不过是使用大 O 表示法来表示函数的期望值,而不是函数本身。 在通常情况下,您认为所有输入的可能性相同,平均情况只是运行时间的平均值。 例如,对于快速排序,即使对于一些非常糟糕的输入,最坏的情况是
O(N^2)
,但平均情况是通常的O(N log(N))
code> (真正糟糕的输入数量非常少,以至于我们在一般情况下不会注意到它们)。摊销最坏情况:某些数据结构可能具有很大的最坏情况复杂性,但保证如果您执行许多此类操作,您所做的平均工作量将比最坏情况更好 -案件。 例如,您可能有一个通常需要恒定
O(1)
时间的数据结构。 然而,偶尔它会“打嗝”,并需要O(N)
时间进行一个随机操作,因为它可能需要做一些簿记或垃圾收集或其他事情......但它向你保证,如果它打嗝了,再进行N次操作就不会再打嗝了。 最坏情况下的每次操作成本仍然为O(N)
,但多次运行的摊销成本为O(N)/N
=每次操作O(1)
。 由于大型操作非常罕见,因此大量的临时工作可以被视为与其余工作混合在一起作为一个恒定因素。 我们说这项工作在足够多的调用中“摊销”,以至于它渐近消失。平均情况和摊销最坏情况之间的比较:
不过,如果您< a href="https://www.usenix.org/conference/12th-usenix-security-symposium/denial-service-algorithmic-complexity-attacks" rel="noreferrer">合理担心攻击者,除了摊销和平均情况之外,还有许多其他算法攻击向量需要担心。)
平均情况和摊销都是在考虑扩展的情况下进行思考和设计时非常有用的工具。
(如果对此子主题感兴趣,请参阅平均情况与摊销分析之间的差异。)
多维大O
大多数时候,人们并没有意识到有不止一个变量在起作用。 例如,在字符串搜索算法中,您的算法可能需要时间
O([文本长度] + [查询长度])
,即它在两个变量中是线性的,例如O( N+M)
。 其他更简单的算法可能是O([文本长度]*[查询长度])
或O(N*M)
。 忽略多个变量是我在算法分析中看到的最常见的疏忽之一,并且可能会在设计算法时给您带来障碍。整个故事
请记住,大 O 并不是故事的全部。 您可以通过使用缓存、使它们忽略缓存、通过使用 RAM 而不是磁盘来避免瓶颈、使用并行化或提前完成工作来显着加快某些算法的速度 - 这些技术通常独立增长顺序“big-O”表示法,尽管您经常会在并行算法的 big-O 表示法中看到核心数量。
另请记住,由于程序的隐藏约束,您可能并不真正关心渐近行为。 您可能正在使用有限数量的值,例如:
O(N log(N))
快速排序; 您想使用插入排序,它恰好在小输入上表现良好。 这些情况通常出现在分而治之的算法中,将问题分解为越来越小的子问题,例如递归排序、快速傅立叶变换或矩阵乘法。在实践中,即使在具有相同或相似渐近性能的算法中,它们的相对优点实际上可能是由其他因素驱动的,例如:其他性能因素(快速排序和归并排序都是
O(N log(N))< /code>,但快速排序利用了 CPU 缓存); 非性能考虑因素,例如易于实施; 图书馆是否可用,以及该图书馆的信誉和维护情况如何。
与 2GHz 计算机相比,程序在 500MHz 计算机上运行速度也会更慢。 我们并不真正将其视为资源界限的一部分,因为我们考虑的是机器资源(例如每个时钟周期)的扩展,而不是每秒的扩展。 但是,有一些类似的事情可能会“秘密”影响性能,例如您是否在模拟下运行,或者编译器是否优化了代码。 这些可能会使一些基本操作花费更长的时间(甚至相对于彼此),甚至渐近地加速或减慢某些操作(甚至相对于彼此)。 不同的实现和/或环境之间的影响可能很小或很大。 你是否会切换语言或机器来完成那些额外的工作? 这取决于一百个其他原因(必要性、技能、同事、程序员的生产力、时间的货币价值、熟悉程度、解决方法、为什么不使用汇编或 GPU 等),这些原因可能比性能更重要。
上述问题,例如选择使用哪种编程语言的影响,几乎从未被视为恒定因素的一部分(也不应该被视为); 然而人们应该意识到它们,因为有时(尽管很少)它们可能会影响事物。 例如,在 cpython 中,对于您选择插入或查找,本机优先级队列实现渐近非最优(
O(log(N))
而不是O(1)
-分钟); 你使用另一个实现吗? 可能不会,因为 C 实现可能更快,并且其他地方可能存在其他类似问题。 需要权衡; 有时它们很重要,有时则不重要。(编辑:“简单的英语”解释到此结束。)
数学附录
为了完整起见,大 O 表示法的精确定义如下:
f(x) ∈ O(g(x))
表示“f 渐近上界为 const*g”:忽略 x 的某个有限值以下的所有内容,存在一个常量使得| f(x)| f(x)| ≤ const * |g(x)|
。 (其他符号如下:就像O
表示≤,Ω
表示≥。还有小写变体:o
表示<,和ω
表示 >。)f(x) ε Ɵ(g(x))
表示f(x) ε O(g(x))< /code> 和
f(x) ∈ Ω(g(x))
(以 g 为上限和下限):存在一些常数,使得 f 始终位于两者之间的“带”内const1*g(x)
和const2*g(x)
。 这是您可以做出的最强渐近语句,大致相当于==
。 (抱歉,为了清楚起见,我选择推迟到现在才提及绝对值符号;特别是因为我从未在计算机科学背景中见过负值。)人们经常使用
= O(...)
,这可能是更正确的“comp-sci”表示法,并且完全可以合法使用; “f = O(...)”被读作“f 是阶... / f 是 xxx-界于...”并且被认为是“f 是渐近线为...的某个表达式”。 我被教导要使用更严格的ε O(...)
。ε
表示“是……的一个元素”(仍读作以前的内容)。 在这种特殊情况下,O(N²)
包含类似 {2 N²
、3 N²
、1/2 N²
的元素>,2 N² + log(N)
,- N² + N^1.9
, ...} 并且无限大,但它仍然是一个集合。O 和 Ω 不对称(n = O(n²),但 n² 不是 O(n)),但 Ɵ 是对称的,并且(因为这些关系都是传递和自反的)Ɵ 因此是对称且传递和自反的,因此将所有函数集划分为等价类。 等价类是我们认为相同的一组事物。 也就是说,给定你能想到的任何函数,你都可以找到该类的规范/唯一的“渐近代表”(通常通过取极限......我认为); 就像你可以将所有整数分组为奇数或偶数一样,你可以将所有带有 Ɵ 的函数分组为 x-ish、log(x)^2-ish 等...通过基本上忽略较小的项(但有时你可能会陷入困境更复杂的函数本身是单独的类)。
= 表示法可能是更常见的一种,甚至被世界著名的计算机科学家在论文中使用。 此外,通常情况下,在休闲环境中,人们在表达
Ɵ(...)
时会说O(...)
; 这在技术上是正确的,因为事物集合Ɵ(exactlyThis)
是O(noGreaterThanThis)
的子集...并且更容易键入。 ;-)Big-O notation (also called "asymptotic growth" notation) is what functions "look like" when you ignore constant factors and stuff near the origin. We use it to talk about how thing scale.
Basics
for "sufficiently" large inputs...
f(x) ∈ O(upperbound)
meansf
"grows no faster than"upperbound
f(x) ∈ Ɵ(justlikethis)
meanf
"grows exactly like"justlikethis
f(x) ∈ Ω(lowerbound)
meansf
"grows no slower than"lowerbound
big-O notation doesn't care about constant factors: the function
9x²
is said to "grow exactly like"10x²
. Neither does big-O asymptotic notation care about non-asymptotic stuff ("stuff near the origin" or "what happens when the problem size is small"): the function10x²
is said to "grow exactly like"10x² - x + 2
.Why would you want to ignore the smaller parts of the equation? Because they become completely dwarfed by the big parts of the equation as you consider larger and larger scales; their contribution becomes dwarfed and irrelevant. (See example section.)
Put another way, it's all about the ratio as you go to infinity. If you divide the actual time it takes by the
O(...)
, you will get a constant factor in the limit of large inputs. Intuitively this makes sense: functions "scale like" one another if you can multiply one to get the other. That is when we say...... this means that for "large enough" problem sizes N (if we ignore stuff near the origin), there exists some constant (e.g. 2.5, completely made up) such that:
There are many choices of constant; often the "best" choice is known as the "constant factor" of the algorithm... but we often ignore it like we ignore non-largest terms (see Constant Factors section for why they don't usually matter). You can also think of the above equation as a bound, saying "In the worst-case scenario, the time it takes will never be worse than roughly
N*log(N)
, within a factor of 2.5 (a constant factor we don't care much about)".In general,
O(...)
is the most useful one because we often care about worst-case behavior. Iff(x)
represents something "bad" like the processor or memory usage, then "f(x) ∈ O(upperbound)
" means "upperbound
is the worst-case scenario of processor/memory usage".Applications
As a purely mathematical construct, big-O notation is not limited to talking about processing time and memory. You can use it to discuss the asymptotics of anything where scaling is meaningful, such as:
N
people at a party (Ɵ(N²)
, specificallyN(N-1)/2
, but what matters is that it "scales like"N²
)Example
For the handshake example above, everyone in a room shakes everyone else's hand. In that example,
#handshakes ∈ Ɵ(N²)
. Why?Back up a bit: the number of handshakes is exactly n-choose-2 or
N*(N-1)/2
(each of N people shakes the hands of N-1 other people, but this double-counts handshakes so divide by 2):However, for very large numbers of people, the linear term
N
is dwarfed and effectively contributes 0 to the ratio (in the chart: the fraction of empty boxes on the diagonal over total boxes gets smaller as the number of participants becomes larger). Therefore the scaling behavior isorder N²
, or the number of handshakes "grows like N²".It's as if the empty boxes on the diagonal of the chart (N*(N-1)/2 checkmarks) weren't even there (N2 checkmarks asymptotically).
(temporary digression from "plain English":) If you wanted to prove this to yourself, you could perform some simple algebra on the ratio to split it up into multiple terms (
lim
means "considered in the limit of", just ignore it if you haven't seen it, it's just notation for "and N is really really big"):tl;dr: The number of handshakes 'looks like' x² so much for large values, that if we were to write down the ratio #handshakes/x², the fact that we don't need exactly x² handshakes wouldn't even show up in the decimal for an arbitrarily large while.
Building Intuition
This lets us make statements like...
[for the mathematically inclined, you can mouse over the spoilers for minor sidenotes]
(with credit to https://stackoverflow.com/a/487292/711085 )
(technically the constant factor could maybe matter in some more esoteric examples, but I've phrased things above (e.g. in log(N)) such that it doesn't)
These are the bread-and-butter orders of growth that programmers and applied computer scientists use as reference points. They see these all the time. (So while you could technically think "Doubling the input makes an O(√N) algorithm 1.414 times slower," it's better to think of it as "this is worse than logarithmic but better than linear".)
Constant factors
Usually, we don't care what the specific constant factors are, because they don't affect the way the function grows. For example, two algorithms may both take
O(N)
time to complete, but one may be twice as slow as the other. We usually don't care too much unless the factor is very large since optimizing is tricky business ( When is optimisation premature? ); also the mere act of picking an algorithm with a better big-O will often improve performance by orders of magnitude.Some asymptotically superior algorithms (e.g. a non-comparison
O(N log(log(N)))
sort) can have so large a constant factor (e.g.100000*N log(log(N))
), or overhead that is relatively large likeO(N log(log(N)))
with a hidden+ 100*N
, that they are rarely worth using even on "big data".Why O(N) is sometimes the best you can do, i.e. why we need datastructures
O(N)
algorithms are in some sense the "best" algorithms if you need to read all your data. The very act of reading a bunch of data is anO(N)
operation. Loading it into memory is usuallyO(N)
(or faster if you have hardware support, or no time at all if you've already read the data). However, if you touch or even look at every piece of data (or even every other piece of data), your algorithm will takeO(N)
time to perform this looking. No matter how long your actual algorithm takes, it will be at leastO(N)
because it spent that time looking at all the data.The same can be said for the very act of writing. All algorithms which print out N things will take N time because the output is at least that long (e.g. printing out all permutations (ways to rearrange) a set of N playing cards is factorial:
O(N!)
(which is why in those cases, good programs will ensure an iteration uses O(1) memory and doesn't print or store every intermediate step)).This motivates the use of data structures: a data structure requires reading the data only once (usually
O(N)
time), plus some arbitrary amount of preprocessing (e.g.O(N)
orO(N log(N))
orO(N²)
) which we try to keep small. Thereafter, modifying the data structure (insertions/deletions/ etc.) and making queries on the data take very little time, such asO(1)
orO(log(N))
. You then proceed to make a large number of queries! In general, the more work you're willing to do ahead of time, the less work you'll have to do later on.For example, say you had the latitude and longitude coordinates of millions of road segments and wanted to find all street intersections.
O(N)
work only once, but if you want to do it many times (in this case,N
times, once for each segment), we'd have to doO(N²)
work, or 1000000²=1000000000000 operations. Not good (a modern computer can perform about a billion operations per second).O(N)
time. Thereafter, it only takes constant time on average to look up something by its key (in this case, our key is the latitude and longitude coordinates, rounded into a grid; we search the adjacent gridspaces of which there are only 9, which is a constant).O(N²)
to a manageableO(N)
, and all we had to do was pay a minor cost to make a hash table.The moral of the story: a data structure lets us speed up operations. Even more, advanced data structures can let you combine, delay, or even ignore operations in incredibly clever ways. Different problems would have different analogies, but they'd all involve organizing the data in a way that exploits some structure we care about, or which we've artificially imposed on it for bookkeeping. We do work ahead of time (basically planning and organizing), and now repeated tasks are much much easier!
Practical example: visualizing orders of growth while coding
Asymptotic notation is, at its core, quite separate from programming. Asymptotic notation is a mathematical framework for thinking about how things scale and can be used in many different fields. That said... this is how you apply asymptotic notation to coding.
The basics: Whenever we interact with every element in a collection of size A (such as an array, a set, all keys of a map, etc.), or perform A iterations of a loop, that is a multiplicative factor of size A. Why do I say "a multiplicative factor"?--because loops and functions (almost by definition) have multiplicative running time: the number of iterations, times work done in the loop (or for functions: the number of times you call the function, times work done in the function). (This holds if we don't do anything fancy, like skip loops or exit the loop early, or change control flow in the function based on arguments, which is very common.) Here are some examples of visualization techniques, with accompanying pseudocode.
(here, the
x
s represent constant-time units of work, processor instructions, interpreter opcodes, whatever)Example 2:
Example 3:
If we do something slightly complicated, you might still be able to imagine visually what's going on:
Here, the smallest recognizable outline you can draw is what matters; a triangle is a two dimensional shape (0.5 A^2), just like a square is a two-dimensional shape (A^2); the constant factor of two here remains in the asymptotic ratio between the two, however, we ignore it like all factors... (There are some unfortunate nuances to this technique I don't go into here; it can mislead you.)
Of course this does not mean that loops and functions are bad; on the contrary, they are the building blocks of modern programming languages, and we love them. However, we can see that the way we weave loops and functions and conditionals together with our data (control flow, etc.) mimics the time and space usage of our program! If time and space usage becomes an issue, that is when we resort to cleverness and find an easy algorithm or data structure we hadn't considered, to reduce the order of growth somehow. Nevertheless, these visualization techniques (though they don't always work) can give you a naive guess at a worst-case running time.
Here is another thing we can recognize visually:
We can just rearrange this and see it's O(N):
Or maybe you do log(N) passes of the data, for O(N*log(N)) total time:
Unrelatedly but worth mentioning again: If we perform a hash (e.g. a dictionary/hashtable lookup), that is a factor of O(1). That's pretty fast.
If we do something very complicated, such as with a recursive function or divide-and-conquer algorithm,
you can use the Master Theorem (usually works), or in ridiculous cases the Akra-Bazzi Theorem (almost always works)you look up the running time of your algorithm on Wikipedia.But, programmers don't think like this because eventually, algorithm intuition just becomes second nature. You will start to code something inefficient and immediately think "am I doing something grossly inefficient?". If the answer is "yes" AND you foresee it actually mattering, then you can take a step back and think of various tricks to make things run faster (the answer is almost always "use a hashtable", rarely "use a tree", and very rarely something a bit more complicated).
Amortized and average-case complexity
There is also the concept of "amortized" and/or "average case" (note that these are different).
Average Case: This is no more than using big-O notation for the expected value of a function, rather than the function itself. In the usual case where you consider all inputs to be equally likely, the average case is just the average of the running time. For example with quicksort, even though the worst-case is
O(N^2)
for some really bad inputs, the average case is the usualO(N log(N))
(the really bad inputs are very small in number, so few that we don't notice them in the average case).Amortized Worst-Case: Some data structures may have a worst-case complexity that is large, but guarantee that if you do many of these operations, the average amount of work you do will be better than worst-case. For example, you may have a data structure that normally takes constant
O(1)
time. However, occasionally it will 'hiccup' and takeO(N)
time for one random operation, because maybe it needs to do some bookkeeping or garbage collection or something... but it promises you that if it does hiccup, it won't hiccup again for N more operations. The worst-case cost is stillO(N)
per operation, but the amortized cost over many runs isO(N)/N
=O(1)
per operation. Because the big operations are sufficiently rare, the massive amount of occasional work can be considered to blend in with the rest of the work as a constant factor. We say the work is "amortized" over a sufficiently large number of calls that it disappears asymptotically.Comparison between average-case and amortized worst-case:
Though, if you're reasonably worried about an attacker, there are many other algorithmic attack vectors to worry about besides amortization and average-case.)
Both average-case and amortization are incredibly useful tools for thinking about and designing with scaling in mind.
(See Difference between average case and amortized analysis if interested in this subtopic.)
Multidimensional big-O
Most of the time, people don't realize that there's more than one variable at work. For example, in a string-search algorithm, your algorithm may take time
O([length of text] + [length of query])
, i.e. it is linear in two variables likeO(N+M)
. Other more naive algorithms may beO([length of text]*[length of query])
orO(N*M)
. Ignoring multiple variables is one of the most common oversights I see in algorithm analysis, and can handicap you when designing an algorithm.The whole story
Keep in mind that big-O is not the whole story. You can drastically speed up some algorithms by using caching, making them cache-oblivious, avoiding bottlenecks by working with RAM instead of disk, using parallelization, or doing work ahead of time -- these techniques are often independent of the order-of-growth "big-O" notation, though you will often see the number of cores in the big-O notation of parallel algorithms.
Also keep in mind that due to hidden constraints of your program, you might not really care about asymptotic behavior. You may be working with a bounded number of values, for example:
O(N log(N))
quicksort; you want to use insertion sort, which happens to perform well on small inputs. These situations often come up in divide-and-conquer algorithms, where you split up the problem into smaller and smaller subproblems, such as recursive sorting, fast Fourier transforms, or matrix multiplication.In practice, even among algorithms which have the same or similar asymptotic performance, their relative merit may actually be driven by other things, such as: other performance factors (quicksort and mergesort are both
O(N log(N))
, but quicksort takes advantage of CPU caches); non-performance considerations, like ease of implementation; whether a library is available, and how reputable and maintained the library is.Programs will also run slower on a 500MHz computer vs 2GHz computer. We don't really consider this as part of the resource bounds, because we think of the scaling in terms of machine resources (e.g. per clock cycle), not per real second. However, there are similar things which can 'secretly' affect performance, such as whether you are running under emulation, or whether the compiler optimized code or not. These might make some basic operations take longer (even relative to each other), or even speed up or slow down some operations asymptotically (even relative to each other). The effect may be small or large between different implementation and/or environment. Do you switch languages or machines to eke out that little extra work? That depends on a hundred other reasons (necessity, skills, coworkers, programmer productivity, the monetary value of your time, familiarity, workarounds, why not assembly or GPU, etc...), which may be more important than performance.
The above issues, like the effect of the choice of which programming language is used, are almost never considered as part of the constant factor (nor should they be); yet one should be aware of them because sometimes (though rarely) they may affect things. For example in cpython, the native priority queue implementation is asymptotically non-optimal (
O(log(N))
rather thanO(1)
for your choice of insertion or find-min); do you use another implementation? Probably not, since the C implementation is probably faster, and there are probably other similar issues elsewhere. There are tradeoffs; sometimes they matter and sometimes they don't.(edit: The "plain English" explanation ends here.)
Math addenda
For completeness, the precise definition of big-O notation is as follows:
f(x) ∈ O(g(x))
means that "f is asymptotically upper-bounded by const*g": ignoring everything below some finite value of x, there exists a constant such that|f(x)| ≤ const * |g(x)|
. (The other symbols are as follows: just likeO
means ≤,Ω
means ≥. There are lowercase variants:o
means <, andω
means >.)f(x) ∈ Ɵ(g(x))
means bothf(x) ∈ O(g(x))
andf(x) ∈ Ω(g(x))
(upper- and lower-bounded by g): there exists some constants such that f will always lie in the "band" betweenconst1*g(x)
andconst2*g(x)
. It is the strongest asymptotic statement you can make and roughly equivalent to==
. (Sorry, I elected to delay the mention of the absolute-value symbols until now, for clarity's sake; especially because I have never seen negative values come up in a computer science context.)People will often use
= O(...)
, which is perhaps the more correct 'comp-sci' notation, and entirely legitimate to use; "f = O(...)" is read "f is order ... / f is xxx-bounded by ..." and is thought of as "f is some expression whose asymptotics are ...". I was taught to use the more rigorous∈ O(...)
.∈
means "is an element of" (still read as before). In this particular case,O(N²)
contains elements like {2 N²
,3 N²
,1/2 N²
,2 N² + log(N)
,- N² + N^1.9
, ...} and is infinitely large, but it's still a set.O and Ω are not symmetric (n = O(n²), but n² is not O(n)), but Ɵ is symmetric, and (since these relations are all transitive and reflexive) Ɵ, therefore, is symmetric and transitive and reflexive, and therefore partitions the set of all functions into equivalence classes. An equivalence class is a set of things that we consider to be the same. That is to say, given any function you can think of, you can find a canonical/unique 'asymptotic representative' of the class (by generally taking the limit... I think); just like you can group all integers into odds or evens, you can group all functions with Ɵ into x-ish, log(x)^2-ish, etc... by basically ignoring smaller terms (but sometimes you might be stuck with more complicated functions which are separate classes unto themselves).
The
=
notation might be the more common one and is even used in papers by world-renowned computer scientists. Additionally, it is often the case that in a casual setting, people will sayO(...)
when they meanƟ(...)
; this is technically true since the set of thingsƟ(exactlyThis)
is a subset ofO(noGreaterThanThis)
... and it's easier to type. ;-)编辑:快速说明,这几乎肯定会混淆 Big O 符号 (这是一个上限)与Theta 表示法(既是上限又是下限)。 根据我的经验,这实际上是非学术环境中讨论的典型情况。 对于造成的任何混乱,我们深表歉意。
一句话简介:随着你的工作规模越来越大,完成它需要多长时间?
显然,这只是使用“尺寸”作为输入,“所用时间”作为输出——如果你想谈论内存使用等,同样的想法也适用。
下面是一个例子,我们有 N 件想要烘干的 T 恤。 我们假设将它们置于干燥位置的速度非常快(即人与人之间的相互作用可以忽略不计)。 当然,现实生活中的情况并非如此...
在室外使用晾衣绳:假设您有一个无限大的后院,则衣物在 O(1) 时间内晾干。 无论你有多少衣服,它都会获得相同的阳光和新鲜空气,因此尺寸不会影响干燥时间。
使用滚筒烘干机:每次放入 10 件衬衫,一小时后烘干。 (忽略此处的实际数字 — 它们无关紧要。)因此,烘干 50 件衬衫所需的时间大约是烘干 10 件衬衫的 5 倍。
把所有东西都放进通风柜里:如果我们把所有东西都堆成一大堆,然后让一般的温暖来处理,那么中间的衬衫需要很长时间才能变干。 我不想猜测细节,但我怀疑这至少是 O(N^2) - 当您增加洗涤负荷时,干燥时间会增加得更快。
“big O”表示法的一个重要方面是,它没有说明对于给定大小哪种算法更快。 采用哈希表(字符串键,整数值)与对数组(字符串,整数)。 基于字符串在哈希表中查找键或在数组中查找元素更快吗? (即对于数组,“找到字符串部分与给定键匹配的第一个元素。”)哈希表通常是摊销的(~=“平均”)O(1)——一旦设置完毕,应该需要大约在 100 个条目的表中查找条目的时间与在 1,000,000 个条目的表中查找条目的时间相同。 在数组中查找元素(基于内容而不是索引)是线性的,即 O(N) — 平均而言,您将不得不查看一半的条目。
这是否会使哈希表比数组的查找速度更快? 不必要。 如果您有一个非常小的条目集合,那么数组可能会更快 - 您可能能够在计算您正在查看的字符串的哈希码所需的时间内检查所有字符串。 然而,随着数据集变得越来越大,哈希表最终将击败数组。
EDIT: Quick note, this is almost certainly confusing Big O notation (which is an upper bound) with Theta notation (which is both an upper and lower bound). In my experience this is actually typical of discussions in non-academic settings. Apologies for any confusion caused.
In one sentence: As the size of your job goes up, how much longer does it take to complete it?
Obviously that's only using "size" as the input and "time taken" as the output — the same idea applies if you want to talk about memory usage etc.
Here's an example where we have N T-shirts which we want to dry. We'll assume it's incredibly quick to get them in the drying position (i.e. the human interaction is negligible). That's not the case in real life, of course...
Using a washing line outside: assuming you have an infinitely large back yard, washing dries in O(1) time. However much you have of it, it'll get the same sun and fresh air, so the size doesn't affect the drying time.
Using a tumble dryer: you put 10 shirts in each load, and then they're done an hour later. (Ignore the actual numbers here — they're irrelevant.) So drying 50 shirts takes about 5 times as long as drying 10 shirts.
Putting everything in an airing cupboard: If we put everything in one big pile and just let general warmth do it, it will take a long time for the middle shirts to get dry. I wouldn't like to guess at the detail, but I suspect this is at least O(N^2) — as you increase the wash load, the drying time increases faster.
One important aspect of "big O" notation is that it doesn't say which algorithm will be faster for a given size. Take a hashtable (string key, integer value) vs an array of pairs (string, integer). Is it faster to find a key in the hashtable or an element in the array, based on a string? (i.e. for the array, "find the first element where the string part matches the given key.") Hashtables are generally amortised (~= "on average") O(1) — once they're set up, it should take about the same time to find an entry in a 100 entry table as in a 1,000,000 entry table. Finding an element in an array (based on content rather than index) is linear, i.e. O(N) — on average, you're going to have to look at half the entries.
Does this make a hashtable faster than an array for lookups? Not necessarily. If you've got a very small collection of entries, an array may well be faster — you may be able to check all the strings in the time that it takes to just calculate the hashcode of the one you're looking at. As the data set grows larger, however, the hashtable will eventually beat the array.
Big O 描述了函数增长行为的上限,例如当输入变大时程序的运行时间。
示例:
O(n):如果我将输入大小加倍,运行时间也会加倍
O(n2): 如果输入大小加倍,运行时四倍
O(log n):如果输入大小加倍运行时间增加一
O(2n):如果输入大小增加一,则运行时间增加一运行时加倍
输入大小通常是表示输入所需的空间(以位为单位)。
Big O describes an upper limit on the growth behaviour of a function, for example the runtime of a program, when inputs become large.
Examples:
O(n): If I double the input size the runtime doubles
O(n2): If the input size doubles the runtime quadruples
O(log n): If the input size doubles the runtime increases by one
O(2n): If the input size increases by one, the runtime doubles
The input size is usually the space in bits needed to represent the input.
大 O 表示法最常被程序员用作计算(算法)完成所需时间的近似度量,表示为输入集大小的函数。
Big O 对于比较两种算法随着输入数量的增加而扩展的效果非常有用。
更准确地说,Big O 表示法 用于表达函数的渐近行为。 这意味着函数在接近无穷大时的行为方式。
在许多情况下,算法的“O”将落入以下情况之一:
当输入大小向无穷大增加时,Big O 会忽略那些对函数的增长曲线没有有意义贡献的因素。 这意味着与函数相加或相乘的常量将被忽略。
Big O notation is most commonly used by programmers as an approximate measure of how long a computation (algorithm) will take to complete expressed as a function of the size of the input set.
Big O is useful to compare how well two algorithms will scale up as the number of inputs is increased.
More precisely Big O notation is used to express the asymptotic behavior of a function. That means how the function behaves as it approaches infinity.
In many cases the "O" of an algorithm will fall into one of the following cases:
Big O ignores factors that do not contribute in a meaningful way to the growth curve of a function as the input size increases towards infinity. This means that constants that are added to or multiplied by the function are simply ignored.
大O只是用一种常见的方式“表达”自己的一种方式,“运行我的代码需要多少时间/空间?”。
你可能经常会看到O(n)、O(n2)、O(nlogn)等等,这些只是展示的方式; 算法如何改变?
O(n) 意味着大 O 是 n,现在你可能会想,“n 是什么!?” “n”是元素的数量。 想象一下您想要在数组中搜索项目。 您必须查看每个元素并作为“您是正确的元素/项目吗?” 在最坏的情况下,该项目位于最后一个索引,这意味着它花费的时间与列表中的项目一样多,因此为了通用,我们说“哦嘿,n 是一个公平的给定数量的值!” 。
那么你可能会理解“n2”的含义,但更具体地说,考虑一下你有一个简单的、最简单的排序算法; 冒泡排序。 该算法需要遍历整个列表中的每个项目。
我的列表
这里的流程是:
这是 O n2 因为,您需要查看列表中的所有项目,其中有“n”个项目。 对于每个项目,您再次查看所有项目,为了进行比较,这也是“n”,因此对于每个项目,您查看“n”次,意味着 n*n = n2
我希望这样就是你想要的那么简单。
但请记住,大O只是以时间和空间的方式表达自己的一种方式。
Big O is just a way to "Express" yourself in a common way, "How much time / space does it take to run my code?".
You may often see O(n), O(n2), O(nlogn) and so forth, all these are just ways to show; How does an algorithm change?
O(n) means Big O is n, and now you might think, "What is n!?" Well "n" is the amount of elements. Imaging you want to search for an Item in an Array. You would have to look on Each element and as "Are you the correct element/item?" in the worst case, the item is at the last index, which means that it took as much time as there are items in the list, so to be generic, we say "oh hey, n is a fair given amount of values!".
So then you might understand what "n2" means, but to be even more specific, play with the thought you have a simple, the simpliest of the sorting algorithms; bubblesort. This algorithm needs to look through the whole list, for each item.
My list
The flow here would be:
This is O n2 because, you need to look at all items in the list there are "n" items. For each item, you look at all items once more, for comparing, this is also "n", so for every item, you look "n" times meaning n*n = n2
I hope this is as simple as you want it.
But remember, Big O is just a way to experss yourself in the manner of time and space.
Big O 描述了算法的基本缩放性质。
Big O 没有告诉您有关给定算法的大量信息。 它切入要点,仅提供有关算法缩放性质的信息,特别是算法的资源使用(思考时间或内存)如何根据“输入大小”进行缩放。
考虑一下蒸汽机和火箭之间的区别。 它们不仅是同一事物的不同变体(例如,普锐斯发动机与兰博基尼发动机),而且从本质上讲,它们是截然不同的推进系统。 蒸汽发动机可能比玩具火箭更快,但没有蒸汽活塞发动机能够达到轨道运载火箭的速度。 这是因为这些系统在达到给定速度(“输入大小”)所需燃料(“资源使用”)的关系方面具有不同的缩放特性。
为什么这个这么重要? 因为软件处理的问题的大小可能相差高达一万亿倍。 考虑一下。 前往月球所需的速度与人类行走速度之比小于 10,000:1,与软件可能面临的输入大小范围相比,这绝对是很小的。 由于软件可能面临输入大小的天文数字范围,因此算法的 Big O 复杂性(它的基本扩展性质)有可能胜过任何实现细节。
考虑规范排序的例子。 冒泡排序的复杂度为 O(n2),而合并排序的复杂度为 O(n log n)。 假设您有两个排序应用程序,应用程序 A 使用冒泡排序,应用程序 B 使用合并排序,并且假设对于大约 30 个元素的输入大小,应用程序 A 的排序速度比应用程序 B 快 1,000 倍。 如果您不需要对超过 30 个元素进行排序,那么显然您应该更喜欢应用程序 A,因为它在这些输入大小下要快得多。 但是,如果您发现可能必须对一千万个项目进行排序,那么您会期望在这种情况下应用程序 B 实际上比应用程序 A 快数千倍,这完全是由于每种算法的扩展方式所致。
Big O describes the fundamental scaling nature of an algorithm.
There is a lot of information that Big O does not tell you about a given algorithm. It cuts to the bone and gives only information about the scaling nature of an algorithm, specifically how the resource use (think time or memory) of an algorithm scales in response to the "input size".
Consider the difference between a steam engine and a rocket. They are not merely different varieties of the same thing (as, say, a Prius engine vs. a Lamborghini engine) but they are dramatically different kinds of propulsion systems, at their core. A steam engine may be faster than a toy rocket, but no steam piston engine will be able to achieve the speeds of an orbital launch vehicle. This is because these systems have different scaling characteristics with regards to the relation of fuel required ("resource usage") to reach a given speed ("input size").
Why is this so important? Because software deals with problems that may differ in size by factors up to a trillion. Consider that for a moment. The ratio between the speed necessary to travel to the Moon and human walking speed is less than 10,000:1, and that is absolutely tiny compared to the range in input sizes software may face. And because software may face an astronomical range in input sizes there is the potential for the Big O complexity of an algorithm, it's fundamental scaling nature, to trump any implementation details.
Consider the canonical sorting example. Bubble-sort is O(n2) while merge-sort is O(n log n). Let's say you have two sorting applications, application A which uses bubble-sort and application B which uses merge-sort, and let's say that for input sizes of around 30 elements application A is 1,000x faster than application B at sorting. If you never have to sort much more than 30 elements then it's obvious that you should prefer application A, as it is much faster at these input sizes. However, if you find that you may have to sort ten million items then what you'd expect is that application B actually ends up being thousands of times faster than application A in this case, entirely due to the way each algorithm scales.
这是我在解释 Big-O 的常见变体时倾向于使用的简单英语动物寓言。
在所有情况下,与列表中较低的算法相比,更喜欢列表中较高的算法。 然而,迁移到更昂贵的复杂性级别的成本差异很大。
O(1):
没有增长。 无论问题有多大,你都可以在相同的时间内解决它。 这有点类似于广播,无论广播范围内有多少人,在给定距离上广播都需要相同的能量。
O(log n):
这个复杂度与 O(1) 相同,只是稍微差一点。 出于所有实际目的,您可以将其视为非常大的常数缩放。 处理 1000 件物品和处理 10 亿件物品之间的工作量差异仅为六分之一。
O(n):
解决问题的成本与问题的规模成正比。 如果问题的规模加倍,那么解决方案的成本也会加倍。 由于大多数问题都必须以某种方式扫描到计算机中,例如数据输入、磁盘读取或网络流量,因此这通常是一个负担得起的缩放因子。
O(n log n):
这种复杂性与O(n)非常相似强>。 就所有实际目的而言,两者是等效的。 这种复杂程度通常仍被认为是可扩展的。 通过调整假设,一些 O(n log n) 算法可以转换为 O(n) 算法。 例如,限制键的大小可将排序从 O(n log n) 减少到 O(n)。
O(n2):
成长为一个正方形,其中 n 是正方形边长。 这与“网络效应”的增长率相同,即网络中的每个人都可能认识网络中的其他人。 增长是昂贵的。 大多数可扩展的解决方案都无法在不进行大量操作的情况下使用具有这种复杂程度的算法。 这通常也适用于所有其他多项式复杂性 - O(nk)。
O(2n):
不缩放。 你没有希望解决任何不平凡的问题。 对于了解要避免什么以及让专家找到 O(nk) 的近似算法很有用。
Here is the plain English bestiary I tend to use when explaining the common varieties of Big-O
In all cases, prefer algorithms higher up on the list to those lower on the list. However, the cost of moving to a more expensive complexity class varies significantly.
O(1):
No growth. Regardless of how big as the problem is, you can solve it in the same amount of time. This is somewhat analogous to broadcasting where it takes the same amount of energy to broadcast over a given distance, regardless of the number of people that lie within the broadcast range.
O(log n):
This complexity is the same as O(1) except that it's just a little bit worse. For all practical purposes, you can consider this as a very large constant scaling. The difference in work between processing 1 thousand and 1 billion items is only a factor six.
O(n):
The cost of solving the problem is proportional to the size of the problem. If your problem doubles in size, then the cost of the solution doubles. Since most problems have to be scanned into the computer in some way, as data entry, disk reads, or network traffic, this is generally an affordable scaling factor.
O(n log n):
This complexity is very similar to O(n). For all practical purposes, the two are equivalent. This level of complexity would generally still be considered scalable. By tweaking assumptions some O(n log n) algorithms can be transformed into O(n) algorithms. For example, bounding the size of keys reduces sorting from O(n log n) to O(n).
O(n2):
Grows as a square, where n is the length of the side of a square. This is the same growth rate as the "network effect", where everyone in a network might know everyone else in the network. Growth is expensive. Most scalable solutions cannot use algorithms with this level of complexity without doing significant gymnastics. This generally applies to all other polynomial complexities - O(nk) - as well.
O(2n):
Does not scale. You have no hope of solving any non-trivially sized problem. Useful for knowing what to avoid, and for experts to find approximate algorithms which are in O(nk).
Big O 是算法相对于其输入大小使用多少时间/空间的度量。
如果算法的复杂度为 O(n),那么时间/空间将以与其输入相同的速率增加。
如果算法为 O(n2),则时间/空间以其输入平方的速率增加。
等等。
Big O is a measure of how much time/space an algorithm uses relative to the size of its input.
If an algorithm is O(n) then the time/space will increase at the same rate as its input.
If an algorithm is O(n2) then the time/space increase at the rate of its input squared.
and so on.
测量软件程序的速度是非常困难的,当我们尝试时,答案可能非常复杂,并且充满了例外和特殊情况。 这是一个大问题,因为当我们想要比较两个不同的程序以找出哪个“最快”时,所有这些异常和特殊情况都会分散注意力并且毫无帮助。
由于所有这些无益的复杂性,人们试图使用尽可能最小和最不复杂的(数学)表达式来描述软件程序的速度。 这些表达式是非常非常粗略的近似值:尽管运气好的话,它们将捕捉到软件快还是慢的“本质”。
因为它们是近似值,所以我们在表达式中使用字母“O”(Big Oh),作为惯例,向读者表明我们正在过度简化。 (并确保没有人错误地认为该表达在任何方面都是准确的)。
如果你把“Oh”理解为“大约”或“大约”的意思,你就不会错得太离谱。 (我认为大哦的选择可能是一种幽默的尝试)。
这些“Big-Oh”表达式试图做的唯一一件事就是描述当我们增加软件必须处理的数据量时软件会减慢多少。 如果我们需要处理的数据量增加一倍,那么软件是否需要两倍的时间才能完成其工作? 十倍长? 在实践中,您会遇到并需要担心的 big-Oh 表达式的数量非常有限:
好处:
O(1)
Constant:程序采用无论输入有多大,运行时间相同。O(log n)
对数:即使输入大小大幅增加,程序运行时间也只会缓慢增加。坏处:
O(n)
线性:程序运行时间与输入大小成比例增加。O(n^k)
多项式: - 随着输入大小的增加,处理时间变得越来越快 - 作为多项式函数。...以及丑陋的:
O(k^n)
指数 程序运行时间增加得非常快,即使问题的规模适度增加 - 这只是使用指数算法处理小数据集非常实用。O(n!)
阶乘 程序运行时间将比您等待的时间更长,除了最小和看起来最琐碎的数据集之外。It is very difficult to measure the speed of software programs, and when we try, the answers can be very complex and filled with exceptions and special cases. This is a big problem, because all those exceptions and special cases are distracting and unhelpful when we want to compare two different programs with one another to find out which is "fastest".
As a result of all this unhelpful complexity, people try to describe the speed of software programs using the smallest and least complex (mathematical) expressions possible. These expressions are very very crude approximations: Although, with a bit of luck, they will capture the "essence" of whether a piece of software is fast or slow.
Because they are approximations, we use the letter "O" (Big Oh) in the expression, as a convention to signal to the reader that we are making a gross oversimplification. (And to make sure that nobody mistakenly thinks that the expression is in any way accurate).
If you read the "Oh" as meaning "on the order of" or "approximately" you will not go too far wrong. (I think the choice of the Big-Oh might have been an attempt at humour).
The only thing that these "Big-Oh" expressions try to do is to describe how much the software slows down as we increase the amount of data that the software has to process. If we double the amount of data that needs to be processed, does the software need twice as long to finish it's work? Ten times as long? In practice, there are a very limited number of big-Oh expressions that you will encounter and need to worry about:
The good:
O(1)
Constant: The program takes the same time to run no matter how big the input is.O(log n)
Logarithmic: The program run-time increases only slowly, even with big increases in the size of the input.The bad:
O(n)
Linear: The program run-time increases proportionally to the size of the input.O(n^k)
Polynomial: - Processing time grows faster and faster - as a polynomial function - as the size of the input increases.... and the ugly:
O(k^n)
Exponential The program run-time increases very quickly with even moderate increases in the size of the problem - it is only practical to process small data sets with exponential algorithms.O(n!)
Factorial The program run-time will be longer than you can afford to wait for anything but the very smallest and most trivial-seeming datasets.对 Big-O 表示法需求的简单英语解释:
当我们编程时,我们正在尝试解决问题。 我们编写的代码称为算法。 大 O 表示法允许我们以标准化方式比较算法的最坏情况性能。 硬件规格随着时间的推移而变化,硬件的改进可以减少算法运行的时间。 但是更换硬件并不意味着我们的算法会随着时间的推移而更好或改进,因为我们的算法仍然是相同的。 因此,为了让我们能够比较不同的算法,以确定一种算法是否更好,我们使用 Big O 表示法。
大 O 表示法的简单英语解释:
并非所有算法都在相同的时间内运行,并且可能会根据输入中的项目数量而有所不同,这我们将调用n。 基于此,我们考虑最坏的情况分析,或者当n变得越来越大时运行时间的上限。 我们必须知道 n 是什么,因为许多 Big O 符号都引用了它。
A Plain English Explanation of the Need for Big-O Notation:
When we program, we are trying to solve a problem. What we code is called an algorithm. Big O notation allows us to compare the worse case performance of our algorithms in a standardized way. Hardware specs vary over time and improvements in hardware can reduce the time it takes an algorithms to run. But replacing the hardware does not mean our algorithm is any better or improved over time, as our algorithm is still the same. So in order to allow us to compare different algorithms, to determine if one is better or not, we use Big O notation.
A Plain English Explanation of What Big O Notation is:
Not all algorithms run in the same amount of time, and can vary based on the number of items in the input, which we'll call n. Based on this, we consider the worse case analysis, or an upper-bound of the run-time as n get larger and larger. We must be aware of what n is, because many of the Big O notations reference it.
好吧,我的 2 美分。
Big-O,是程序消耗的资源增长率,与问题实例大小有关
资源:可能是总 CPU 时间,也可能是最大 RAM 空间。 默认情况下是指 CPU 时间。
假设问题是“求和”,
问题实例= {5,10,15} ==> 问题实例大小= 3,循环迭代= 3
问题实例= {5,10,15,20,25} ==> 问题实例大小 = 5 循环迭代 = 5
对于大小为“n”的输入,程序以数组中“n”次迭代的速度增长。 因此 Big-O 是 N,表示为 O(n)
假设问题是“查找组合”,
problem-instance= {5,10,15} ==> 问题实例大小 = 3,总迭代次数 = 3*3 = 9
问题实例= {5,10,15,20,25} ==> Problem-instance-size = 5,total-iterations= 5*5 =25
对于大小为“n”的输入,程序以数组中“n*n”次迭代的速度增长。 因此 Big-O 是 N2 ,表示为 O(n2)
Ok, my 2cents.
Big-O, is rate of increase of resource consumed by program, w.r.t. problem-instance-size
Resource : Could be total-CPU time, could be maximum RAM space. By default refers to CPU time.
Say the problem is "Find the sum",
problem-instance= {5,10,15} ==> problem-instance-size = 3, iterations-in-loop= 3
problem-instance= {5,10,15,20,25} ==> problem-instance-size = 5 iterations-in-loop = 5
For input of size "n" the program is growing at speed of "n" iterations in array. Hence Big-O is N expressed as O(n)
Say the problem is "Find the Combination",
problem-instance= {5,10,15} ==> problem-instance-size = 3, total-iterations = 3*3 = 9
problem-instance= {5,10,15,20,25} ==> problem-instance-size = 5, total-iterations= 5*5 =25
For input of size "n" the program is growing at speed of "n*n" iterations in array. Hence Big-O is N2 expressed as O(n2)
一个简单直接的答案可以是:
大 O 代表该算法最糟糕的可能时间/空间。 该算法永远不会占用超过该限制的更多空间/时间。 大O代表极端情况下的时间/空间复杂度。
A simple straightforward answer can be:
Big O represents the worst possible time/space for that algorithm. The algorithm will never take more space/time above that limit. Big O represents time/space complexity in the extreme case.
大 O 表示法是一种用空间或运行时间来描述算法上限的方法。 n 是问题中的元素数量(即数组的大小、树中的节点数量等)。我们有兴趣描述 n 变大时的运行时间。
当我们说某个算法的复杂度为 O(f(n)) 时,我们是说该算法的运行时间(或所需空间)总是低于某个常数倍 f(n)。
说二分查找的运行时间为 O(logn) 就是说存在某个常数 c,您可以将它乘以 log(n),该常数将始终大于二分查找的运行时间。 在这种情况下,您将始终有一些 log(n) 比较的常数因子。
换句话说,当 g(n) 是算法的运行时间时,我们说 g(n) = O(f(n)) 当 g(n) <= c*f(n) 当 n > 时 k,其中 c 和 k 是一些常数。
Big O notation is a way of describing the upper bound of an algorithm in terms of space or running time. The n is the number of elements in the the problem (i.e size of an array, number of nodes in a tree, etc.) We are interested in describing the running time as n gets big.
When we say some algorithm is O(f(n)) we are saying that the running time (or space required) by that algorithm is always lower than some constant times f(n).
To say that binary search has a running time of O(logn) is to say that there exists some constant c which you can multiply log(n) by that will always be larger than the running time of binary search. In this case you will always have some constant factor of log(n) comparisons.
In other words where g(n) is the running time of your algorithm, we say that g(n) = O(f(n)) when g(n) <= c*f(n) when n > k, where c and k are some constants.
这样一个美丽简单而简短的问题似乎至少值得一个同样简短的答案,就像学生在辅导期间可能得到的答案一样。
(*在美妙的、无单位的时间感中!)
(**这才是最重要的,因为人们总是想要更多,无论他们活在今天还是明天)
那么,如果 Big O 表示法就是这样的话,它有什么奇妙之处呢?
实际上,Big O 分析非常有用且重要,因为 Big O 将重点完全放在算法自身的复杂性上,而完全忽略任何仅仅是比例常数的东西——比如 JavaScript 引擎、CPU 的速度、互联网连接,以及所有那些很快就会像 Model T 一样过时的东西。 Big O 仅以对生活在现在或未来的人们同等重要的方式关注绩效。
大 O 符号也直接聚焦于计算机编程/工程最重要的原理,这一事实激励所有优秀的程序员不断思考和梦想:超越技术缓慢前进的成果的唯一方法是发明更好的算法。
Such a beautifully simple and short question seems at least to deserve an equally short answer, like a student might receive during tutoring.
( *in a wonderful, unit-free sense of time!)
(**which is what matters, because people will always want more, whether they live today or tomorrow)
Well, what's so wonderful about Big O notation if that's what it does?
Practically speaking, Big O analysis is so useful and important because Big O puts the focus squarely on the algorithm's own complexity and completely ignores anything that is merely a proportionality constant—like a JavaScript engine, the speed of a CPU, your Internet connection, and all those things which become quickly become as laughably outdated as a Model T. Big O focuses on performance only in the way that matters equally as much to people living in the present or in the future.
Big O notation also shines a spotlight directly on the most important principle of computer programming/engineering, the fact which inspires all good programmers to keep thinking and dreaming: the only way to achieve results beyond the slow forward march of technology is to invent a better algorithm.
算法示例(Java):
算法描述:
该算法逐项搜索列表,寻找键,
迭代列表中的每个项目,如果是键则返回 True,
如果循环完成但没有找到键,则返回 False。
Big-O 表示法代表复杂性的上限(时间、空间、..)
要找到时间复杂性的 Big-O:
计算多少最坏情况所需的时间(关于输入大小):
最坏情况:列表中不存在该键。
时间(最坏情况)= 4n+1
时间:O(4n+1) = O(n) | 在 Big-O 中,常量被忽略
O(n) ~ Linear
还有 Big-Omega,代表 Best-Case 的复杂度:
Best-Case:关键是第一项。
时间(最佳情况)= 4
时间:Ω(4) = O(1) ~ 瞬时\常数
Algorithm example (Java):
Algorithm description:
This algorithm searches a list, item by item, looking for a key,
Iterating on each item in the list, if it's the key then return True,
If the loop has finished without finding the key, return False.
Big-O notation represents the upper-bound on the Complexity (Time, Space, ..)
To find The Big-O on Time Complexity:
Calculate how much time (regarding input size) the worst case takes:
Worst-Case: the key doesn't exist in the list.
Time(Worst-Case) = 4n+1
Time: O(4n+1) = O(n) | in Big-O, constants are neglected
O(n) ~ Linear
There's also Big-Omega, which represent the complexity of the Best-Case:
Best-Case: the key is the first item.
Time(Best-Case) = 4
Time: Ω(4) = O(1) ~ Instant\Constant
大 O 表示法是一种描述算法在给定任意数量的输入参数(我们称之为“n”)的情况下运行速度的方法。 它在计算机科学中很有用,因为不同的机器以不同的速度运行,简单地说算法需要 5 秒并不能告诉你太多信息,因为虽然你可能正在运行带有 4.5 Ghz 八核处理器的系统,但我可能正在运行一个已有 15 年历史的 800 Mhz 系统,无论算法如何,都可能需要更长的时间。 因此,我们不是用时间来指定算法的运行速度,而是用输入参数的数量或“n”来表示算法的运行速度。 通过以这种方式描述算法,我们可以比较算法的速度,而不必考虑计算机本身的速度。
Big O notation is a way of describing how quickly an algorithm will run given an arbitrary number of input parameters, which we'll call "n". It is useful in computer science because different machines operate at different speeds, and simply saying that an algorithm takes 5 seconds doesn't tell you much because while you may be running a system with a 4.5 Ghz octo-core processor, I may be running a 15 year old, 800 Mhz system, which could take longer regardless of the algorithm. So instead of specifying how fast an algorithm runs in terms of time, we say how fast it runs in terms of number of input parameters, or "n". By describing algorithms in this way, we are able to compare the speeds of algorithms without having to take into account the speed of the computer itself.
Big O
f(x) = O(g(x)) 当 x 转到 a 时(例如,a = +∞)意味着存在一个函数k,使得:
f(x) = k(x)g (x)
k 在某个邻域内有界a(如果 a = +∞,这意味着存在数字 N 和 M,使得对于每个 x > N,|k(x)| < M)。
换句话说,用简单的英语来说:f(x) = O(g(x)), x → a,意味着在 a 的邻域中, f 分解为 g 和某个有界函数的乘积。
小o
顺便说一下,这里是为了比较小o的定义。
f(x) = o(g(x)) 当 x 转到 a 时,意味着存在一个函数 k 使得:
f(x) = k(x)g(x)
k当 x 变为 a 时,(x) 变为 0。
示例
sin x = O(x),当 x → 0 时。
sin x = O(1),当 x → 0 时x → +∞,
x2 + x = O(x) 当 x → 0 时,
x2 + x = O(x2) 当 x → +∞,
ln(x) = o(x) = O(x),当 x → +无穷大时。
注意! 等号“=”使用了“假等式”:o(g(x)) = O(g(x)) 为真,但 O( g(x)) = o(g(x))。 类似地,写“ln(x) = o(x) when x → +∞”是可以的,但公式“o(x) = ln(x)”就没有意义了。
更多示例
O(1) = O(n) = O(n2) 当 n → +∞ 时(但反之则不然,等式为"fake"),
O(n) + O(n2) = O(n2) 当 n → +∞
O(O(n2)) = O( n2) 当 n → +∞
O(n2)O(n3) = O(n5) 当 n → +∞
这是维基百科文章: < a href="https://en.wikipedia.org/wiki/Big_O_notation">https://en.wikipedia.org/wiki/Big_O_notation
Big O
f(x) = O(g(x)) when x goes to a (for example, a = +∞) means that there is a function k such that:
f(x) = k(x)g(x)
k is bounded in some neighborhood of a (if a = +∞, this means that there are numbers N and M such that for every x > N, |k(x)| < M).
In other words, in plain English: f(x) = O(g(x)), x → a, means that in a neighborhood of a, f decomposes into the product of g and some bounded function.
Small o
By the way, here is for comparison the definition of small o.
f(x) = o(g(x)) when x goes to a means that there is a function k such that:
f(x) = k(x)g(x)
k(x) goes to 0 when x goes to a.
Examples
sin x = O(x) when x → 0.
sin x = O(1) when x → +∞,
x2 + x = O(x) when x → 0,
x2 + x = O(x2) when x → +∞,
ln(x) = o(x) = O(x) when x → +∞.
Attention! The notation with the equal sign "=" uses a "fake equality": it is true that o(g(x)) = O(g(x)), but false that O(g(x)) = o(g(x)). Similarly, it is ok to write "ln(x) = o(x) when x → +∞", but the formula "o(x) = ln(x)" would make no sense.
More examples
O(1) = O(n) = O(n2) when n → +∞ (but not the other way around, the equality is "fake"),
O(n) + O(n2) = O(n2) when n → +∞
O(O(n2)) = O(n2) when n → +∞
O(n2)O(n3) = O(n5) when n → +∞
Here is the Wikipedia article: https://en.wikipedia.org/wiki/Big_O_notation
您想了解有关大 O 的一切吗? 我也是。
因此,要谈论大 O,我将使用只有一个节拍的单词。 每个字一个声音。 小话很快。 你知道这些词,我也知道。我们将使用同一个声音的词。 他们很小。 我相信您会知道我们将使用的所有单词!
现在,我们来谈谈工作吧。 大多数时候,我不喜欢工作。 你喜欢工作吗? 也许你会这样做,但我确信我不会。
我不喜欢去上班。 我不喜欢把时间花在工作上。 如果可以的话,我只想玩,做有趣的事情。 你是不是也和我有一样的感觉呢?
现在有时,我确实必须去上班。 这很悲伤,但却是事实。 所以,当我工作时,我有一个规则:我尽量少做工作。 尽我所能,几乎没有工作。 那我去玩啦!
所以这是一个大消息:大O可以帮助我不做工作! 如果我了解大O,我可以有更多的时间玩耍。更少的工作,更多的玩耍! 这就是大O帮助我做的事情。
现在我还有一些工作。 我有这个清单:一、二、三、四、五、六。 我必须将所有内容添加到此列表中。
哇,我讨厌工作。 但是哦,好吧,我必须这样做。 那么我就开始吧。
一加二等于三……加三等于六……四等于……我不知道。 我迷路了。 这对我来说太难了。 我不太喜欢这种工作。
所以我们不要做这项工作。 让你和我想想这有多难。 我需要做多少工作才能将六个数字相加?
好吧,走着瞧。 我必须将一和二相加,然后将其与三相加,然后将其与四相加……总而言之,我数了六个相加。 我必须做六次添加才能解决这个问题。
大O来了,告诉我们这个数学有多难。
大O说:我们必须做六次添加才能解决这个问题。 从一到六的每一项都加一。 六小部分工作……每一项工作都是一项附加工作。
好吧,我现在不会做添加它们的工作。 但我知道这会有多难。 这将是六个添加。
哦不,现在我还有更多的工作。 谢什。 谁制造这种东西?!
现在他们要我从一加到十! 我为什么要这么做? 我不想将一加到六。 从一加到十……嗯……那就更难了!
还要困难多少? 我还需要做多少工作? 我需要更多还是更少的步骤?
好吧,我想我必须做十次添加……从一到十的每件事都添加一个。 十比六还多。 从一加到十,我需要付出更多的努力,而不是从一加到六!
我现在不想添加。 我只是想想想添加这么多可能有多困难。 而且,我希望能尽快参加比赛。
从一加到六需要一些工作。 但是您是否发现,从一加到十,这需要更多的工作?
大O是你的朋友,也是我的朋友。 大 O 帮助我们思考我们需要做多少工作,以便我们可以制定计划。 而且,如果我们和大O是朋友,他可以帮助我们选择不那么难的工作!
现在我们必须做新的工作。 不好了。 我根本不喜欢这种工作。
新的工作是:把所有的东西从一加到n。
等待! n是什么? 我错过了吗? 如果你不告诉我n是什么,我怎么能把1加到n呢?
好吧,我不知道 n 是什么。 没人告诉我。 你是吗? 不? 那好吧。 所以我们无法完成这项工作。 呼。
但是,尽管我们现在不做这项工作,但我们可以猜测,如果我们知道 n,它会有多难。 我们必须将 n 项相加,对吗? 当然!
现在大O来了,他会告诉我们这项工作有多辛苦。 他说:把所有的东西从一到N,一一相加,是O(n)。 要添加所有这些内容,[我知道我必须添加 n 次。][1] 那是个大 O! 他告诉我们做某种工作有多么困难。
对我来说,我认为大O就像一个大而缓慢的老板。 他想着工作,但他不去做。 他可能会说:“这项工作很快。” 或者,他可能会说:“这项工作又慢又辛苦!” 但他不做这项工作。 他只是看一下作品,然后告诉我们可能需要多长时间。
我很关心大O。为什么? 我不喜欢工作! 没有人喜欢工作。 这就是为什么我们都喜欢大O! 他告诉我们我们可以多快地工作。 他帮助我们思考工作有多么辛苦。
呃哦,还有更多工作。 现在,我们不做这项工作。 但是,让我们制定一个计划,一步一步地去做。
他们给了我们一副十张牌。 它们全都混在一起了:七、四、二、六……一点也不直。 现在...我们的工作是对它们进行分类。
呃。 听起来工作量很大!
我们如何对这套牌进行排序? 我有个计划。
我会从头到尾,一对一对地浏览整副牌。 如果一对中的第一张牌很大而下一张牌很小,我会交换它们。 否则,我会去下一对,依此类推……很快,牌组就完成了。
当这副牌完成后,我会问:我在那一轮中交换了牌吗? 如果是这样,我必须从头开始再做一次。
在某些时候,在某些时候,不会有交换,我们的套牌就会完成。 这么多工作!
那么,按照这些规则对卡片进行排序需要多少工作量呢?
我有十张卡。 而且,大多数时候——也就是说,如果我运气不好的话——我必须浏览整副牌最多十次,每次浏览这副牌时最多交换十张牌。
大O,救救我吧!
大 O 进来并说:对于一副 n 张牌,以这种方式排序将在 O(N 平方) 时间内完成。
为什么他说n平方?
嗯,你知道 n 的平方是 n 乘以 n。 现在,我明白了:检查了n张牌,最多可能是n次通过这副牌。 这是两个循环,每个循环有 n 个步骤。 需要完成的工作量是n平方的。 肯定有很多工作!
现在,当大 O 说需要 O(n 平方) 的工作时,他并不是说 n 平方加起来。 对于某些情况,可能会少一些。 但在最坏的情况下,排序这副牌需要花费近 n 平方步的工作量。
现在,大 O 就是我们的朋友了。
Big O 指出:随着 n 变大,当我们对卡片进行排序时,这项工作比旧的仅添加这些东西的工作要困难得多。 我们怎么知道呢?
好吧,如果 n 变得非常大,我们并不关心我们可能会向 n 或 n 的平方添加什么。
对于大 n,n 的平方比 n 大。
大O告诉我们,对事物进行排序比添加事物更困难。 对于大 n,O(n 平方) 大于 O(n)。 这意味着:如果 n 变得非常大,那么对 n 个混合的东西进行排序必须花费更多的时间,而不是仅仅添加 n 个混合的东西。
大O并没有帮我们解决工作。 大O告诉我们工作有多辛苦。
我有一副纸牌。 我确实对它们进行了排序。 你帮忙了。 谢谢。
有没有更快速的方法来对卡片进行排序? 大O可以帮助我们吗?
是的,还有更快捷的方法! 学习需要一些时间,但它确实有效……而且速度相当快。 你也可以尝试一下,但每一步都要慢慢来,不要失去你的位置。
在这种对一副牌进行排序的新方法中,我们不再像以前那样检查成对的牌。 以下是对这套牌进行排序的新规则:
一:我在我们现在处理的牌组部分中选择一张牌。 如果你喜欢的话可以给我选一款。 (当然,我们第一次这样做时,“我们现在处理的牌组的一部分”是整个牌组。)
二:我将牌组展开在您选择的那张牌上。 这是什么? 我该如何发挥? 好吧,我从起始牌开始,一张一张地往下看,然后寻找一张比展开牌更高的牌。
三:我从底牌开始,寻找一张比八张牌低的牌。
一旦我找到了这两张牌,我就会交换它们,然后继续寻找更多的牌来交换。 也就是说,我回到第二步,并在您选择的牌上再打一些。
在某个时刻,这个循环(从二到三)将会结束。 当搜索的两半在展开卡处相遇时,它就结束了。 然后,我们刚刚用您在第一步中选择的牌展开了牌组。 现在,所有接近开始的牌都比八张牌低; 并且接近末尾的牌比八字牌更高。 很酷的技巧!
四(这是有趣的部分):我现在有两副小牌,一副比八张牌低,一副比牌高。 现在我进入第一步,在每个小甲板上! 也就是说,我从第一个小牌组的第一步开始,当这项工作完成后,我从下一个小牌组的第一步开始。
我把牌组分成几个部分,然后对每个部分进行分类,越来越小,有时我没有更多的工作要做。 现在,考虑到所有规则,这可能看起来很慢。 但相信我,它一点也不慢。 这比第一种排序方法要少得多!
这种叫什么? 这就是所谓的快速排序! 这种排序是由一个名叫 CAR Hoare 的人做出的,他称之为快速排序。 现在,快速排序一直被使用!
快速排序将大牌分解为小牌。 也就是说,它把大任务分解为小任务。
嗯。 我想这里面可能有一个规律。 要将大任务变小,请将它们分解。
这种排序相当快。 多快? 大 O 告诉我们:在平均情况下,这种排序需要 O(n log n) 的工作才能完成。
它比第一种排序快还是慢? 大O,请帮忙!
第一种排序的时间复杂度为 O(n 平方)。 但快速排序的复杂度是 O(n log n)。 您知道对于大 n,n log n 小于 n 平方,对吧? 嗯,这就是我们知道快速排序很快的原因!
如果你必须对牌组进行排序,最好的方法是什么? 好吧,你可以做你想做的,但我会选择快速排序。
为什么我选择快速排序? 我当然不喜欢工作! 我希望尽快完成工作。
我怎么知道快速排序的工作量更少? 我知道 O(n log n) 小于 O(n square)。 O 更小,因此快速排序的工作量更少!
现在你认识我的朋友大O了。他帮助我们减少工作量。 如果你认识大O,你也可以做更少的工作!
你和我一起学会了这一切! 你真聪明! 太感谢了!
现在工作完成了,我们去玩吧!
[1]:有一种方法可以作弊,一次性添加从 1 到 n 的所有内容。 一个名叫高斯的孩子在八岁时发现了这一点。 不过我没那么聪明,所以别问我他是怎么做到的。
You want to know all there is to know of big O? So do I.
So to talk of big O, I will use words that have just one beat in them. One sound per word. Small words are quick. You know these words, and so do I. We will use words with one sound. They are small. I am sure you will know all of the words we will use!
Now, let’s you and me talk of work. Most of the time, I do not like work. Do you like work? It may be the case that you do, but I am sure I do not.
I do not like to go to work. I do not like to spend time at work. If I had my way, I would like just to play, and do fun things. Do you feel the same as I do?
Now at times, I do have to go to work. It is sad, but true. So, when I am at work, I have a rule: I try to do less work. As near to no work as I can. Then I go play!
So here is the big news: the big O can help me not to do work! I can play more of the time, if I know big O. Less work, more play! That is what big O helps me do.
Now I have some work. I have this list: one, two, three, four, five, six. I must add all things in this list.
Wow, I hate work. But oh well, I have to do this. So here I go.
One plus two is three… plus three is six... and four is... I don’t know. I got lost. It is too hard for me to do in my head. I don’t much care for this kind of work.
So let's not do the work. Let's you and me just think how hard it is. How much work would I have to do, to add six numbers?
Well, let’s see. I must add one and two, and then add that to three, and then add that to four… All in all, I count six adds. I have to do six adds to solve this.
Here comes big O, to tell us just how hard this math is.
Big O says: we must do six adds to solve this. One add, for each thing from one to six. Six small bits of work... each bit of work is one add.
Well, I will not do the work to add them now. But I know how hard it would be. It would be six adds.
Oh no, now I have more work. Sheesh. Who makes this kind of stuff?!
Now they ask me to add from one to ten! Why would I do that? I did not want to add one to six. To add from one to ten… well… that would be even more hard!
How much more hard would it be? How much more work would I have to do? Do I need more or less steps?
Well, I guess I would have to do ten adds… one for each thing from one to ten. Ten is more than six. I would have to work that much more to add from one to ten, than one to six!
I do not want to add right now. I just want to think on how hard it might be to add that much. And, I hope, to play as soon as I can.
To add from one to six, that is some work. But do you see, to add from one to ten, that is more work?
Big O is your friend and mine. Big O helps us think on how much work we have to do, so we can plan. And, if we are friends with big O, he can help us choose work that is not so hard!
Now we must do new work. Oh, no. I don’t like this work thing at all.
The new work is: add all things from one to n.
Wait! What is n? Did I miss that? How can I add from one to n if you don’t tell me what n is?
Well, I don’t know what n is. I was not told. Were you? No? Oh well. So we can’t do the work. Whew.
But though we will not do the work now, we can guess how hard it would be, if we knew n. We would have to add up n things, right? Of course!
Now here comes big O, and he will tell us how hard this work is. He says: to add all things from one to N, one by one, is O(n). To add all these things, [I know I must add n times.][1] That is big O! He tells us how hard it is to do some type of work.
To me, I think of big O like a big, slow, boss man. He thinks on work, but he does not do it. He might say, "That work is quick." Or, he might say, "That work is so slow and hard!" But he does not do the work. He just looks at the work, and then he tells us how much time it might take.
I care lots for big O. Why? I do not like to work! No one likes to work. That is why we all love big O! He tells us how fast we can work. He helps us think of how hard work is.
Uh oh, more work. Now, let’s not do the work. But, let’s make a plan to do it, step by step.
They gave us a deck of ten cards. They are all mixed up: seven, four, two, six… not straight at all. And now... our job is to sort them.
Ergh. That sounds like a lot of work!
How can we sort this deck? I have a plan.
I will look at each pair of cards, pair by pair, through the deck, from first to last. If the first card in one pair is big and the next card in that pair is small, I swap them. Else, I go to the next pair, and so on and so on... and soon, the deck is done.
When the deck is done, I ask: did I swap cards in that pass? If so, I must do it all once more, from the top.
At some point, at some time, there will be no swaps, and our sort of the deck would be done. So much work!
Well, how much work would that be, to sort the cards with those rules?
I have ten cards. And, most of the time -- that is, if I don’t have lots of luck -- I must go through the whole deck up to ten times, with up to ten card swaps each time through the deck.
Big O, help me!
Big O comes in and says: for a deck of n cards, to sort it this way will be done in O(N squared) time.
Why does he say n squared?
Well, you know n squared is n times n. Now, I get it: n cards checked, up to what might be n times through the deck. That is two loops, each with n steps. That is n squared much work to be done. A lot of work, for sure!
Now when big O says it will take O(n squared) work, he does not mean n squared adds, on the nose. It might be some small bit less, for some case. But in the worst case, it will be near n squared steps of work to sort the deck.
Now here is where big O is our friend.
Big O points out this: as n gets big, when we sort cards, the job gets MUCH MUCH MORE HARD than the old just-add-these-things job. How do we know this?
Well, if n gets real big, we do not care what we might add to n or n squared.
For big n, n squared is more large than n.
Big O tells us that to sort things is more hard than to add things. O(n squared) is more than O(n) for big n. That means: if n gets real big, to sort a mixed deck of n things MUST take more time, than to just add n mixed things.
Big O does not solve the work for us. Big O tells us how hard the work is.
I have a deck of cards. I did sort them. You helped. Thanks.
Is there a more fast way to sort the cards? Can big O help us?
Yes, there is a more fast way! It takes some time to learn, but it works... and it works quite fast. You can try it too, but take your time with each step and do not lose your place.
In this new way to sort a deck, we do not check pairs of cards the way we did a while ago. Here are your new rules to sort this deck:
One: I choose one card in the part of the deck we work on now. You can choose one for me if you like. (The first time we do this, “the part of the deck we work on now” is the whole deck, of course.)
Two: I splay the deck on that card you chose. What is this splay; how do I splay? Well, I go from the start card down, one by one, and I look for a card that is more high than the splay card.
Three: I go from the end card up, and I look for a card that is more low than the splay card.
Once I have found these two cards, I swap them, and go on to look for more cards to swap. That is, I go back to step Two, and splay on the card you chose some more.
At some point, this loop (from Two to Three) will end. It ends when both halves of this search meet at the splay card. Then, we have just splayed the deck with the card you chose in step One. Now, all the cards near the start are more low than the splay card; and the cards near the end are more high than the splay card. Cool trick!
Four (and this is the fun part): I have two small decks now, one more low than the splay card, and one more high. Now I go to step one, on each small deck! That is to say, I start from step One on the first small deck, and when that work is done, I start from step One on the next small deck.
I break up the deck in parts, and sort each part, more small and more small, and at some time I have no more work to do. Now this may seem slow, with all the rules. But trust me, it is not slow at all. It is much less work than the first way to sort things!
What is this sort called? It is called Quick Sort! That sort was made by a man called C. A. R. Hoare and he called it Quick Sort. Now, Quick Sort gets used all the time!
Quick Sort breaks up big decks in small ones. That is to say, it breaks up big tasks in small ones.
Hmmm. There may be a rule in there, I think. To make big tasks small, break them up.
This sort is quite quick. How quick? Big O tells us: this sort needs O(n log n) work to be done, in the mean case.
Is it more or less fast than the first sort? Big O, please help!
The first sort was O(n squared). But Quick Sort is O(n log n). You know that n log n is less than n squared, for big n, right? Well, that is how we know that Quick Sort is fast!
If you have to sort a deck, what is the best way? Well, you can do what you want, but I would choose Quick Sort.
Why do I choose Quick Sort? I do not like to work, of course! I want work done as soon as I can get it done.
How do I know Quick Sort is less work? I know that O(n log n) is less than O(n squared). The O's are more small, so Quick Sort is less work!
Now you know my friend, Big O. He helps us do less work. And if you know big O, you can do less work too!
You learned all that with me! You are so smart! Thank you so much!
Now that work is done, let’s go play!
[1]: There is a way to cheat and add all the things from one to n, all at one time. Some kid named Gauss found this out when he was eight. I am not that smart though, so don't ask me how he did it.
不确定我是否对这个主题有进一步的贡献,但仍然认为我会分享:我曾经发现 这篇博文 提供了一些非常有用(尽管非常基本)的解释和信息 Big O 的示例:
通过示例,这有助于让我的龟甲般的头骨了解最基本的知识,所以我认为这是一本相当不错的 10 分钟读物,可以让你朝着正确的方向前进。
Not sure I'm further contributing to the subject but still thought I'd share: I once found this blog post to have some quite helpful (though very basic) explanations & examples on Big O:
Via examples, this helped get the bare basics into my tortoiseshell-like skull, so I think it's a pretty descent 10-minute read to get you headed in the right direction.
我有更简单的方法来理解时间复杂度
计算时间复杂度的最常见指标是 Big O 表示法。 这消除了所有常数因素,以便当 N 接近无穷大时,可以根据 N 来估计运行时间。 一般来说,你可以这样想:
是常数。 语句的运行时间不会发生变化,与 N
呈线性关系。 循环的运行时间与 N 成正比。当 N 加倍时,运行时间也会加倍。
是二次方的。 两个循环的运行时间与N的平方成正比。当N加倍时,运行时间增加N * N。
是对数。 算法的运行时间与 N 除以 2 的次数成正比。这是因为算法每次迭代都会将工作区域分成两半。
是 N * log ( N )。 运行时间由 N 个对数循环(迭代或递归)组成,因此该算法是线性和对数的组合。
一般来说,对一维中的每个项目执行某项操作是线性的,对二维中的每个项目执行某项操作是二次方,将工作区域分成两半是对数方。 还有其他大 O 度量,例如三次、指数和平方根,但它们并不常见。 大 O 表示法被描述为 O ( ),其中 是度量。 快速排序算法将被描述为 O ( N * log ( N ) )。
注意:所有这些都没有考虑最佳、平均和最坏情况的衡量标准。 每个都有自己的大 O 表示法。 另请注意,这是一个非常简单的解释。 大 O 是最常见的,但它也比我展示的更复杂。 还有其他符号,例如大 omega、小 o 和大 theta。 在算法分析课程之外,您可能不会遇到它们。
I've more simpler way to understand the time complexity
he most common metric for calculating time complexity is Big O notation. This removes all constant factors so that the running time can be estimated in relation to N as N approaches infinity. In general you can think of it like this:
Is constant. The running time of the statement will not change in relation to N
Is linear. The running time of the loop is directly proportional to N. When N doubles, so does the running time.
Is quadratic. The running time of the two loops is proportional to the square of N. When N doubles, the running time increases by N * N.
Is logarithmic. The running time of the algorithm is proportional to the number of times N can be divided by 2. This is because the algorithm divides the working area in half with each iteration.
Is N * log ( N ). The running time consists of N loops (iterative or recursive) that are logarithmic, thus the algorithm is a combination of linear and logarithmic.
In general, doing something with every item in one dimension is linear, doing something with every item in two dimensions is quadratic, and dividing the working area in half is logarithmic. There are other Big O measures such as cubic, exponential, and square root, but they're not nearly as common. Big O notation is described as O ( ) where is the measure. The quicksort algorithm would be described as O ( N * log ( N ) ).
Note: None of this has taken into account best, average, and worst case measures. Each would have its own Big O notation. Also note that this is a VERY simplistic explanation. Big O is the most common, but it's also more complex that I've shown. There are also other notations such as big omega, little o, and big theta. You probably won't encounter them outside of an algorithm analysis course.
非常快速的注释:
“Big O”中的 O 指的是“Order”(或准确地说“order of”)
所以你可以从字面上理解它的意思,它是用来订购一些东西来比较它们的。
“Big O”做了两件事:
符号
实现了上述两个目的。有七种最常用的符号
1
步骤完成了任务,非常好,订购第一logN
步的任务,很好,订购No.2N
步完成任务,公平,订单 3O(NlogN)
步结束任务,不好,订单号4N^2
步完成任务,很糟糕,订单号 52^N
步完成一个任务,太可怕了,订单号6N!
步完成任务,太糟糕了,7号订单假设您得到符号
O(N^2)
,不仅您很清楚该方法需要 N*N 个步骤来完成任务,而且您还可以从其排名中看到它不如O(NlogN)
。请注意行尾的顺序,以便您更好地理解。如果考虑所有可能性,则有超过 7 个符号。
在计算机科学中,完成任务的一组步骤称为算法。
在术语中,大O表示法用于描述算法的性能或复杂性。
此外,Big O 建立最坏情况或测量上限步骤。
您可以参考 Big-Ω(Big-Omega)以获得最佳情况。
大欧米茄 (Big-Omega) ) 符号(文章)| 可汗学院
摘要
“Big O”描述了算法的性能并对其进行了评估。
或者正式地解决它,“Big O”对算法进行分类并标准化比较过程。
Very Quick Note:
The O in "Big O" refers to as "Order"(or precisely "order of")
so you could get its idea literally that it's used to order something to compare them.
"Big O" does two things:
Notations
.There are seven most used notations
1
step, it's excellent, Ordered No.1logN
steps, its good, Ordered No.2N
steps, its fair, Order No.3O(NlogN)
steps, it's not good, Order No.4N^2
steps, it's bad, Order No.52^N
steps, it's horrible, Order No.6N!
steps, it's terrible, Order No.7Suppose you get notation
O(N^2)
, not only you are clear the method takes N*N steps to accomplish a task, also you see that it's not good asO(NlogN)
from its ranking.Please note the order at line end, just for your better understanding.There's more than 7 notations if all possibilities considered.
In CS, the set of steps to accomplish a task is called algorithms.
In Terminology, Big O notation is used to describe the performance or complexity of an algorithm.
In addition, Big O establishes the worst-case or measure the Upper-Bound steps.
You could refer to Big-Ω (Big-Omega) for best case.
Big-Ω (Big-Omega) notation (article) | Khan Academy
Summary
"Big O" describes the algorithm's performance and evaluates it.
or address it formally, "Big O" classifies the algorithms and standardize the comparison process.
假设我们正在讨论一种算法 A,它应该对大小为 n 的数据集执行某些操作。
那么
O(<涉及n的某个表达式X>)
用简单的英语来说意味着:碰巧,某些函数(将它们视为 X(n) 的实现)往往会经常出现。 这些是众所周知且易于比较的(示例:
1
、Log N
、N
、N^2
、< code>N! 等..)通过在谈论 A 和其他算法时比较这些,很容易根据算法可能<的操作数量对算法进行排名/em>(最坏情况)需要完成。
一般来说,我们的目标是找到或构造一个算法A,使其具有返回尽可能低的数字的函数
X(n)
。Assume we're talking about an algorithm A, which should do something with a dataset of size n.
Then
O( <some expression X involving n> )
means, in simple English:As it happens, there are certain functions (think of them as implementations of X(n)) that tend to occur quite often. These are well known and easily compared (Examples:
1
,Log N
,N
,N^2
,N!
, etc..)By comparing these when talking about A and other algorithms, it is easy to rank the algorithms according to the number of operations they may (worst-case) require to complete.
In general, our goal will be to find or structure an algorithm A in such a way that it will have a function
X(n)
that returns as low a number as possible.