Time complexity of Euclid's Algorithm
I am having difficulty deciding what the time complexity of Euclid's greatest common divisor (GCD) algorithm is. This algorithm in pseudocode is:
function gcd(a, b)
    while b ≠ 0
        t := b
        b := a mod b
        a := t
    return a
It seems to depend on a and b. My thinking is that the time complexity is O(a % b). Is that correct? Is there a better way to write that?
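For reference, the pseudocode above transcribes directly into runnable Python, with the temporary `t` collapsed into a tuple assignment:

```python
def gcd(a, b):
    """Euclid's algorithm, a direct transcription of the pseudocode."""
    while b != 0:
        a, b = b, a % b  # t := b; b := a mod b; a := t
    return a

print(gcd(252, 105))  # 21
```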
One trick for analyzing the time complexity of Euclid's algorithm is to follow what happens over two iterations:

a', b' := a % b, b % (a % b)
Now a and b will both decrease, instead of only one, which makes the analysis easier. You can divide it into cases:

- 2a <= b
- 2b <= a
- 2a > b but a < b
- 2b > a but b < a
- a == b

Now we'll show that every single case decreases the total a+b by at least a quarter:

- 2a <= b: b % (a % b) < a and 2a <= b, so b is decreased by at least half, so a+b is decreased by at least 25%
- 2b <= a: a % b < b and 2b <= a, so a is decreased by at least half, so a+b is decreased by at least 25%
- 2a > b but a < b: b will become b-a, which is less than b/2, decreasing a+b by at least 25%
- 2b > a but b < a: a will become a-b, which is less than a/2, decreasing a+b by at least 25%
- a == b: a+b drops to 0, which is obviously decreasing a+b by at least 25%

Therefore, by case analysis, every double-step decreases a+b by at least 25%. There's a maximum number of times this can happen before a+b is forced to drop below 1. The total number of steps (S) until we hit 0 must satisfy (4/3)^S <= A+B.
Now just work it: from (4/3)^S <= A+B we get S <= log base 4/3 of (A+B), so S is O(log(A+B)).

So the number of iterations is linear in the number of input digits. For numbers that fit into CPU registers, it's reasonable to model the iterations as taking constant time and pretend that the total running time of the gcd is linear.

Of course, if you're dealing with big integers, you must account for the fact that the modulus operations within each iteration don't have a constant cost. Roughly speaking, the total asymptotic runtime is going to be n^2 times a polylogarithmic factor, something like n^2 lg(n) 2^O(log* n). The polylogarithmic factor can be avoided by instead using a binary gcd.
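The double-step claim is easy to check empirically. Below, `double_step` is a small helper (not from the answer) applying the two-iteration update (a, b) -> (a % b, b % (a % b)):

```python
import random

def double_step(a, b):
    """Apply two iterations of Euclid's algorithm:
    (a, b) -> (b, a % b) -> (a % b, b % (a % b))."""
    a, b = b, a % b      # first iteration
    if b != 0:
        a, b = b, a % b  # second iteration
    return a, b

# Empirically check: every double-step shrinks a + b by at least 25%.
for _ in range(10_000):
    a, b = random.randint(1, 10**6), random.randint(1, 10**6)
    a2, b2 = double_step(a, b)
    assert a2 + b2 <= 0.75 * (a + b)
```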
The suitable way to analyze an algorithm is by determining its worst-case scenarios. The Euclidean GCD's worst case occurs when Fibonacci pairs are involved: EGCD(fib[i], fib[i - 1]), where i > 0.

For instance, let's opt for the case where the dividend is 55 and the divisor is 34 (recall that we are still dealing with Fibonacci numbers). As you may notice, this operation cost 8 iterations (or recursive calls). Let's try larger Fibonacci numbers, namely 121393 and 75025. We can notice here as well that it took 24 iterations (or recursive calls).

You can also notice that each iteration yields a Fibonacci number; that's why we have so many operations. Indeed, we can't obtain similar results with anything but Fibonacci numbers.

Hence, the time complexity is going to be represented by small Oh (upper bound) this time. The lower bound is intuitively Omega(1): the case of 500 divided by 2, for instance.

Let's solve the recurrence relation: we may say then that the Euclidean GCD can make log(xy) operations at most.
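The iteration counts quoted above can be reproduced with a small counter. Here `egcd_calls` is a hypothetical helper, not the answer's EGCD:

```python
def egcd_calls(a, b):
    """Count the recursive calls (equivalently, division steps)
    that Euclid's GCD performs."""
    if b == 0:
        return 0
    return 1 + egcd_calls(b, a % b)

print(egcd_calls(55, 34))         # 8
print(egcd_calls(121393, 75025))  # 24
```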
There's a great look at this on the wikipedia article.
It even has a nice plot of complexity for value pairs.

It is not O(a % b). It is known (see the article) that it will never take more steps than five times the number of digits in the smaller number. So the maximum number of steps grows with the number of digits (ln b). The cost of each step also grows with the number of digits, so the complexity is bounded by O(ln^2 b), where b is the smaller number. That's an upper limit; the actual time is usually less.
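The claim that Euclid's algorithm never takes more steps than five times the number of decimal digits of the smaller number (Lamé's theorem) can be spot-checked numerically; `steps` is an assumed helper that counts division steps:

```python
import random

def steps(a, b):
    """Number of division steps Euclid's algorithm performs."""
    n = 0
    while b:
        a, b = b, a % b
        n += 1
    return n

# Randomized spot-check of the five-times-the-digits bound.
for _ in range(10_000):
    a, b = random.randint(1, 10**9), random.randint(1, 10**9)
    assert steps(a, b) <= 5 * len(str(min(a, b)))
```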
See here.

In particular this part:

So O(log min(a, b)) is a good upper bound.
Here's an intuitive understanding of the runtime complexity of Euclid's algorithm. The formal proofs are covered in various texts such as Introduction to Algorithms and TAOCP Vol 2.

First, think about what would happen if we tried to take the gcd of two Fibonacci numbers F(k+1) and F(k). You might quickly observe that Euclid's algorithm iterates on to F(k) and F(k-1). That is, with each iteration we move down one number in the Fibonacci series. As Fibonacci numbers are O(Phi^k), where Phi is the golden ratio, we can see that the runtime of GCD is O(log n), where n = max(a, b) and the log has base Phi. Next, we can prove that this is the worst case by observing that consecutive Fibonacci numbers consistently produce pairs where the remainder remains large enough in each iteration and never becomes zero until you have arrived at the start of the series.

We can make the O(log n), n = max(a, b), bound even tighter. Assume that b >= a, so we can write the bound as O(log b). First, observe that GCD(ka, kb) takes the same number of steps as GCD(a, b). As the biggest value of k is gcd(a, b), we can replace b with b/gcd(a, b) in our runtime, leading to the tighter bound O(log(b/gcd(a, b))).
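Both observations, the Phi-base logarithm on Fibonacci inputs and the step-count invariance under scaling by k, can be checked numerically; `steps` is an assumed helper, not from the answer:

```python
import math

def steps(a, b):
    """Division-step count of Euclid's algorithm."""
    n = 0
    while b:
        a, b = b, a % b
        n += 1
    return n

# Consecutive Fibonacci inputs walk back down the series one number per
# iteration, so the step count is about log base Phi of the input.
phi = (1 + math.sqrt(5)) / 2
a, b = 121393, 75025  # F(26), F(25)
assert abs(steps(a, b) - math.log(a, phi)) < 3

# Scaling both inputs by k scales every remainder by k too,
# so the step count is unchanged.
assert steps(12, 18) == steps(7 * 12, 7 * 18)
```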
Here is the analysis from the book Data Structures and Algorithm Analysis in C by Mark Allen Weiss (second edition, section 2.4.4). The code is the standard iterative gcd shown in the question. The theorem used there is: if M > N, then M mod N < M/2. From this we can make the following inference: after any two consecutive iterations the first argument is at least halved, so the number of iterations is at most 2 log N = O(log N).
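The theorem is easy to sanity-check numerically (a randomized spot-check, not Weiss's proof):

```python
import random

# Spot-check the theorem: if M > N, then M mod N < M / 2.
# If N <= M/2, then M mod N < N <= M/2; if N > M/2, then M mod N = M - N < M/2.
for _ in range(10_000):
    m = random.randint(2, 10**9)
    n = random.randint(1, m - 1)
    assert m % n < m / 2
```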
The worst case arises when n and m are consecutive Fibonacci numbers.

gcd(Fn, Fn-1) = gcd(Fn-1, Fn-2) = ... = gcd(F1, F0) = 1, and the nth Fibonacci number is about 1.618^n, where 1.618 is the golden ratio.

So, to find gcd(n, m), the number of recursive calls will be Θ(log n).
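The gcd chain on consecutive Fibonacci numbers can be checked directly (`fib` is a hypothetical helper):

```python
import math

def fib(n):
    """n-th Fibonacci number, with fib(1) = fib(2) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# gcd(F_n, F_{n-1}) walks down the whole series and ends at 1.
for n in range(3, 20):
    assert math.gcd(fib(n), fib(n - 1)) == 1
```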
The worst case of Euclid's algorithm is when the remainders are the biggest possible at each step, i.e. for two consecutive terms of the Fibonacci sequence.

When n and m are the number of digits of a and b, assuming n >= m, the algorithm uses O(m) divisions.

Note that complexities are always given in terms of the sizes of the inputs, in this case the number of digits.
Gabriel Lamé's theorem bounds the number of steps by log(1/sqrt(5) * (a + 1/2)) - 2, where the base of the log is (1+sqrt(5))/2. This is the worst-case scenario for the algorithm, and it occurs when the inputs are consecutive Fibonacci numbers.

A slightly more liberal bound, log a where the base of the log is sqrt(2), is implied by Koblitz.

For cryptographic purposes we usually consider the bitwise complexity of the algorithms, taking into account that the bit size is given approximately by k = log a.

Here is a detailed analysis of the bitwise complexity of Euclid's algorithm. Although in most references the bitwise complexity is given as O(log a)^3, there exists a tighter bound, which is O(log a)^2.

Consider: r0 = a, r1 = b, r0 = q1*r1 + r2, ..., r(i-1) = qi*ri + r(i+1), ..., r(m-2) = q(m-1)*r(m-1) + rm, r(m-1) = qm*rm.

Observe that: a = r0 >= b = r1 > r2 > r3 > ... > r(m-1) > rm > 0 ..........(1)

and rm is the greatest common divisor of a and b.

By a claim in Koblitz's book (A Course in Number Theory and Cryptography) it can be proven that: r(i+1) < r(i-1)/2 .................(2)

Again in Koblitz, the number of bit operations required to divide a k-bit positive integer by an l-bit positive integer (assuming k >= l) is given as: (k - l + 1) * l ...................(3)

By (1) and (2) the number of divisions is O(log a), and so by (3) the total complexity is O(log a)^3.

Now this may be reduced to O(log a)^2 by a remark in Koblitz. Consider ki = log ri + 1. By (1) and (2) we have k(i+1) <= ki for i = 0, 1, ..., m-1, and k(i+2) <= ki - 1 for i = 0, 1, ..., m-2. By (3) the total cost of the m divisions is bounded by: SUM (k(i-1) - ki + 1) * ki for i = 1, 2, ..., m. Rearranging this: SUM (k(i-1) - ki + 1) * ki <= 4 * k0^2.

So the bitwise complexity of Euclid's algorithm is O(log a)^2.
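The bound on the summed division cost can be spot-checked numerically, taking k_i to be the bit length of r_i via Python's `int.bit_length()` (the helper name is an assumption):

```python
def remainder_bitlengths(a, b):
    """Bit lengths k_i of the remainder sequence r0 = a, r1 = b, ..., rm."""
    ks = [a.bit_length(), b.bit_length()]
    while b:
        a, b = b, a % b
        if b:
            ks.append(b.bit_length())
    return ks

# Check the claimed bound SUM (k_{i-1} - k_i + 1) * k_i <= 4 * k0^2
# on a worst-case (Fibonacci) input.
a, b = 121393, 75025
ks = remainder_bitlengths(a, b)
cost = sum((ks[i - 1] - ks[i] + 1) * ks[i] for i in range(1, len(ks)))
assert cost <= 4 * ks[0] ** 2
```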
For the iterative algorithm, however, we have:

With Fibonacci pairs, there is no difference between iterativeEGCD() and iterativeEGCDForWorstCase(), where the latter looks like the following:

Yes, with Fibonacci pairs, n = a % n and n = a - n are exactly the same thing.

We also know that, in an earlier answer to the same question, there is a prevailing decreasing factor: factor = m / (n % m).

Therefore, to shape the iterative version of the Euclidean GCD in a defined form, we may depict it as a "simulator" like this:

Based on the work (last slide) of Dr. Jauhar Ali, the loop above is logarithmic.

Yes, small Oh, because the simulator tells the number of iterations at most. Non-Fibonacci pairs take fewer iterations than Fibonacci pairs when probed on Euclidean GCD.
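The answer's original snippets aren't preserved in this copy. Here is a minimal Python sketch of what the iterativeEGCD()/worst-case pair could look like; the function names follow the answer, the bodies are assumptions:

```python
def iterative_egcd(n, m):
    """Iterative Euclidean GCD, in the spirit of the answer's iterativeEGCD()."""
    while m:
        n, m = m, n % m
    return n

def iterative_egcd_worst_case(n, m):
    """Subtraction variant. For consecutive Fibonacci inputs n > m > 0 we have
    n < 2m (down to the final pair), so n % m equals n - m and this reproduces
    the worst-case behaviour. Not valid for arbitrary inputs."""
    while m:
        n, m = m, n - m
    return n

# Fibonacci pair: both versions agree on the result.
assert iterative_egcd(121393, 75025) == iterative_egcd_worst_case(121393, 75025) == 1
```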
At every step, there are two cases:

- b >= a / 2: then a, b = b, a % b will make b at most half of its previous value
- b < a / 2: then a, b = b, a % b will make a at most half of its previous value, since b is less than a / 2

So at every step, the algorithm reduces at least one of the numbers to at most half of its value.

In at most O(log a) + O(log b) steps, this will be reduced to the simple cases, which yields an O(log n) algorithm, where n is the upper limit of a and b.

I found it here.
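Both cases boil down to the fact that a % b < a / 2 whenever 1 <= b <= a, which a quick randomized check confirms:

```python
import random

# If b <= a/2, then a % b < b <= a/2;
# if b >  a/2, then a % b = a - b < a/2.
for _ in range(10_000):
    a = random.randint(1, 10**6)
    b = random.randint(1, a)
    assert a % b < a / 2
```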