如何优化我的 C/x86 代码?
int lcm_old(int a, int b) {
int n;
for(n=1;;n++)
if(n%a == 0 && n%b == 0)
return n;
}
int lcm(int a,int b) {
int n = 0;
__asm {
lstart:
inc n;
mov eax, n;
mov edx, 0;
idiv a;
mov eax, 0;
cmp eax, edx;
jne lstart;
mov eax, n;
mov edx, 0;
idiv b;
mov eax, 0;
cmp eax, edx;
jnz lstart;
}
return n;
}
我试图用我自己的函数(底部)击败/匹配顶部函数的代码。您对如何优化我的日常工作有什么想法吗?
PS。这只是为了好玩。
int lcm_old(int a, int b) {
int n;
for(n=1;;n++)
if(n%a == 0 && n%b == 0)
return n;
}
int lcm(int a,int b) {
int n = 0;
__asm {
lstart:
inc n;
mov eax, n;
mov edx, 0;
idiv a;
mov eax, 0;
cmp eax, edx;
jne lstart;
mov eax, n;
mov edx, 0;
idiv b;
mov eax, 0;
cmp eax, edx;
jnz lstart;
}
return n;
}
I'm trying to beat/match the code for the top function with my own function (bottom). Have you got any ideas how I can optimize my routine?
PS. This is just for fun.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
我会使用不同的算法进行优化。像你一样线性搜索是非常慢的。事实上,两个自然数的最小公倍数是它们的乘积除以它们的最大公约数的商。您可以使用欧几里得算法快速计算最大公约数。
因此:
您需要在其中实现
gcd(int, int)
。由于欧几里得算法的平均步骤数为O(log n)
,我们轻松击败了简单的线性搜索。对于这个问题还有其他方法。如果您有一种可以快速分解整数的算法(例如量子计算机),那么您也可以像这样解决这个问题。如果将
a
和b
分别写入其规范素数分解,则
a
和b
的最小公倍数是通过对a
和b
至少其中一个分解中出现的每个质因数取它在a
分解中出现的最大指数来获得的代码>a或b
。例如:所有这些都是为了
指出,朴素的方法因其简单性而令人钦佩,但您可以花一整天的时间来优化它们的每一纳秒,但仍然无法击败高级算法。
I would optimize by using a different algorithm. Searching linearly like you are doing is super-slow. It's a fact that the least common mulitple of two natural numbers is the quotient of their product divided by their greatest common divisor. You can compute the greatest common divisor quickly using the Euclidean algorithm.
Thus:
where you need to implement
gcd(int, int)
. As the average number of steps in the Euclidean algorithm isO(log n)
, we beat the naive linear search hands down.There are other approaches to this problem. If you had an algorithm that could quickly factor integers (say a quantum computer) then you can also solve this problem like so. If you write each of
a
andb
into its canonical prime factorizationthen the least common multiple of
a
andb
is the obtained by taking for each prime factor appearing in at least one of the factorizations ofa
andb
, taking it with the maximum exponent that it appears in the factorization ofa
orb
. For example:so that
All of this is to point out that naive approaches are admirable in their simplicity but you can spend all day optimizing every last nanosecond out of them and still not beat a superior algorithm.
我假设您想保留相同的算法。这至少应该是一个稍微更有效的实现。主要区别在于循环中的代码仅使用寄存器,而不使用内存。
然而,正如 Jason 指出的那样,这确实不是一个非常有效的算法——乘法、求 GCD 和除法通常会更快(除非
a
和b
相当小)。编辑:还有另一种算法几乎更容易理解,它也应该快得多(比原来的算法——不是乘法,然后除以 GCD)。不要生成连续的数字,直到找到一个能整除
a
和b
的数字,而是生成一个的连续倍数(最好是较大的数字),直到找到一个能被另一个整除的数字:这仍然很容易理解,但应该比原来有相当大的改进。
I'm going to assume you want to keep the same algorithm. This should at least be a slightly more efficient implementation of it. The main difference is that the code in the loop only uses registers, not memory.
As Jason pointed out, however, this really isn't a very efficient algorithm -- multiplying, finding the GCD, and dividing will normally be faster (unless
a
andb
are quite small).Edit: there is another algorithm that's almost simpler to understand, that should also be a lot faster (than the original -- not than multiplying, then dividing by GCD). Instead of generating consecutive numbers until you find one that divides both
a
andb
, generate consecutive multiples of one (preferably the larger) until you find one that divides evenly by the other:This remains dead simple to understand, but should give a considerable improvement over the original.