64 位模运算的奇怪性能行为

发布于 2024-08-18 06:33:27 字数 2057 浏览 7 评论 0原文

最后三个方法调用大约需要花费时间。比前四个时间多了一倍。

唯一的区别是它们的参数不再适合整数。但这有关系吗？该参数被声明为long，因此无论如何都应该使用long进行计算。对于数字>maxint，模运算是否使用另一种算法？

我使用的是 amd athlon64 3200+、winxp sp3 和 vs2008。

       Stopwatch sw = new Stopwatch();
       TestLong(sw, int.MaxValue - 3l);
       TestLong(sw, int.MaxValue - 2l);
       TestLong(sw, int.MaxValue - 1l);
       TestLong(sw, int.MaxValue);
       TestLong(sw, int.MaxValue + 1l);
       TestLong(sw, int.MaxValue + 2l);
       TestLong(sw, int.MaxValue + 3l);
       Console.ReadLine();

    static void TestLong(Stopwatch sw, long num)
    {
        long n = 0;
        sw.Reset();
        sw.Start();
        for (long i = 3; i < 20000000; i++)
        {
            n += num % i;
        }
        sw.Stop();
        Console.WriteLine(sw.Elapsed);            
    }

编辑： 我现在尝试使用 C 进行相同的操作，但问题不会出现在这里，所有模运算在发布和调试模式下都需要相同的时间，无论是否打开优化：

#include "stdafx.h"
#include "time.h"
#include "limits.h"

static void TestLong(long long num)
{
    long long n = 0;

    clock_t t = clock();
    for (long long i = 3; i < 20000000LL*100; i++)
    {
        n += num % i;
    }

    printf("%d - %lld\n", clock()-t, n);  
}

int main()
{
    printf("%i %i %i %i\n\n", sizeof (int), sizeof(long), sizeof(long long), sizeof(void*));

    TestLong(3);
    TestLong(10);
    TestLong(131);
    TestLong(INT_MAX - 1L);
    TestLong(UINT_MAX +1LL);
    TestLong(INT_MAX + 1LL);
    TestLong(LLONG_MAX-1LL);

    getchar();
    return 0;
}

编辑2：

感谢您的宝贵建议。我发现 .net 和 c（在调试和发布模式下）都不会使用原子 cpu 指令来计算余数，但它们会调用一个函数来计算余数。

在c程序中我可以得到它的名称“_allrem”。它还显示了该文件的完整源注释，因此我发现该算法特例使用 32 位除数而不是 .net 应用程序中的除数。

我还发现，c 程序的性能实际上只受除数的值影响，而不受除数的影响。另一项测试表明，.net 程序中余数函数的性能取决于被除数和除数。

顺便说一句：即使是 long long 值的简单加法也是通过连续的 add 和 adc 指令来计算的。因此，即使我的处理器称自己为 64 位，它实际上也不是 :(

EDIT3:

我现在在 Windows 7 x64 版本上运行 c 应用程序，使用 Visual Studio 2010 编译。有趣的是，尽管现在（我检查了汇编源代码）使用了真正的 64 位指令，但性能行为保持不变。

原文

The last three of these method calls take approx. double the time than the first four.

The only difference is that their arguments doesn't fit in integer anymore. But should this matter? The parameter is declared to be long, so it should use long for calculation anyway. Does the modulo operation use another algorithm for numbers>maxint?

I am using amd athlon64 3200+, winxp sp3 and vs2008.

       Stopwatch sw = new Stopwatch();
       TestLong(sw, int.MaxValue - 3l);
       TestLong(sw, int.MaxValue - 2l);
       TestLong(sw, int.MaxValue - 1l);
       TestLong(sw, int.MaxValue);
       TestLong(sw, int.MaxValue + 1l);
       TestLong(sw, int.MaxValue + 2l);
       TestLong(sw, int.MaxValue + 3l);
       Console.ReadLine();

    static void TestLong(Stopwatch sw, long num)
    {
        long n = 0;
        sw.Reset();
        sw.Start();
        for (long i = 3; i < 20000000; i++)
        {
            n += num % i;
        }
        sw.Stop();
        Console.WriteLine(sw.Elapsed);            
    }

EDIT:
I now tried the same with C and the issue does not occur here, all modulo operations take the same time, in release and in debug mode with and without optimizations turned on:

#include "stdafx.h"
#include "time.h"
#include "limits.h"

static void TestLong(long long num)
{
    long long n = 0;

    clock_t t = clock();
    for (long long i = 3; i < 20000000LL*100; i++)
    {
        n += num % i;
    }

    printf("%d - %lld\n", clock()-t, n);  
}

int main()
{
    printf("%i %i %i %i\n\n", sizeof (int), sizeof(long), sizeof(long long), sizeof(void*));

    TestLong(3);
    TestLong(10);
    TestLong(131);
    TestLong(INT_MAX - 1L);
    TestLong(UINT_MAX +1LL);
    TestLong(INT_MAX + 1LL);
    TestLong(LLONG_MAX-1LL);

    getchar();
    return 0;
}

EDIT2:

Thanks for the great suggestions. I found that both .net and c (in debug as well as in release mode) does't not use atomically cpu instructions to calculate the remainder but they call a function that does.

In the c program I could get the name of it which is "_allrem". It also displayed full source comments for this file so I found the information that this algorithm special cases the 32bit divisors instead of dividends which was the case in the .net application.

I also found out that the performance of the c program really is only affected by the value of the divisor but not the dividend. Another test showed that the performance of the remainder function in the .net program depends on both the dividend and divisor.

BTW: Even simple additions of long long values are calculated by a consecutive add and adc instructions. So even if my processor calls itself 64bit, it really isn't :(

EDIT3:

I now ran the c app on a windows 7 x64 edition, compiled with visual studio 2010. The funny thing is, the performance behavior stays the same, although now (I checked the assembly source) true 64 bit instructions are used.

分享到QQ

分享到微博