生成 CPU 缓存未命中时的性能

发布于 2024-11-15 23:14:42 字数 2903 浏览 3 评论 0原文

我正在尝试了解 .NET 世界中的 CPU 缓存性能。具体来说,我正在阅读 Igor Ostovsky 的有关处理器缓存效果的文章

我已经浏览了他文章中的前三个例子,并记录了与他的结果有很大不同的结果。我想我一定做错了什么,因为我的机器上的性能显示的结果与他在文章中显示的结果几乎完全相反。我没有看到缓存未命中所带来的巨大影响。

我做错了什么? (错误的代码、编译器设置等)

以下是我的机器上的性能结果:

在此处输入图像描述

在此处输入图像描述

,我机器上的处理器是 Intel Core i7-2630QM。以下是有关我的处理器缓存的信息:

在此处输入图像描述

我已在 x64 发布模式下进行编译。

下面是我的源代码:

class Program
    {

        static Stopwatch watch = new Stopwatch();

        static int[] arr = new int[64 * 1024 * 1024];

        static void Main(string[] args)
        {
            Example1();
            Example2();
            Example3();


            Console.ReadLine();
        }

        static void Example1()
        {
            Console.WriteLine("Example 1:");

            // Loop 1
            watch.Restart();
            for (int i = 0; i < arr.Length; i++) arr[i] *= 3;
            watch.Stop();
            Console.WriteLine("     Loop 1: " + watch.ElapsedMilliseconds.ToString() + " ms");

            // Loop 2
            watch.Restart();
            for (int i = 0; i < arr.Length; i += 32) arr[i] *= 3;
            watch.Stop();
            Console.WriteLine("     Loop 2: " + watch.ElapsedMilliseconds.ToString() + " ms");

            Console.WriteLine();
        }

        static void Example2()
        {

            Console.WriteLine("Example 2:");

            for (int k = 1; k <= 1024; k *= 2)
            {

                watch.Restart();
                for (int i = 0; i < arr.Length; i += k) arr[i] *= 3;
                watch.Stop();
                Console.WriteLine("     K = "+ k + ": " + watch.ElapsedMilliseconds.ToString() + " ms");

            }
            Console.WriteLine();
        }

        static void Example3()
        {   

            Console.WriteLine("Example 3:");

            for (int k = 1; k <= 1024*1024; k *= 2)
            {

                //256* 4bytes per 32 bit int * k = k Kilobytes
                arr = new int[256*k];



                int steps = 64 * 1024 * 1024; // Arbitrary number of steps
                int lengthMod = arr.Length - 1;

                watch.Restart();
                for (int i = 0; i < steps; i++)
                {
                    arr[(i * 16) & lengthMod]++; // (x & lengthMod) is equal to (x % arr.Length)
                }

                watch.Stop();
                Console.WriteLine("     Array size = " + arr.Length * 4 + " bytes: " + (int)(watch.Elapsed.TotalMilliseconds * 1000000.0 / arr.Length) + " nanoseconds per element");

            }
            Console.WriteLine();
        }

    }

I am trying to learn about CPU cache performance in the world of .NET. Specifically I am working through Igor Ostovsky's article about Processor Cache Effects.

I have gone through the first three examples in his article and have recorded results that widely differ from his. I think I must be doing something wrong because the performance on my machine is showing almost the exact opposite results of what he shows in his article. I am not seeing the large effects from cache misses that I would expect.

What am I doing wrong? (bad code, compiler setting, etc.)

Here are the performance results on my machine:

enter image description here

enter image description here

enter image description here

If it helps, the processor on my machine is an Intel Core i7-2630QM. Here is info on my processor's cache:

enter image description here

I have compiled in x64 Release mode.

Below is my source code:

class Program
    {

        static Stopwatch watch = new Stopwatch();

        static int[] arr = new int[64 * 1024 * 1024];

        static void Main(string[] args)
        {
            Example1();
            Example2();
            Example3();


            Console.ReadLine();
        }

        static void Example1()
        {
            Console.WriteLine("Example 1:");

            // Loop 1
            watch.Restart();
            for (int i = 0; i < arr.Length; i++) arr[i] *= 3;
            watch.Stop();
            Console.WriteLine("     Loop 1: " + watch.ElapsedMilliseconds.ToString() + " ms");

            // Loop 2
            watch.Restart();
            for (int i = 0; i < arr.Length; i += 32) arr[i] *= 3;
            watch.Stop();
            Console.WriteLine("     Loop 2: " + watch.ElapsedMilliseconds.ToString() + " ms");

            Console.WriteLine();
        }

        static void Example2()
        {

            Console.WriteLine("Example 2:");

            for (int k = 1; k <= 1024; k *= 2)
            {

                watch.Restart();
                for (int i = 0; i < arr.Length; i += k) arr[i] *= 3;
                watch.Stop();
                Console.WriteLine("     K = "+ k + ": " + watch.ElapsedMilliseconds.ToString() + " ms");

            }
            Console.WriteLine();
        }

        static void Example3()
        {   

            Console.WriteLine("Example 3:");

            for (int k = 1; k <= 1024*1024; k *= 2)
            {

                //256* 4bytes per 32 bit int * k = k Kilobytes
                arr = new int[256*k];



                int steps = 64 * 1024 * 1024; // Arbitrary number of steps
                int lengthMod = arr.Length - 1;

                watch.Restart();
                for (int i = 0; i < steps; i++)
                {
                    arr[(i * 16) & lengthMod]++; // (x & lengthMod) is equal to (x % arr.Length)
                }

                watch.Stop();
                Console.WriteLine("     Array size = " + arr.Length * 4 + " bytes: " + (int)(watch.Elapsed.TotalMilliseconds * 1000000.0 / arr.Length) + " nanoseconds per element");

            }
            Console.WriteLine();
        }

    }

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

堇色安年 2024-11-22 23:14:43

为什么在第二个循环中使用 i += 32 。您正在以这种方式跨过缓存线。 32*4 = 128 字节比所需的 64 字节大得多。

Why are you using i += 32 in the second loop. You are stepping over cache lines in this way. 32*4 = 128bytes way bigger then 64bytes needed.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文