Does this code fill up the CPU cache?

Published 2024-09-10 13:36:44


I have two ways to program the same functionality.

Method 1:

void doTheWork(int action)
{
    for(int i = 0; i < 1000000000; ++i)
    {
        doAction(action);
    }
}

Method 2:

void doTheWork(int action)
{
    switch(action)
    {
    case 1:
        for(int i = 0; i < 1000000000; ++i)
        {
            doAction<1>();
        }
        break;
    case 2:
        for(int i = 0; i < 1000000000; ++i)
        {
            doAction<2>();
        }
        break;
    //-----------------------------------------------
    //... (there are 1000000 cases here)
    //-----------------------------------------------
    case 1000000:
        for(int i = 0; i < 1000000000; ++i)
        {
            doAction<1000000>();
        }
        break;
    }
}

Let's assume that the function doAction(int action) and the function template<int Action> doAction() consist of about 10 lines of code that will get inlined at compile time. Calling doAction(#) is equivalent in functionality to doAction<#>(), but the non-templated doAction(int value) is somewhat slower than template<int Value> doAction(), since some nice optimizations can be done in the code when the argument value is known at compile time.
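As a rough illustration of that premise (the doAction bodies here are hypothetical, not the asker's actual code): when the value arrives as a template parameter, the compiler can constant-fold it; the runtime-argument version must compute with an unknown value.

```cpp
#include <cstdint>

// Runtime version: `action` is only known at run time, so the
// compiler must emit a generic multiply/add for any value.
inline std::uint64_t doActionRuntime(int action, std::uint64_t x)
{
    return x * static_cast<std::uint64_t>(action) + action;
}

// Templated version: `Action` is a compile-time constant, so the
// same expression can be constant-folded or strength-reduced
// (e.g. a multiply by a power of two becomes a shift).
template <int Action>
inline std::uint64_t doActionTemplate(std::uint64_t x)
{
    return x * Action + Action;
}
```

Both calls compute the same result; only the generated code differs.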

So my question is: in the case of the templated function, do all the millions of lines of code fill the CPU L1 cache (and more), thus degrading performance considerably, or do only the lines of doAction<#>() inside the loop currently being run get cached?


Comments (2)

东北女汉子 2024-09-17 13:36:44


It depends on the actual code size - 10 lines of code can be little or much - and of course on the actual machine.

However, Method 2 badly violates a decades-old rule of thumb: instructions are cheap, memory access is not.

Scalability limit

Your optimizations are usually linear - you might shave off 10, 20, maybe even 30% of execution time. Hitting a cache limit is highly nonlinear - as in "running into a brick wall" nonlinear.

As soon as your code size significantly exceeds the L2/L3 cache size, Method 2 will lose big time, as the following estimate for a high-end consumer system shows:

  • DDR3-1333 with 10667MB/s peak memory bandwidth,
  • Intel Core i7 Extreme with ~75000 MIPS

gives you 10667 MB / 75000 M = 0.14 bytes per instruction as the break-even point - anything larger, and main memory can't keep up with the CPU.
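The answer's back-of-the-envelope arithmetic can be reproduced directly (the constants are the answer's own figures, not measurements):

```cpp
// Break-even estimate: peak DRAM bandwidth divided by instruction
// throughput gives the largest average instruction size that main
// memory could stream fast enough to feed the core.
constexpr double dramBandwidthMB = 10667.0;   // DDR3-1333 peak, MB/s
constexpr double throughputMips  = 75000.0;   // ~75000 MIPS (i7 Extreme)
constexpr double bytesPerInstr   = dramBandwidthMB / throughputMips;
// bytesPerInstr is about 0.14; typical 2..3-byte x86 instructions are
// roughly 15-20x larger, so code streamed from DRAM cannot keep up.
```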

Typical x86 instructions are 2..3 bytes in size and execute in 1..2 cycles (granted, these aren't necessarily the same instructions, since x86 instructions are decoded into micro-ops. Still...)
Typical x64 instruction lengths are even larger.

How much does your cache help?
I found the following numbers (different sources, so they're hard to compare):
an i7 Nehalem L2 cache (256K, >200 GB/s bandwidth) could almost keep up with x86 instruction fetch, but probably not with x64.

In addition, your L2 cache will kick in completely only if

  • you have perfect prediction of the next instructions, or you have no first-run penalty and the code fits the cache completely,
  • there's no significant amount of data being processed,
  • there's no significant other code in your "inner loop",
  • there's no other thread executing on this core.

Given that, you can lose much earlier, especially on a CPU/board with smaller caches.

︶ ̄淡然 2024-09-17 13:36:44


The L1 instruction cache will only contain instructions which were fetched recently or in anticipation of near future execution. As such, the second method cannot fill the L1 cache simply because the code is there. Your execution path will cause it to load the template instantiated version that represents the current loop being run. As you move to the next loop, it will generally invalidate the least recently used (LRU) cache line and replace it with what you are executing next.

In other words, due to the looping nature of both your methods, the L1 cache will perform admirably in both cases and won't be the bottleneck.
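A small-scale sketch of Method 2 (three hypothetical cases instead of a million) makes this concrete: the switch selects a branch once, and the hot loop then executes only that single instantiation's few lines, which is all the L1 instruction cache needs to hold.

```cpp
#include <cstdint>

// Hypothetical stand-in for the question's doAction<N>().
template <int Action>
std::uint64_t doAction(std::uint64_t acc) { return acc + Action; }

// Method 2 at toy scale: the dispatch happens once, outside the
// loop, so each hot loop is monomorphic - only one instantiation's
// code is fetched repeatedly while it runs.
std::uint64_t doTheWork(int action, int iterations)
{
    std::uint64_t acc = 0;
    switch (action)
    {
    case 1: for (int i = 0; i < iterations; ++i) acc = doAction<1>(acc); break;
    case 2: for (int i = 0; i < iterations; ++i) acc = doAction<2>(acc); break;
    case 3: for (int i = 0; i < iterations; ++i) acc = doAction<3>(acc); break;
    }
    return acc;
}
```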
