How can I programmatically get the CPU cache line size in C++?

Posted 2024-07-06 04:45:03

I'd like my program to read the cache line size of the CPU it's running on in C++.

I know that this can't be done portably, so I will need one solution for Linux and another for Windows (solutions for other systems could be useful to others, so post them if you know them).

For Linux I could read the content of /proc/cpuinfo and parse the line beginning with cache_alignment. Maybe there is a better way involving a call to an API.

For Windows I simply have no idea.

Comments (8)

千里故人稀 2024-07-13 04:45:03

On Win32, GetLogicalProcessorInformation will give you back a SYSTEM_LOGICAL_PROCESSOR_INFORMATION which contains a CACHE_DESCRIPTOR, which has the information you need.

掩于岁月 2024-07-13 04:45:03

Looks like at least SCO unix (http://uw714doc.sco.com/en/man/html.3C/sysconf.3C.html) has _SC_CACHE_LINE for sysconf. Perhaps other platforms have something similar?
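On Linux with glibc, a similar sysconf name exists: `_SC_LEVEL1_DCACHE_LINESIZE`. The following is a minimal sketch, assuming a Linux system; the function name `cache_line_size_linux` is made up for illustration, and the sysfs file is used only as a fallback when sysconf reports nothing useful.

```cpp
#include <cstddef>
#include <fstream>
#include <unistd.h>

// Returns the L1 data cache line size in bytes, or 0 if it cannot be determined.
// The function name is hypothetical; only sysconf() and the sysfs path are real.
std::size_t cache_line_size_linux()
{
#ifdef _SC_LEVEL1_DCACHE_LINESIZE
    // glibc extension; it may return 0 or -1 when the value is unknown.
    const long value = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
    if (value > 0)
    {
        return static_cast<std::size_t>(value);
    }
#endif
    // Fallback: the kernel exposes the coherency line size of each cache in sysfs.
    std::ifstream file("/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size");
    std::size_t size = 0;
    if (file >> size)
    {
        return size;
    }
    return 0;
}
```

Callers should treat a return value of 0 as "unknown" and fall back to a conservative default of their own choosing.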

浅笑轻吟梦一曲 2024-07-13 04:45:03

On Linux try the proccpuinfo library, an architecture independent C API for reading /proc/cpuinfo

不羁少年 2024-07-13 04:45:03

For x86, use the CPUID instruction. A quick Google search reveals some libraries for Win32 and C++. I have used CPUID via inline assembler as well.

Some more info:
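As a rough illustration of the CPUID route, here is a sketch that assumes GCC or Clang on x86 and uses the `__get_cpuid` helper from `<cpuid.h>` instead of inline assembler. It reads the CLFLUSH line size from leaf 1, which matches the cache line size on common x86 CPUs; the function name is hypothetical, and on other compilers or architectures the function simply returns 0.

```cpp
#include <cstddef>
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#endif

// Returns the CLFLUSH line size in bytes as reported by CPUID leaf 1,
// or 0 when not built for x86 with a GCC-compatible compiler.
std::size_t cache_line_size_cpuid()
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
    {
        // EBX bits 15:8 hold the CLFLUSH line size in units of 8 bytes.
        return static_cast<std::size_t>((ebx >> 8) & 0xFF) * 8;
    }
#endif
    return 0;
}
```

With MSVC, the `__cpuid` intrinsic from `<intrin.h>` fills the same four registers into an `int[4]` array, so the bit extraction would look the same.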

回忆追雨的时光 2024-07-13 04:45:03

Here is sample code for those who wonder how to use the function from the accepted answer:

#include <new>
#include <iostream>
#include <Windows.h>


void ShowCacheSize()
{
    using CPUInfo = SYSTEM_LOGICAL_PROCESSOR_INFORMATION;
    DWORD len = 0; // buffer length in bytes
    CPUInfo* buffer = nullptr;

    // The first call fails with ERROR_INSUFFICIENT_BUFFER and stores the required length (in bytes) in len
    if ((GetLogicalProcessorInformation(buffer, &len) == FALSE) && (GetLastError() == ERROR_INSUFFICIENT_BUFFER))
    {
        // len is a byte count, so convert it to an element count before allocating
        const size_t capacity = (len + sizeof(CPUInfo) - 1) / sizeof(CPUInfo);
        buffer = new (std::nothrow) CPUInfo[capacity]{ };

        if (buffer == nullptr)
        {
            std::cout << "Buffer allocation of " << len << " bytes failed" << std::endl;
        }
        else if (GetLogicalProcessorInformation(buffer, &len) != FALSE)
        {
            // The second call updates len to the number of bytes actually written
            const DWORD count = len / sizeof(CPUInfo);
            for (DWORD i = 0; i < count; ++i)
            {
                // This will be true for multiple returned caches, we need just one
                if (buffer[i].Relationship == RelationCache)
                {
                    std::cout << "Cache line size is: " << buffer[i].Cache.LineSize << " bytes" << std::endl;
                    break;
                }
            }
        }
        else
        {
            std::cout << "ERROR: " << GetLastError() << std::endl;
        }

        delete[] buffer;
    }
}

你怎么这么可爱啊 2024-07-13 04:45:03

On Windows

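#include <Windows.h>
#include <cstdio>
#include <iostream>

using std::cout; using std::endl;

int main()
{
    // Note: dwPageSize is the virtual-memory page size, not the cache line size
    SYSTEM_INFO systemInfo;
    GetSystemInfo(&systemInfo);
    cout << "Page Size Is: " << systemInfo.dwPageSize << endl;
    getchar(); // keep the console window open until a key is pressed
}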

On Linux

http://linux.die.net/man/2/getpagesize

情话难免假 2024-07-13 04:45:03

If supported by your implementation, C++17 std::hardware_destructive_interference_size would give you an upper bound (and ..._constructive_... a lower bound), taking into account stuff like hardware prefetch of pairs of lines.

But those are compile-time constants, so can't be correct on all microarchitectures for ISAs which allow different line sizes. (e.g. older x86 CPUs like Pentium III had 32-byte lines, but all later x86 CPUs have used 64-byte lines, including all x86-64. It's theoretically possible that some future microarchitecture will use 128-byte lines, but multi-threaded binaries tuned for 64-byte lines are widespread so that's perhaps unlikely for x86.)

For this reason, some implementations choose not to implement that C++ feature at all. At the time of writing, GCC does implement it and clang doesn't (Godbolt). It becomes part of the ABI when code uses it in struct layouts, so it's not something compilers can change in future to match future CPUs for the same target.

GCC defines both constructive and destructive as 64 on x86-64, neglecting the destructive interference that adjacent-line prefetch can cause, e.g. on Intel Sandybridge-family. It's not nearly as disastrous as false sharing within a cache line in a high-contention case, so you might choose to only use 64-byte alignment to separate objects that different threads will be accessing independently.
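A minimal sketch of how the constant might be used to keep per-thread data on separate lines, assuming a C++17 compiler. The names `kDestructiveSize`, `PaddedCounter` and `Counters` are made up for illustration, and the fallback value of 64 is an assumption for implementations that do not provide the constant.

```cpp
#include <cstddef>
#include <new>

// Use the standard constant when the library advertises it, otherwise
// fall back to 64 bytes (an assumption that fits current x86-64 CPUs).
#ifdef __cpp_lib_hardware_interference_size
constexpr std::size_t kDestructiveSize = std::hardware_destructive_interference_size;
#else
constexpr std::size_t kDestructiveSize = 64;
#endif

// Each counter is aligned (and therefore padded) to kDestructiveSize bytes,
// so two counters never share the same kDestructiveSize-byte block.
struct alignas(kDestructiveSize) PaddedCounter
{
    unsigned long value = 0;
};

// Two counters meant to be updated by different threads.
struct Counters
{
    PaddedCounter a;
    PaddedCounter b;
};

static_assert(alignof(PaddedCounter) == kDestructiveSize,
              "each counter starts on its own boundary");
static_assert(sizeof(Counters) >= 2 * kDestructiveSize,
              "the two counters occupy separate blocks");
```

Because the value is fixed at compile time, this only separates the objects by the size the compiler assumed, not by whatever the CPU running the binary actually uses.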

傲影 2024-07-13 04:45:03

I think you need NtQuerySystemInformation from ntdll.dll.
