Cache line alignment (need clarification on an article)

Posted 2024-08-12 04:49:01 · 785 characters · 7 views · 0 comments

I've recently encountered what I think is a false-sharing problem in my application, and I've looked up Sutter's article on how to align my data to cache lines. He suggests the following C++ code:

// C++ (using C++0x alignment syntax)
template<typename T>
struct cache_line_storage {
   [[ align(CACHE_LINE_SIZE) ]] T data;
   char pad[ CACHE_LINE_SIZE > sizeof(T)
        ? CACHE_LINE_SIZE - sizeof(T)
        : 1 ];
};

I can see how this would work when CACHE_LINE_SIZE > sizeof(T) is true: the struct cache_line_storage just ends up taking up one full cache line of memory. However, when sizeof(T) is larger than a single cache line, I would think that we should pad the data by CACHE_LINE_SIZE - sizeof(T) % CACHE_LINE_SIZE bytes, so that the resulting struct has a size that is an integral multiple of the cache line size. What is wrong with my understanding? Why does padding with 1 byte suffice?

Comments (3)

一抹苦笑 2024-08-19 04:49:01

You can't have arrays of size 0, so 1 is required to make it compile. However, the current draft version of the spec says that such padding is unnecessary; the compiler must pad up to the struct's alignment.

Note also that this code is ill-formed if CACHE_LINE_SIZE is smaller than alignof(T). To fix this, you should probably use [[align(CACHE_LINE_SIZE), align(T)]], which will ensure that a smaller alignment is never picked.
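As a rough sketch of what this answer describes, here is the same idea written with `alignas`, the spelling that C++11 eventually standardized in place of the draft `[[ align(...) ]]` attribute. The value 64 for `CACHE_LINE_SIZE` and the type `Wide` are assumptions made up for illustration; they do not come from Sutter's article.

```cpp
#include <cstddef>

// Assumed value for illustration only; real cache-line sizes vary by CPU.
constexpr std::size_t CACHE_LINE_SIZE = 64;

template <typename T>
struct cache_line_storage {
    // Two alignment specifiers on one member: the strictest one wins, so the
    // request is never weaker than T's natural alignment.
    alignas(CACHE_LINE_SIZE) alignas(T) T data;
    // No pad member: sizeof(struct) is always a multiple of alignof(struct),
    // so the compiler adds the tail padding itself.
};

// Hypothetical type whose own alignment exceeds the assumed cache line.
struct alignas(128) Wide { char bytes[128]; };

static_assert(alignof(cache_line_storage<int>) == 64, "cache-line aligned");
static_assert(sizeof(cache_line_storage<int>) == 64, "padded to one line");
static_assert(alignof(cache_line_storage<Wide>) == 128, "T's alignment wins");
```

With only `alignas(CACHE_LINE_SIZE)`, the `Wide` instantiation would ask for an alignment weaker than the type's own, which is the ill-formed case mentioned above.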

骑趴 2024-08-19 04:49:01

Imagine

#define CACHE_LINE_SIZE 32
sizeof(T) == 48

Now, consider how [[ align(CACHE_LINE_SIZE) ]] works, e.g.:

[[ align(32) ]] Foo foo;

This will force sizeof(Foo) == 32n for some n; i.e., align() will pad for you, if necessary, so that in something like Foo foo[10]; each foo[i] is aligned as requested.

So, in our case, with sizeof(T) == 48, this means sizeof(cache_line_storage<T>) == 64.

So the alignment gives you the padding you were hoping for.
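The 48-byte case can be checked at compile time. This is only a sketch: it uses the standardized `alignas` instead of the draft attribute, and the names `kLine`, `Payload48` and `Aligned48` are hypothetical stand-ins for `CACHE_LINE_SIZE`, `T` and `cache_line_storage<T>`.

```cpp
#include <cstddef>

constexpr std::size_t kLine = 32;  // stands in for CACHE_LINE_SIZE == 32

// Hypothetical payload with sizeof == 48.
struct Payload48 { char bytes[48]; };

// Wrapper with an aligned member and no explicit pad array.
struct Aligned48 {
    alignas(kLine) Payload48 data;
};

static_assert(sizeof(Payload48) == 48, "payload is 48 bytes");
static_assert(alignof(Aligned48) == 32, "member alignment raises struct alignment");
// 48 rounded up to the next multiple of 32:
static_assert(sizeof(Aligned48) == 64, "compiler adds 16 bytes of tail padding");
```

The rounding comes from the rule that a type's size is a multiple of its alignment, which is what keeps every element of an array of `Aligned48` on a 32-byte boundary.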

However, there is one 'error' in the template. Consider this case:

#define CACHE_LINE_SIZE 32
sizeof(T) == 32

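Here we end up with char pad[1];, which means sizeof(cache_line_storage<T>) == 64: the one extra byte pushes the struct past 32 bytes, and alignment rounds it up to the next multiple of 32. Probably not what you want!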
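To make the arithmetic concrete, the following sketch reproduces the original template's padding expression with `alignas` and checks the resulting sizes. `kLineSize`, `padded_storage`, `Exact32` and `Big48` are hypothetical names chosen for this illustration.

```cpp
#include <cstddef>

constexpr std::size_t kLineSize = 32;  // stands in for CACHE_LINE_SIZE == 32

// The template from the question, spelled with the standardized alignas.
template <typename T>
struct padded_storage {
    alignas(kLineSize) T data;
    char pad[kLineSize > sizeof(T) ? kLineSize - sizeof(T) : 1];
};

struct Exact32 { char bytes[32]; };  // exactly one assumed cache line
struct Big48   { char bytes[48]; };  // larger than one assumed cache line

// Small T: 4 bytes of data + 28 bytes of pad == one line.
static_assert(sizeof(padded_storage<int>) == 32, "fits one line");
// Exact fit: 32 + pad[1] == 33, rounded up to 64. A whole line is wasted.
static_assert(sizeof(padded_storage<Exact32>) == 64, "one line wasted");
// Oversized T: 48 + pad[1] == 49, rounded up to 64, same as with no pad.
static_assert(sizeof(padded_storage<Big48>) == 64, "pad[1] is harmless here");
```

So the 1-byte pad only costs extra space when sizeof(T) is already an exact multiple of the line size.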

I think the template would need to be modified somewhat:

template <typename T, int padding>
struct pad_or_not
{
   T data;
   char pad[padding];
};

// specialize the 0 case: a partial specialization with no pad member
template <typename T>
struct pad_or_not<T, 0>
{
   T data;
};

template<typename T>
struct cache_line_storage {
   // alignas is the standardized C++11 spelling of the draft [[ align(...) ]] attribute
   alignas(CACHE_LINE_SIZE) pad_or_not<T, (sizeof(T) > CACHE_LINE_SIZE ? 0 : CACHE_LINE_SIZE - sizeof(T) ) > data;
};

or something like that.

紅太極 2024-08-19 04:49:01

"You can't have arrays of size 0, so 1 is required to make it compile" - GNU C does allow arrays dimensioned as zero.
See also http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Zero-Length.html
