How do I allocate memory from the heap with the correct alignment for the InterlockedIncrement function?
This code seems to work, but have I used the InterlockedIncrement function correctly? The correct memory alignment of m_count is my primary concern. Assume we're on an x86-64 system and compiling a 64-bit application (in case that matters). By the way, for my actual purposes I can't declare m_count as a volatile long member and then call InterlockedIncrement(&m_count); it must be a pointer to data on the heap.
#include <Windows.h>
#include <malloc.h>
#include <new>  // placement new

class ThreadSafeCounter {
public:
    ThreadSafeCounter()
    {
        // Are those arguments for size and alignment correct?
        void* placement = _aligned_malloc( sizeof(long), sizeof(long) );
        m_count = new (placement) long(0);
    }
    ~ThreadSafeCounter()
    {
        _aligned_free( const_cast<long*>(m_count) );
    }
    void AddOne()
    {
        InterlockedIncrement(m_count);
    }
    long GetCount()
    {
        return *m_count;
    }
private:
    volatile long* m_count;
};
Comments (3)
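The heap allocator already aligns returned addresses to at least the native platform word size: 4 bytes for x86, 8 bytes for x64. (The MSVC CRT documents malloc() alignment as 8 bytes on 32-bit targets and 16 bytes on 64-bit targets.) With MSVC, long is 32 bits on either platform, and InterlockedIncrement only requires its operand to be aligned on a 32-bit boundary. There's no need to jump through the _aligned_malloc() hoop.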
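To make the alignment claim above easy to check, here is a minimal sketch in portable C++. The helper names is_aligned and malloc_result_is_aligned_for_long are hypothetical (they are not from the thread or from any library), and the check only inspects the pointer that plain malloc() returns; it does not call any Windows API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper (not from the thread): true when `p` is a multiple of
// `alignment` bytes. Takes const volatile void* so any object pointer converts.
inline bool is_aligned(const volatile void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Allocates a single long with plain malloc() and reports whether the
// returned address is aligned for long and sits on a 4-byte (32-bit) boundary.
inline bool malloc_result_is_aligned_for_long() {
    void* raw = std::malloc(sizeof(long));
    if (raw == nullptr) {
        return false;  // allocation failed; nothing to check
    }
    const bool ok = is_aligned(raw, alignof(long)) && is_aligned(raw, 4);
    std::free(raw);
    return ok;
}
```

If this check holds on your toolchain, a counter allocated with malloc() or a plain new long(0) should already satisfy the 32-bit boundary that InterlockedIncrement asks for.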
It's a platform architecture detail, but keep in mind that there's more to atomic operations than alignment. Platform ABIs usually make sure that the default alignment of primitive data types is such that any operation on them (including atomics) will work. malloc() should never return a misaligned pointer, not even if you ask for a single byte.
Beyond that, watch out specifically for false sharing (http://en.wikipedia.org/wiki/False_sharing): in addition to the alignment requirement (usually sizeof(long)), for performance you should also make sure that each cache line hosts only a single atomically accessed variable. That is particularly important if you plan to use or allow arrays of these counters.
Microsoft's compilers use __declspec(align(value)) to instruct the compiler to guarantee a specific structure alignment. As others mentioned, there seems to be no specific need for such a data structure / class to be heap-allocated, but I can't know whether you need pimpl for something else.
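As a rough illustration of the false-sharing point, the sketch below pads each counter out to its own cache line. It makes several assumptions that are not in the answer: a 64-byte cache line (kCacheLineSize), the standard alignas specifier in place of MSVC's __declspec(align(value)), and std::atomic<long> as a portable stand-in for a long updated through InterlockedIncrement. PaddedCounter and stride_in_bytes are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Assumption: 64-byte cache lines, which is typical for current x86-64 CPUs.
constexpr std::size_t kCacheLineSize = 64;

// Each counter is aligned to a cache line, so the compiler also pads its
// size up to a full line and array neighbours never share one.
struct alignas(kCacheLineSize) PaddedCounter {
    std::atomic<long> value{0};
};

static_assert(alignof(PaddedCounter) == kCacheLineSize,
              "counter must start on a cache-line boundary");
static_assert(sizeof(PaddedCounter) == kCacheLineSize,
              "counter must occupy exactly one cache line");

// Byte distance between an array element and its successor.
inline std::size_t stride_in_bytes(const PaddedCounter* first) {
    return static_cast<std::size_t>(
        reinterpret_cast<const char*>(first + 1) -
        reinterpret_cast<const char*>(first));
}
```

The cost is memory: an array of these uses 64 bytes per counter instead of 4 or 8, which is usually acceptable only when different threads really do hammer on neighbouring counters.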
The easiest thing to do for your use case is to use intrusive reference counting via inheritance, eliminating this need.
However, if you're desperate, just check out MSVC's implementation of shared_ptr.
That C-cast is quite nasty. However, this suggests to me that this object will most definitely have the correct alignment, utilizing type traits.
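For the intrusive reference counting suggested above, a minimal sketch might look like the following. RefCounted and Widget are hypothetical names, and std::atomic<long> is used as a portable stand-in; on Windows the same shape could presumably use a volatile long member with InterlockedIncrement and InterlockedDecrement. Because the count is an ordinary member of the object, it gets its natural alignment from the compiler and needs no separate aligned allocation.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical base class: the reference count lives inside the object.
class RefCounted {
public:
    RefCounted() : m_refs(1) {}  // the creator holds the first reference
    RefCounted(const RefCounted&) = delete;
    RefCounted& operator=(const RefCounted&) = delete;

    void AddRef() { m_refs.fetch_add(1, std::memory_order_relaxed); }

    // Drops one reference; deletes the object and returns true when the
    // last reference is released. Objects must be created with new.
    bool Release() {
        if (m_refs.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            delete this;
            return true;
        }
        return false;
    }

    long UseCount() const { return m_refs.load(std::memory_order_acquire); }

protected:
    virtual ~RefCounted() = default;  // destroyed only through Release()

private:
    std::atomic<long> m_refs;
};

// Hypothetical user type that gains reference counting via inheritance.
class Widget : public RefCounted {
public:
    int payload = 0;
};
```

A caller would create the object with new Widget(), call AddRef() for each additional owner, and have every owner call Release() exactly once.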