What's wrong with this fix for double checked locking?
I've now seen a lot of articles claiming that double checked locking, commonly used to prevent multiple threads from trying to initialize a lazily created singleton, is broken in C++. Normal double checked locking code reads like this:
class singleton {
private:
    singleton(); // private constructor so users must call instance()
    static boost::mutex _init_mutex;
public:
    static singleton & instance()
    {
        static singleton* instance;
        if(!instance)
        {
            boost::mutex::scoped_lock lock(_init_mutex);
            if(!instance)
                instance = new singleton;
        }
        return *instance;
    }
};
The problem apparently is the line assigning instance: the compiler is free to allocate the object, construct it, and then assign the pointer, OR to set the pointer to the freshly allocated memory and only then run the constructor. The latter case breaks the idiom. One thread may allocate the memory and assign the pointer but be put to sleep before it runs the singleton's constructor; a second thread will then see that instance isn't null and return it, even though the object hasn't been constructed yet.
I saw a suggestion to use a thread local boolean and check that instead of instance. Something like this:
class singleton {
private:
    singleton(); // private constructor so users must call instance()
    static boost::mutex _init_mutex;
    // Must be constructed with a no-op cleanup function, since the stored
    // value is a dummy pointer that must never be deleted.
    static boost::thread_specific_ptr<int> _sync_check;
public:
    static singleton & instance()
    {
        static singleton* instance;
        if(!_sync_check.get())
        {
            boost::mutex::scoped_lock lock(_init_mutex);
            if(!instance)
                instance = new singleton;
            // Any non-null value would work, we're really just using it as a
            // thread specific bool. thread_specific_ptr has no assignment
            // from a raw pointer, so use reset().
            _sync_check.reset(reinterpret_cast<int*>(1));
        }
        return *instance;
    }
};
This way each thread ends up checking once whether the instance has been created, and stops after that. That entails some performance hit, but it is still not nearly as bad as locking on every call. But what if we just used a local static bool?
class singleton {
private:
    singleton(); // private constructor so users must call instance()
    static boost::mutex _init_mutex;
public:
    static singleton & instance()
    {
        static bool sync_check = false;
        static singleton* instance;
        if(!sync_check)
        {
            boost::mutex::scoped_lock lock(_init_mutex);
            if(!instance)
                instance = new singleton;
            sync_check = true;
        }
        return *instance;
    }
};
Why wouldn't this work? Even if sync_check were read by one thread while it's being assigned in another, the garbage value would still be nonzero and thus true. This Dr. Dobb's article claims that you have to lock because you'll never win a battle with the compiler over reordering instructions. That makes me think this must not work for some reason, but I can't figure out why. If the requirements on sequence points are as loose as the Dr. Dobb's article makes me believe, I don't understand why any code after the lock couldn't be reordered to be before the lock, which would make C++ multithreading broken, period.
I guess I could see the compiler being allowed to reorder sync_check specifically to before the lock because it's a local variable (even though it's static, we're not returning a reference or pointer to it), but that could still be solved by making it a static member (effectively global) instead.
So will this work or won't it? Why?
Comments (3)
Your fix doesn't fix anything, since the writes to sync_check and instance can complete out of order on the CPU. As an example, imagine the first two calls to instance() happen at approximately the same time on two different CPUs. The first thread will acquire the lock, initialize the pointer, and set sync_check to true, in that order, but the processor may change the order in which those writes reach memory. On the other CPU, the second thread can then check sync_check and see that it is true while instance has not yet been written to memory. See Lockless Programming Considerations for Xbox 360 and Microsoft Windows for details.
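If a C++11 compiler is available, the ordering problem described above can be ruled out explicitly with an atomic pointer. The following is only a minimal sketch under that assumption; the names atomic_singleton, instance_ptr, init_mutex and value() are illustrative and do not come from the question, and std::mutex stands in for boost::mutex.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical class: double checked locking with C++11 atomics.
class atomic_singleton {
public:
    static atomic_singleton& instance() {
        // Acquire load: a thread that sees a non-null pointer here also
        // sees every write made before the matching release store below,
        // including the constructor's writes.
        atomic_singleton* p = instance_ptr.load(std::memory_order_acquire);
        if (!p) {
            std::lock_guard<std::mutex> lock(init_mutex);
            // The mutex already orders this load against the store below.
            p = instance_ptr.load(std::memory_order_relaxed);
            if (!p) {
                p = new atomic_singleton;
                // Release store: publishes the fully constructed object.
                instance_ptr.store(p, std::memory_order_release);
            }
        }
        return *p;
    }

    int value() const { return value_; }

private:
    atomic_singleton() : value_(42) {}
    int value_;
    static std::atomic<atomic_singleton*> instance_ptr;
    static std::mutex init_mutex;
};

std::atomic<atomic_singleton*> atomic_singleton::instance_ptr{nullptr};
std::mutex atomic_singleton::init_mutex;
```

The difference from the plain static bool is that the release store and acquire load forbid exactly the reordering in the example: the pointer cannot become visible before the object it points to.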
The thread specific sync_check solution you mention should work then (assuming you initialize your pointer to 0).
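For comparison, here is a rough sketch of the thread specific variant, assuming C++11's thread_local keyword in place of boost::thread_specific_ptr. The names tls_singleton, instance_ptr, init_mutex and checked are made up for illustration.

```cpp
#include <mutex>

// Hypothetical class: per-thread flag instead of a shared one.
class tls_singleton {
public:
    static tls_singleton& instance() {
        // One flag per thread; no other thread ever reads or writes it.
        thread_local bool checked = false;
        if (!checked) {
            // Each thread takes the lock once, so it synchronizes with
            // whichever thread constructed the object.
            std::lock_guard<std::mutex> lock(init_mutex);
            if (!instance_ptr)
                instance_ptr = new tls_singleton;
            checked = true;
        }
        return *instance_ptr;
    }

private:
    tls_singleton() {}
    static tls_singleton* instance_ptr;
    static std::mutex init_mutex;
};

tls_singleton* tls_singleton::instance_ptr = nullptr;
std::mutex tls_singleton::init_mutex;
```

This works where the shared static bool does not because the flag is never read across threads: a thread only skips the lock after it has itself passed through the mutex, and the pointer never changes after that.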
There's some great reading about this (although it's .NET/C# oriented) here: http://msdn.microsoft.com/en-us/magazine/cc163715.aspx
What it boils down to is that you need to be able to tell the CPU that it cannot reorder your reads/writes for this variable access (ever since the original Pentium, the CPU can reorder certain instructions if it thinks that the logic would be unaffected), and that it needs to ensure that the cache is consistent. Don't forget about that: we devs get to pretend that all memory is just one flat resource, but in reality each CPU core has cache, some of it unshared (L1) and some of it sometimes shared (L2). Your initialization might write to main RAM, but another core might still have the uninitialized value in cache. If you don't have any concurrency semantics, the CPU may not know that its cache is dirty.
I don't know the C++ side, but in .NET you would designate the variable as volatile in order to protect access to it (or you would use the memory read/write barrier methods in System.Threading).
As an aside, I've read that in .NET 2.0, double checked locking is guaranteed to work without "volatile" variables (for any .NET readers out there). That doesn't help you with your C++ code.
If you want to be safe, you will need to do the C++ equivalent of marking a variable as volatile in C#.
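Note that the volatile keyword in standard C++ does not carry the C# meaning: it gives no ordering guarantee between threads. Since C++11, the nearest equivalents are std::atomic or, for a singleton specifically, letting the language do the synchronization through a function-local static. A small sketch of the latter, assuming a C++11 compiler; the names local_static_singleton and the_instance are illustrative only.

```cpp
// Hypothetical class: lazy, thread-safe initialization without an
// explicit lock or flag.
class local_static_singleton {
public:
    static local_static_singleton& instance() {
        // Since C++11, initialization of a function-local static is
        // guaranteed to run exactly once, even if several threads call
        // instance() concurrently.
        static local_static_singleton the_instance;
        return the_instance;
    }

    // Prevent accidental copies of the single instance.
    local_static_singleton(const local_static_singleton&) = delete;
    local_static_singleton& operator=(const local_static_singleton&) = delete;

private:
    local_static_singleton() {}
};
```

This guarantee does not hold for pre-C++11 compilers, so it is not a drop-in answer for a code base that still relies on boost::mutex for portability reasons.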
"The latter case breaks the idiom -- two threads might end up creating the singleton."
But if I understand the code correctly, in the first example you check whether instance already exists (this check might be executed by multiple threads at the same time); if it doesn't, one thread gets to lock the mutex and create the instance. Only one thread can execute the creation at that time. All other threads are locked out and will wait.
Once the instance is created and the mutex is unlocked, the next waiting thread will lock the mutex, but it will not try to create a new instance because the second check will fail.
The next time the instance variable is checked it will be set, so no thread will try to create a new instance.
I'm not sure about the case where one thread is assigning the new instance pointer to instance while another thread checks the same variable, but I believe it will be handled correctly in this case.
Am I missing something here?
OK, I'm not sure about the reordering of operations, but in this case it would alter the logic, so I would not expect it to happen. Then again, I'm no expert on this topic.