Are mutex functions sufficient without volatile?
A coworker and I write software for a variety of platforms running on x86, x64, Itanium, PowerPC, and other 10-year-old server CPUs.
We just had a discussion about whether mutex functions such as pthread_mutex_lock() ... pthread_mutex_unlock() are sufficient by themselves, or whether the protected variable needs to be volatile.
int foo::bar()
{
    //...
    // code which may or may not access _protected.
    pthread_mutex_lock(m);
    int ret = _protected;
    pthread_mutex_unlock(m);
    return ret;
}
My concern is caching. Could the compiler place a copy of _protected on the stack or in a register, and use that stale value in the assignment? If not, what prevents that from happening? Are variations of this pattern vulnerable?
I presume that the compiler doesn't actually understand that pthread_mutex_lock() is a special function, so are we just protected by sequence points?
Thanks greatly.
Update: Alright, I can see a trend with answers explaining why volatile is bad. I respect those answers, but articles on that subject are easy to find online. What I can't find online, and the reason I'm asking this question, is how I'm protected without volatile. If the above code is correct, how is it invulnerable to caching issues?
The simplest answer is that volatile is not needed for multi-threading at all.

The long answer is that sequence points like critical sections are platform dependent, as is whatever threading solution you're using, so most of your thread safety is also platform dependent.

C++0x has a concept of threads and thread safety, but the current standard does not, and therefore volatile is sometimes misidentified as something that prevents reordering of operations and memory accesses for multi-threaded programming, when it was never intended for that and can't be reliably used that way.

The only things volatile should be used for in C++ are to allow access to memory-mapped devices, to allow uses of variables between setjmp and longjmp, and to allow uses of sig_atomic_t variables in signal handlers. The keyword itself does not make a variable atomic.

Good news: in C++0x we will have the STL construct std::atomic, which can be used to guarantee atomic operations and thread-safe constructs for variables. Until your compiler of choice supports it, you may need to turn to the Boost library or bust out some assembly code to create your own objects that provide atomic variables.

P.S. A lot of the confusion is caused by Java and .NET actually enforcing multi-threaded semantics with the keyword volatile; C++, however, follows suit with C, where this is not the case.
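A minimal sketch of the std::atomic approach mentioned above (C++11 syntax; the variable and function names are just illustrative):

#include <atomic>

std::atomic<int> counter{0};      // atomic: no data race, and visible across threads

void add_one()
{
    counter.fetch_add(1);         // atomic read-modify-write, sequentially consistent by default
}

int read_counter()
{
    return counter.load();        // atomic load; no mutex or volatile needed
}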
Your threading library should include the appropriate CPU and compiler barriers on mutex lock and unlock. For GCC, a "memory" clobber on an asm statement acts as a compiler barrier.

Actually, there are two things that protect your code from (compiler) caching:

- You are calling an external function the compiler cannot see into (pthread_mutex_*()), which means that the compiler doesn't know that that function doesn't modify your global variables, so it has to reload them.
- pthread_mutex_*() includes a compiler barrier, e.g. on glibc/x86, pthread_mutex_lock() ends up calling the macro lll_lock(), which has a "memory" clobber, forcing the compiler to reload variables (a minimal sketch of such a barrier follows below).
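For illustration, assuming GCC/Clang inline-asm syntax (the function and variable names are made up for the example):

#define compiler_barrier() asm volatile("" ::: "memory")

int g_protected;                  // stand-in for a variable shared between threads

int reload_demo()
{
    int before = g_protected;     // this value may be held in a register
    compiler_barrier();           // "memory" clobber: the compiler must assume any memory changed...
    int after = g_protected;      // ...so it reloads g_protected instead of reusing the register
    return after - before;
}

The real pthread_mutex_* implementations already contain an equivalent barrier (the lll_lock() clobber mentioned above), so you don't have to write this yourself.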
Until C++0x, this is not specified by the standard, and it is not specified in C either. So, it really depends on the compiler. In general, if the compiler does not guarantee that it will respect ordering constraints on memory accesses for functions or operations that involve multiple threads, you will not be able to write multithread-safe code with that compiler. See Hans-J. Boehm's "Threads Cannot Be Implemented as a Library".

As for what abstractions your compiler should support for thread-safe code, the Wikipedia entry on memory barriers is a pretty good starting point.

(As for why people suggested volatile: some compilers treat volatile as a memory barrier for the compiler. It's definitely not standard.)
The volatile keyword is a hint to the compiler that the variable might change outside of program logic, such as a memory-mapped hardware register that could change as part of an interrupt service routine. This prevents the compiler from assuming a cached value is always correct and would normally force a memory read to retrieve the value. This usage pre-dates threading by a couple decades or so. I've seen it used with variables manipulated by signals as well, but I'm not sure that usage was correct.
Variables guarded by mutexes are guaranteed to be correct when read or written by different threads. The threading API is required to ensure that such views of variables are consistent. This access is all part of your program logic and the volatile keyword is irrelevant here.
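A small sketch of that traditional use of volatile (the register address is made up for the example):

#include <cstdint>

// Status register that hardware or an interrupt service routine may change
// behind the program's back.
volatile std::uint32_t *const status_reg =
    reinterpret_cast<volatile std::uint32_t *>(0x40001000);

void wait_for_ready()
{
    // Without volatile, the compiler could hoist the load out of the loop
    // and spin forever on a stale register copy.
    while ((*status_reg & 0x1u) == 0) {
        // busy-wait
    }
}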
With the exception of the simplest spin-lock algorithm, mutex code is quite involved: good optimized mutex lock/unlock code contains the kind of code even excellent programmers struggle to understand. It uses special compare-and-set instructions, manages not only the unlocked/locked state but also the wait queue, and optionally uses system calls to go into a wait state (for lock) or wake up other threads (for unlock).

There is no way the average compiler can decode and "understand" all that complex code (again, with the exception of the simple spin lock), so even for a compiler not aware of what a mutex is and how it relates to synchronization, there is in practice no way it could optimize anything around such code.

That is, if the code were "inline", or available for analysis for the purpose of cross-module optimization, or if global optimization were available. In practice the compiler does not know what the lock/unlock functions do, so it does not try to optimize around them.

How are they "special"? They're opaque and treated as such; they are not special among opaque functions. There is no semantic difference from an arbitrary opaque function that can access any other object.

Could the compiler cache a copy of _protected in a register? Yes, in code that acts on objects transparently and directly, by using the variable name or pointers in a way the compiler can follow; not in code that might use arbitrary pointers to access variables indirectly. So yes, between calls to opaque functions, but not across them.

And also for variables which can only be used inside the function, by name: local variables that never have their address taken or a reference bound to them (so the compiler can follow all further uses). These can indeed be "cached" across arbitrary calls, including lock/unlock (see the sketch at the end of this answer).

What prevents that? The opacity of the functions: non-inlining, assembly code, system calls, code complexity. Everything that makes compilers bail out and think "that's complicated stuff, just emit calls to it". The default position of a compiler is always "let's execute stupidly, I don't understand what is being done anyway", not "I will optimize that / let's rewrite the algorithm, I know better". Most code is not optimized in a complex, non-local way.

Now let's assume the absolute worst from our point of view (we would like the compiler to give up), which is the absolute best from the point of view of an optimizing algorithm: then we might have a problem, as the compiler could optimize around the function call. This is fixed trivially by inserting a compiler barrier, such as an empty asm statement with a "clobber" for other accessible variables; the compiler then just assumes that anything that might be accessible to a called function is "clobbered".

You can make the variable volatile for the usual reasons you make things volatile: to be certain you can access it in the debugger, to prevent a floating-point variable from having the wrong datatype at runtime, etc.

Making it volatile would not actually fix the issue described above, as volatile is essentially a memory operation in the abstract machine with the semantics of an I/O operation, and as such is only ordered with respect to other volatile operations. Volatile is not ordered with respect to non-volatile memory side effects. That makes volatile practically useless for writing thread-safe code, even in the most specific case where volatile would a priori help, the case where no memory fence is ever needed: when programming threading primitives on a time-sharing system on a single CPU. (That may be one of the least understood aspects of either C or C++.)

So while volatile does prevent "caching", it doesn't even prevent compiler reordering of lock/unlock operations unless all shared variables are volatile.
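A small sketch of the distinction drawn above (names are illustrative): a local whose address never escapes may be kept in a register across the opaque calls, while anything the called functions might reach has to be reloaded.

#include <pthread.h>

extern pthread_mutex_t m;
extern int g_shared;              // visible to other translation units, so the opaque calls might touch it

int observe()
{
    int local = 42;               // address never taken: may live entirely in a register,
                                  // even across the calls below
    pthread_mutex_lock(&m);       // opaque call: compiler must assume g_shared may have changed
    int snapshot = g_shared;      // so this is a real load from memory, not a stale register copy
    pthread_mutex_unlock(&m);
    return snapshot + local;
}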
Locks/synchronisation primitives make sure the data is not cached in registers/CPU cache; that means the data propagates to memory. If two threads are accessing/modifying data within locks, it is guaranteed that the data is read from memory and written to memory. We don't need volatile in this use case.

But in the case where you have code with double checks, the compiler can optimise the code and remove what looks like redundant code; to prevent that, we need volatile.

Example: see the lazy-initialization singleton pattern example at https://en.m.wikipedia.org/wiki/Singleton_pattern#Lazy_initialization (a rough sketch of the pattern follows below).

Why would someone write this kind of code? Answer: there is a performance benefit in not acquiring the lock.

PS: This is my first post on Stack Overflow.
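Roughly the double-checked pattern being referred to (a sketch modeled on the idea, not copied from the linked page; in C++11 and later instance_ should be a std::atomic pointer rather than volatile for this to be correct):

#include <mutex>

class Singleton {
public:
    static Singleton &instance()
    {
        if (instance_ == nullptr) {                   // first check: fast path, no lock acquired
            std::lock_guard<std::mutex> lock(mutex_);
            if (instance_ == nullptr)                 // second check: another thread may have
                instance_ = new Singleton();          // created it while we waited for the lock
        }
        return *instance_;
    }

private:
    Singleton() = default;
    static Singleton *instance_;                      // std::atomic<Singleton *> in correct modern code
    static std::mutex mutex_;
};

Singleton *Singleton::instance_ = nullptr;
std::mutex Singleton::mutex_;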
Not if the object you're locking is volatile, e.g. if the value it represents depends on something foreign to the program (hardware state). volatile should NOT be used to denote any kind of behavior that is the result of executing the program.

If it's actually volatile, what I personally would do is lock on the value of the pointer/address instead of the underlying object, e.g.:
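(A sketch of one way to read that suggestion: associate a mutex with the object's address and have every thread lock that same mutex before touching the volatile object. The register address and names are made up:)

#include <cstdint>
#include <pthread.h>

volatile std::uint32_t *const hw_reg =
    reinterpret_cast<volatile std::uint32_t *>(0x40001000); // illustrative hardware address
pthread_mutex_t hw_reg_mutex = PTHREAD_MUTEX_INITIALIZER;   // one mutex associated with that address

std::uint32_t read_hw_reg()
{
    pthread_mutex_lock(&hw_reg_mutex);   // every thread locks the mutex tied to this address
    std::uint32_t value = *hw_reg;       // the volatile access itself still goes to the device
    pthread_mutex_unlock(&hw_reg_mutex);
    return value;
}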
Please note that it only works if ALL the code ever using the object in a thread locks the same address. So be mindful of that when using threads with some variable that is part of an API.