Real dangers of 2+ threads writing/reading a variable
What are the real dangers of simultaneous reads and writes to a single variable?
If one thread writes a variable and another reads it in a while loop, and it does no harm if the reader sees the old value while the write is in progress, what else can go wrong?
Can a simultaneous read/write crash a thread? What happens at the low level when a read and a write occur at exactly the same time?
If two threads access a variable without suitable synchronization, and at least one of those accesses is a write, then you have a data race and undefined behaviour.
How undefined behaviour manifests is entirely implementation dependent. On most modern architectures, you won't get a trap or exception or anything from the hardware, and it will read something, or store something. The thing is, it won't necessarily read or write what you expected.
For example, with two threads incrementing a variable, you can miss counts, as described in my article at devx: http://www.devx.com/cplus/Article/42725
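As a minimal sketch of how the missed counts can be avoided (this is not the code from the linked article), the counter below uses C11 atomics so that each increment is a single indivisible read-modify-write. The names counter, increment_many, count_with_two_threads and ITERATIONS are made up for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define ITERATIONS 100000

/* Shared counter; as an atomic type, each increment is indivisible. */
static atomic_long counter;

static void *increment_many(void *unused)
{
    (void)unused;
    for (int i = 0; i < ITERATIONS; i++) {
        /* With a plain long and counter++, two threads could both read the
           same old value, and one of the two increments would be lost. */
        atomic_fetch_add(&counter, 1);
    }
    return NULL;
}

/* Hypothetical helper: runs two incrementing threads, returns the final count. */
long count_with_two_threads(void)
{
    pthread_t a, b;
    int rc;

    atomic_store(&counter, 0);
    rc = pthread_create(&a, NULL, increment_many, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, increment_many, NULL);
    assert(rc == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return atomic_load(&counter);
}
```

With the atomic increment, count_with_two_threads() should always return 2 * ITERATIONS; with an unsynchronized plain counter, the program has a data race and the total may come up short.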
For a single writer and a single reader, the most common outcome will be that the reader sees a stale value, but you might also see a partially-updated value if the update requires more than one cycle, or the variable is split across cache lines. What happens then depends on what you do with it: if it's a pointer and you get a partially updated value, then it might not be a valid pointer, and won't point to what you intended anyway, and then you might get any kind of corruption or error due to dereferencing an invalid pointer value. This may include formatting your hard disk or other bad consequences if the bad pointer value just happens to point to a memory-mapped I/O register.
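One way to rule out the half-written pointer in the single-writer, single-reader case is to make the pointer itself atomic. The following is only a sketch under that assumption: struct config, published, writer_thread and read_published_value are hypothetical names, and the release/acquire pair is there so the pointed-to data is visible before the pointer is.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

struct config { int value; };

static struct config current_config;

/* The pointer is atomic, so a reader sees either NULL or the complete
   new address, never a mixture of old and new bytes. */
static struct config *_Atomic published;

static void *writer_thread(void *unused)
{
    (void)unused;
    current_config.value = 42;
    /* Release: the write to value is ordered before the pointer store. */
    atomic_store_explicit(&published, &current_config, memory_order_release);
    return NULL;
}

/* Hypothetical helper: waits until the pointer is published, then reads
   through it. */
int read_published_value(void)
{
    pthread_t writer;
    struct config *seen;
    int rc;

    atomic_store(&published, NULL);
    rc = pthread_create(&writer, NULL, writer_thread, NULL);
    assert(rc == 0);

    /* Acquire: once a non-NULL pointer is seen, value is visible too. */
    while ((seen = atomic_load_explicit(&published, memory_order_acquire)) == NULL) {
        /* busy-wait */
    }
    pthread_join(writer, NULL);
    return seen->value;
}
```

The reader can still observe the stale NULL for a while, which is the harmless "old value" case; what it cannot observe is a torn address.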
In general you get unexpected results. Wikipedia defines two distinct race conditions:
So the output will not always get messed up; it depends on the code. It's good practice to always deal with race conditions, both so the code can scale later and to prevent possible errors. Nothing is more annoying than not being able to trust your own data.
Two threads reading the same value is no problem at all.
The problem begins when one thread writes a non-atomic variable and another thread reads it. The result of the read is then undefined, since a thread may be preempted (stopped) at any time. Only operations on atomic variables are guaranteed to be non-breakable. Atomic actions are usually writes to int type variables.
If you have two threads accessing the same data, it is best practice, and usually unavoidable, to use locking (mutex, semaphore).
hth
Mario
Depends on the platform. For example, on Win32, reads and writes of aligned 32-bit values are atomic: you can't read half of a new value and half of an old value, and if you write, then anyone who comes to read gets either the full new value or the old value. That's not true for all values, or all platforms, of course.
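Portable C code can at least ask the implementation how its atomic types behave on the current platform. The sketch below uses the C11 ATOMIC_INT_LOCK_FREE and ATOMIC_LLONG_LOCK_FREE macros; note that they describe the _Atomic versions of these types, not plain variables, and the two function names are invented for this example.

```c
#include <stdatomic.h>

/* The C11 lock-free macros evaluate to:
   0 = never lock-free, 1 = sometimes lock-free, 2 = always lock-free. */

/* Hypothetical helper: lock-free level of atomic int on this platform. */
int int_lock_free_level(void)
{
    return ATOMIC_INT_LOCK_FREE;
}

/* Hypothetical helper: lock-free level of atomic long long, which on some
   32-bit targets may need a lock or a special instruction. */
int llong_lock_free_level(void)
{
    return ATOMIC_LLONG_LOCK_FREE;
}
```

A value of 2 means the type is always accessed with plain atomic instructions; a lower value means the implementation may fall back to an internal lock, so the access is still safe but not free.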
The result is undefined.
Consider this code:
The problem is that with N threads the result can be anything between 10 and N*10.
This is because all the threads might read the same value, increase it, and then write value + 1 back. But you asked whether this can crash the program or the hardware.
It depends. In most cases the wrong results are simply useless.
To solve this problem you need locking: a mutex or a semaphore.
A mutex is a lock for code. In the case above, you would lock the part of the code that updates the shared value,
whereas a semaphore is a lock for a variable.
They are basically the same thing, solving the same type of problem.
Look for these tools in your thread library.
http://en.wikipedia.org/wiki/Mutual_exclusion
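The answer's original code is not reproduced here, so the following is only a sketch of the situation it describes, assuming each of N threads adds 1 to a shared total ten times. The names shared_total, add_ten, total_with_mutex and MAX_THREADS are hypothetical. Holding a pthread mutex around the read, add and write back makes the total come out to exactly N*10.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define INCREMENTS_PER_THREAD 10
#define MAX_THREADS 64

static int shared_total;
static pthread_mutex_t total_lock = PTHREAD_MUTEX_INITIALIZER;

static void *add_ten(void *unused)
{
    (void)unused;
    for (int i = 0; i < INCREMENTS_PER_THREAD; i++) {
        pthread_mutex_lock(&total_lock);
        /* Read, add and write back happen as one critical section, so no
           other thread can slip in between the read and the write. */
        shared_total = shared_total + 1;
        pthread_mutex_unlock(&total_lock);
    }
    return NULL;
}

/* Hypothetical helper: runs n threads (0 <= n <= MAX_THREADS) and
   returns the final total. */
int total_with_mutex(int n)
{
    pthread_t threads[MAX_THREADS];

    assert(n >= 0 && n <= MAX_THREADS);
    shared_total = 0;
    for (int i = 0; i < n; i++) {
        int rc = pthread_create(&threads[i], NULL, add_ten, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < n; i++)
        pthread_join(threads[i], NULL);
    return shared_total;
}
```

Without the lock and unlock calls, this is the racy version in which several threads can read the same old value and the total can land anywhere below N*10.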
The worst that will happen depends on the implementation. There are so many completely independent implementations of pthreads, running on different systems and hardware, that I doubt anyone knows everything about all of them.
If p isn't a pointer-to-volatile, then I think a compiler for a conforming POSIX implementation is allowed to turn a loop that polls *p into a single check of *p followed by an infinite loop that doesn't bother looking at the value of *p at all. In practice it won't, so it's a question of whether you want to program to the standard, or to the undocumented observed behavior of the implementations you're using. The latter generally works for simple cases, and then you build on the code until you do something complicated enough that it unexpectedly doesn't work.
In practice, on a multi-CPU system that doesn't have coherent memory caches, it could be a very long time before that while loop ever sees a change made from a different CPU, because without memory barriers it might never update its cached view of main memory. But Intel has coherent caches, so most likely you personally won't see any delays long enough to care about. If some poor sucker ever tries to run your code on a more exotic architecture, they may end up having to fix it.
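One standard-conforming way to write such a polling loop is to make the flag a C11 atomic, so the compiler may not hoist the read out of the loop. This is a sketch, not the code from the question: stop_requested, request_stop and wait_for_stop_request are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* An atomic flag: every atomic_load below is a real read of the shared
   object, so the loop cannot be reduced to a single check. */
static atomic_bool stop_requested;

static void *request_stop(void *unused)
{
    (void)unused;
    atomic_store(&stop_requested, true);
    return NULL;
}

/* Hypothetical helper: polls the flag until another thread sets it, then
   returns the flag's final value. */
bool wait_for_stop_request(void)
{
    pthread_t setter;
    int rc;

    atomic_store(&stop_requested, false);
    rc = pthread_create(&setter, NULL, request_stop, NULL);
    assert(rc == 0);

    while (!atomic_load(&stop_requested)) {
        /* busy-wait; real code would usually block on a condition variable */
    }
    pthread_join(setter, NULL);
    return atomic_load(&stop_requested);
}
```

The default sequentially consistent load and store also supply the memory ordering that a plain variable would lack on hardware with weaker guarantees.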
Back to theory: the setup you're describing could cause a crash. Imagine a hypothetical architecture where:
- p points to a non-atomic type, like long long on a typical 32-bit architecture.
- long long on that system has trap representations, for example because it has a padding bit used as a parity check.
- a write to *p is half-complete when the read occurs.
Bang, undefined behavior: you read a trap representation. It may be that POSIX forbids certain trap representations that the C standard allows, in which case long long might not be a valid example for the type of *p, but I expect you can find a type for which trap representations are permitted.
If the variable being written and read cannot be updated or read atomically, then it is possible for the reader to pick up a corrupt "partially updated" value.
... long long variable with half of it coming from the new value and the other half coming from the old value).
There is no guarantee that you see the new value until ... (pthread_mutex_unlock() contains an implicit memory barrier).
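To make both points concrete, here is a sketch in which a long long is only ever read and written while holding a pthread mutex. The writer alternates between all bits clear and all bits set, so any torn read would show up as some other value. The names shared_value, toggle_value, count_torn_reads and WRITES are made up for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define WRITES 100000

/* A 64-bit value that may need two separate stores on a 32-bit CPU. */
static long long shared_value;
static pthread_mutex_t value_lock = PTHREAD_MUTEX_INITIALIZER;

static void *toggle_value(void *unused)
{
    (void)unused;
    for (int i = 0; i < WRITES; i++) {
        pthread_mutex_lock(&value_lock);
        shared_value = (i % 2) ? -1LL : 0LL;  /* all bits set or all clear */
        pthread_mutex_unlock(&value_lock);    /* also synchronizes memory */
    }
    return NULL;
}

/* Hypothetical helper: reads under the same mutex while the writer runs
   and counts reads that are neither 0 nor -1, i.e. torn values. */
int count_torn_reads(void)
{
    pthread_t writer;
    int torn = 0;
    int rc;

    shared_value = 0;
    rc = pthread_create(&writer, NULL, toggle_value, NULL);
    assert(rc == 0);
    for (int i = 0; i < WRITES; i++) {
        long long seen;

        pthread_mutex_lock(&value_lock);
        seen = shared_value;
        pthread_mutex_unlock(&value_lock);
        if (seen != 0LL && seen != -1LL)
            torn++;
    }
    pthread_join(writer, NULL);
    return torn;
}
```

Because every access happens inside the same lock, count_torn_reads() should return 0 on any conforming pthreads implementation, and each read sees the most recent value written before the lock was last released.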