Can I avoid using locks for a variable that rarely changes?
I've been reading Joe Duffy's book on Concurrent programming. I have kind of an academic question about lockless threading.
First: I know that lockless threading is fraught with peril (if you don't believe me, read the sections in the book about the memory model).
Nevertheless, I have a question:
Suppose I have a class with an int property on it.
The value referenced by this property will be read very frequently by multiple threads.
It is extremely rare that the value will change, and when it does it will be a single thread that changes it.
If it does change while another operation that uses it is in flight, no one is going to lose a finger (the first thing anyone using it does is copy it to a local variable).
I could use locks (or a ReaderWriterLockSlim to keep the reads concurrent).
I could mark the variable volatile (there are lots of examples where this is done).
However, even volatile can impose a performance hit.
What if I use VolatileWrite when it changes, and leave the access normal for reads? Something like this:
    public class MyClass
    {
        private int _TheProperty;
        internal int TheProperty
        {
            get { return _TheProperty; }
            set { System.Threading.Thread.VolatileWrite(ref _TheProperty, value); }
        }
    }
I don't think that I would ever try this in real life, but I'm curious about the answer (more than anything, as a checkpoint of whether I understand the memory model stuff I've been reading).
Comments (7)
Marking a variable as "volatile" has two effects.
1) Reads and writes have acquire and release semantics, so that reads and writes of other memory locations will not "move forwards and backwards in time" with respect to reads and writes of this memory location. (This is a simplification, but you take my point.)
2) The code generated by the jitter will not "cache" a value that seems to logically be unchanging.
Whether the former point is relevant in your scenario, I don't know; you've only described one memory location. Whether or not it is important that you have only volatile writes but not volatile reads is something that is up to you to decide.
But it seems to me that the latter point is quite relevant. If you have a spin lock on a non-volatile variable:
the jitter is within its rights to generate this code as though you'd written
Whether it actually does so or not, I don't know, but it has the right to. If what you want is for the code to actually re-check the property on each go round the loop, marking it as volatile is the right way to go.
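A minimal sketch of the kind of hoisting described above. The class and field names (SpinExample, _flag, _vflag) are hypothetical and only illustrate the point; whether a given JIT actually performs this transformation is not something the sketch demonstrates.

```csharp
public class SpinExample
{
    private int _flag;            // hypothetical non-volatile field
    private volatile int _vflag;  // hypothetical volatile field

    // As written: spin until some other thread sets _flag to non-zero.
    public void SpinAsWritten()
    {
        while (_flag == 0) { }
    }

    // What the JIT is permitted (not required) to produce for the loop
    // above: the read is done once, so a later write is never observed.
    public void SpinAsPossiblyCompiled()
    {
        if (_flag == 0) { while (true) { } }
    }

    // With a volatile field, the read has to be repeated on every pass.
    public void SpinOnVolatile()
    {
        while (_vflag == 0) { }
    }

    // Sets both flags so that the spin loops can exit.
    public void Set()
    {
        _flag = 1;
        _vflag = 1;
    }
}
```

The two non-volatile methods behave identically when only one thread is involved, which is exactly why the JIT may treat them as interchangeable.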
The question is whether the reading thread will ever see the change. It's not just a matter of whether it sees it immediately.
Frankly I've given up on trying to understand volatility. I know it doesn't mean quite what I used to think it meant, but I also know that with no kind of memory barrier on the reading thread, you could be reading the same old data forever.
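One way to put a barrier on the reading side as well, sketched under the assumption that the System.Threading.Volatile class (available since .NET 4.5) is acceptable in place of the Thread.VolatileWrite call from the question. This is a variation on the question's class, not a recommendation from any of the answers.

```csharp
using System.Threading;

public class MyClass
{
    private int _theProperty;

    internal int TheProperty
    {
        // Volatile read: has acquire semantics and cannot be hoisted or cached.
        get { return Volatile.Read(ref _theProperty); }
        // Volatile write: has release semantics, pairing with the read above.
        set { Volatile.Write(ref _theProperty, value); }
    }
}
```

Callers would still copy the value into a local once per operation, as the question describes, so the cost of the volatile read is paid once per operation rather than on every use.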
The "performance hit" of volatile is because the compiler now generates code to actually check the value instead of optimizing that away. In other words, you'll have to take that performance hit regardless of what you do.
At the CPU level, yes every processor will eventually see the change to the memory address. Even without locks or memory barriers. Locks and barriers would just ensure that it all happened in a relative ordering (w.r.t other instructions) such that it appeared correct to your program.
The problem isn't cache coherency (I hope Joe Duffy's book doesn't make that mistake). The caches stay coherent; it is just that this takes time, and the processors don't bother to wait for that to happen unless you enforce it. So instead, the processor moves on to the next instruction, which may or may not end up happening before the previous one, because each memory read/write may take a different amount of time. Ironically, this is partly because of the time it takes the processors to agree on coherency: some cache lines become coherent faster than others (i.e. depending on whether the line was Modified, Exclusive, Shared, or Invalid, it takes more or less work to get into the necessary state).
So a read may appear old or from an out of date cache, but really it just happened earlier than expected (typically because of look-ahead and branch prediction). When it really was read, the cache was coherent, it has just changed since then. So the value wasn't old when you read it, but it is now when you need it. You just read it too soon. :-(
Or equivalently, it was written later than the logic of your code thought it would be written.
Or both.
Anyhow, if this was C/C++, even without locks/barriers, you would eventually get the updated values (typically within a few hundred cycles, as memory takes about that long). In C/C++ you could use volatile (the weak non-thread volatile) to ensure that the value wasn't read from a register. (Now there's a non-coherent cache: the registers!)
In C# I don't know enough about CLR to know how long a value could stay in a register, nor how to ensure you get a real re-read from memory. You've lost the 'weak' volatile.
I would suspect as long as the variable access doesn't completely get compiled away, you will eventually run out of registers (x86 doesn't have many to start with) and get your re-read.
But I see no guarantees. If you could limit your volatile read to a particular point in your code that is hit often, but not too often (e.g. the start of the next task in a while(things_to_do) loop), then that might be the best you can do.
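A sketch of limiting the volatile read to one point per task. The Worker class, its _setting field, and the multiply-by-setting "work" are all hypothetical stand-ins, and Volatile.Read is used here as an assumed way of forcing the re-read in C#.

```csharp
using System.Collections.Generic;
using System.Threading;

public class Worker
{
    private int _setting;   // hypothetical shared value, rarely written by one thread

    // Called by the single writer thread.
    public void Update(int value)
    {
        Volatile.Write(ref _setting, value);
    }

    // Processes tasks, taking one snapshot of the shared value per task.
    public List<int> Run(Queue<int> tasks)
    {
        var results = new List<int>();
        while (tasks.Count > 0)
        {
            int snapshot = Volatile.Read(ref _setting); // one volatile read per task
            int task = tasks.Dequeue();
            results.Add(task * snapshot);               // only the local copy is used from here
        }
        return results;
    }
}
```

A change made by the writer is picked up at the start of the next task at the latest, which matches the "no one loses a finger" tolerance stated in the question.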
This is the pattern I use when the 'last writer wins' pattern is applicable to the situation. I had used the volatile keyword, but after seeing this pattern in a code example from Jeffrey Richter, I started using it.
For normal things (like memory-mapped devices), the cache-coherency protocols running within and between CPUs are there to ensure that different threads sharing that memory get a consistent view of things (i.e., if I change the value of a memory location in one CPU, it will be seen by other CPUs that have the memory in their caches). In this regard volatile will help to ensure that the optimizer doesn't optimize away memory accesses (which are always going through cache anyway) by, say, reading the value cached in a register. The C# documentation seems pretty clear on this. Again, the application programmer doesn't generally have to deal with cache coherency themselves.
I highly recommend reading the freely available paper "What Every Programmer Should Know About Memory". A lot of magic goes on under the hood that mostly prevents shooting oneself in the foot.
In C#, reads and writes of the int type are atomic. Since you said that only one thread writes to it, you should never have contention as to what is the proper value, and as long as you are caching a local copy, you should never get dirty data.
You may, however, want to declare it volatile if an OS thread will be doing the update.
Also keep in mind that some operations are not atomic, and can cause problems if you have more than one writer. For example, even though the bool type won't corrupt if you have more than one writer, a statement that reads the value and then writes back a result based on it is not atomic. If two threads read at the same time, you have a race condition.
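As an illustration of a non-atomic operation on a flag, here is a sketch that uses a toggle as the example statement (an assumption; any read-then-write would do). The Toggle class and its members are hypothetical, and the atomic variant stores the flag as an int so that the int overloads of Interlocked.CompareExchange apply.

```csharp
using System.Threading;

public class Toggle
{
    private bool _plain;   // hypothetical flag, toggled without synchronization
    private int _state;    // 0 = false, 1 = true; an int so Interlocked's int overloads apply

    // Read-modify-write in two steps: two threads can both read the same
    // old value, and one of the two toggles is then lost.
    public bool ToggleUnsafe()
    {
        _plain = !_plain;
        return _plain;
    }

    // Retry loop: the write succeeds only if _state still holds the value
    // this thread read, so no toggle is lost with multiple writers.
    public bool ToggleAtomic()
    {
        while (true)
        {
            int seen = Volatile.Read(ref _state);
            int next = 1 - seen;
            if (Interlocked.CompareExchange(ref _state, next, seen) == seen)
            {
                return next == 1;
            }
        }
    }
}
```

With a single writer, as in the question, the unsafe version has no race; the retry loop only matters once more than one thread can write.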