When should the volatile keyword be used in C#?

Posted on 2024-07-06 01:06:35

Can anyone provide a good explanation of the volatile keyword in C#? Which problems does it solve, and which does it not solve? In which cases will it save me from having to use locks?

Comments (12)

雪化雨蝶 2024-07-13 01:06:35

From the documentation:

On a multiprocessor system, a volatile read operation does not guarantee to obtain the latest value written to that memory location by any processor.

This to me sounds like you are NOT guaranteed to "get the most up-to-date value" as some of the other answers say. From the specification, the only guarantees seem to be:

volatile int v;

x = 5;
print(y);
...
v = 3; // previous read/write operations on this thread cannot be moved after this (even if there are no dependencies)

...

print(v); // following read/write operations on this thread cannot be moved before this (even if there are no dependencies)
x = 5;
print(y);
...

Make of this what you will. This for example means that if thread 1 does multiple volatile writes:

// thread 1
v1 = 1
...
v2 = 2

Then thread 2 will be guaranteed to see v1 being set to 1 and THEN v2 being set to 2. But for example inter-thread ordering is not guaranteed. Example from Albahari:

// thread 1
volatile int v1 = 0;
v1 = 1;
print(v2)

// thread 2
volatile int v2 = 0;
v2 = 1;
print(v1);

You may see:

0
0

printed since reads can be moved before the writes.
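For contrast, here is a minimal sketch of the same two-thread experiment with a full fence placed between each write and the read that follows it. The names StoreLoadDemo and RunOnce are made up for illustration, and Thread.MemoryBarrier is swapped in for the volatile fields of the example above, so this is not the Albahari code itself. With a full fence in both threads, at least one thread should observe the other's write, so the 0/0 outcome should no longer be possible.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class StoreLoadDemo
{
    private static int _v1;
    private static int _v2;

    // Runs one round and returns what each thread read from the other's field.
    public static (int First, int Second) RunOnce()
    {
        _v1 = 0;
        _v2 = 0;
        int r1 = -1, r2 = -1;

        var t1 = Task.Run(() =>
        {
            Volatile.Write(ref _v1, 1);
            Thread.MemoryBarrier();          // full fence: the read below cannot move above the write
            r1 = Volatile.Read(ref _v2);
        });

        var t2 = Task.Run(() =>
        {
            Volatile.Write(ref _v2, 1);
            Thread.MemoryBarrier();          // full fence, same reasoning as in t1
            r2 = Volatile.Read(ref _v1);
        });

        Task.WaitAll(t1, t2);
        return (r1, r2);
    }
}
```

Calling StoreLoadDemo.RunOnce() in a loop may return (1, 0), (0, 1) or (1, 1), depending on timing. The half fences that volatile alone provides are what leave room for (0, 0).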

The specification also does not say anything about caching. But the docs do mention it in a note:

Volatile reads and writes ensure that a value is read or written to memory and not cached (for example, in a processor register).

Which, if true, means you could use them for synchronization between threads like this:

// main thread
v = false // at t = 0s

// worker thread 1
...
// value is written directly to memory
v = true // at t = 1s
...

// worker thread 2
...
// latest value from memory is retrieved
if (v)
{
  print("this is guaranteed to be printed after t = 1s")
}
...

Which seems to be contradicting the earlier note that you cannot assume you are getting the latest value.

IMO it seems best to avoid volatile in everyday programming, and to use it carefully only in cases where its weaker guarantees provide a performance benefit over stronger synchronization mechanisms such as locks.

穿越时光隧道 2024-07-13 01:06:35

So to sum up all this, the correct answer to the question is:
If your code is running in the 2.0 runtime or later, the volatile keyword is almost never needed and does more harm than good if used unnecessarily. I.e., don't ever use it. BUT in earlier versions of the runtime, it IS needed for proper double-checked locking on static fields, specifically static fields whose class has static class initialization code.

嘿哥们儿 2024-07-13 01:06:35

Here is an example where the volatile keyword is used effectively for its intended purpose. Let's say that we want to implement a rudimentary version of the Task<TResult> class. Our class contains only two fields, _completed and _result, and supports only two operations: setting the _result and reading it only if it's _completed. Let's see a lock-free implementation of this simple type:

public class MyTask<TResult>
{
    private volatile bool _completed;
    private TResult _result;

    public void UnsafeSetResult(TResult result)
    {
        _result = result;
        _completed = true;
    }

    public bool TryGetResult(out TResult result)
    {
        if (_completed)
        {
            result = _result;
            return true;
        }
        result = default;
        return false;
    }
}

The UnsafeSetResult method is named "unsafe", because it should be called only once during the whole lifetime of a MyTask<TResult> instance. We will deal with this limitation at the end of this answer, but for now let's assume that this rule can be enforced simply by the structure of our application. For example we may have a single dedicated thread that is responsible for creating the MyTask<TResult> objects, and calling the UnsafeSetResult on them.

The important question that we have to answer is: why does TryGetResult work correctly in a multithreaded environment? What prevents a thread that calls TryGetResult from receiving a torn TResult value? The answer lies in the _completed field being declared as volatile, and in the order in which the _completed and _result fields are assigned and read. We want to ensure that no thread will attempt to read the _result before its value has been completely stored in the field. Notice that we impose no limitation on the TResult generic parameter, so it could well be a large struct, like a decimal, an Int128, a ValueTuple<long,long,long,long> etc. If we are not careful, a thread might read a half-written _result value, with half of its bytes still uninitialized. This is called "tearing", and it's the catastrophe that we want to prevent.

We can ensure that tearing will not occur by setting _completed to true after we assign _result, and by reading _result only after we have confirmed that _completed is true. The volatile keyword on the _completed field ensures that neither the C# compiler nor the .NET Jitter will emit CPU instructions that access/modify the computer memory in a different order. In case you didn't know, the C# compiler and the .NET Jitter, as well as the CPU, are allowed to reorder the instructions of a program, provided that this reordering does not affect the program's behavior when running on a single thread.

Let's see precisely what effect the volatile has on the UnsafeSetResult method:

It inserts a memory barrier that prevents the processor from reordering memory operations as follows: If a read or write appears before this method in the code, the processor cannot move it after this method.

In other words the _result = result; cannot be moved after the _completed = true;.

Now let's see precisely what effect the volatile has on the TryGetResult method:

It inserts a memory barrier that prevents the processor from reordering memory operations as follows: If a read or write appears after this method in the code, the processor cannot move it before this method.

In other words the result = _result; cannot be moved before the if (_completed).

As you can see we need memory barriers in both methods. If we remove any one of the two memory barriers, the correctness of our program is no longer guaranteed.
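To make the two barriers visible in the code, here is a sketch of the same class with a plain (non-volatile) _completed field and explicit Volatile.Write and Volatile.Read calls at the two places where the ordering matters. The name MyTaskExplicit is hypothetical, and this is meant as an illustration of where the half fences sit rather than as a recommended replacement for the volatile field.

```csharp
using System.Threading;

public class MyTaskExplicit<TResult>
{
    private bool _completed;   // deliberately NOT declared volatile
    private TResult _result;

    public void UnsafeSetResult(TResult result)
    {
        _result = result;
        // Release-style write: the assignment above cannot be moved after it.
        Volatile.Write(ref _completed, true);
    }

    public bool TryGetResult(out TResult result)
    {
        // Acquire-style read: the read of _result below cannot be moved before it.
        if (Volatile.Read(ref _completed))
        {
            result = _result;
            return true;
        }
        result = default;
        return false;
    }
}
```

As with the original, UnsafeSetResult is still only safe when it is called once per instance.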

Finally let's see how we could implement a thread-safe version of the UnsafeSetResult. We'll need a transitional "reserved" state, beyond the false/true values of a bool field. So we'll use a volatile int _state field instead:

public class MyTask<TResult>
{
    private volatile int _state; // 0:incomplete, 1:reserved, 2:completed
    private TResult _result;

    public bool TrySetResult(TResult result)
    {
        if (Interlocked.CompareExchange(ref _state, 1, 0) == 0)
        {
            _result = result;
            _state = 2;
            return true;
        }
        return false;
    }

    public bool TryGetResult(out TResult result)
    {
        if (_state == 2)
        {
            result = _result;
            return true;
        }
        result = default;
        return false;
    }
}

The actual Task<TResult> class has a similar internal volatile int m_stateFlags; field (source code), that has one of its bits flipped atomically (CompletionReserved, source code) before assigning the internal TResult? m_result; field.

你的笑 2024-07-13 01:06:35

The compiler sometimes changes the order of statements in code to optimize it. Normally this is not a problem in a single-threaded environment, but it can be an issue in a multi-threaded one. See the following example:

 private static int _flag = 0;
 private static int _value = 0;

 var t1 = Task.Run(() =>
 {
     _value = 10; /* compiler could switch these lines */
     _flag = 5;
 });

 var t2 = Task.Run(() =>
 {
     if (_flag == 5)
     {
         Console.WriteLine("Value: {0}", _value);
     }
 });

If you run t1 and t2, you would expect either no output or "Value: 10" as the result. However, the compiler could swap the two lines inside t1. If t2 then executes, _flag could have the value 5 while _value is still 0, so the expected logic could be broken.

To fix this you can apply the volatile keyword to the field. It disables the relevant compiler optimizations, so you can enforce the correct order in your code.

private static volatile int _flag = 0;

You should use volatile only if you really need it: because it disables certain compiler optimizations, it will hurt performance. It's also not supported by all .NET languages (Visual Basic doesn't support it), so it hinders language interoperability.
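Here is a self-contained sketch of the fixed version that can actually be compiled. FlagDemo and Run are made-up names, and unlike the snippet above the reader waits for the flag instead of checking it once, so that the run always has something to observe.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class FlagDemo
{
    private static volatile int _flag;
    private static int _value;

    // Returns the value the reader observed after it saw the flag.
    public static int Run()
    {
        _flag = 0;
        _value = 0;

        var writer = Task.Run(() =>
        {
            _value = 10;   // ordinary write
            _flag = 5;     // volatile write: the write above cannot be moved after it
        });

        var reader = Task.Run(() =>
        {
            var spin = new SpinWait();
            while (_flag != 5)   // volatile read: the read of _value cannot be moved before it
            {
                spin.SpinOnce();
            }
            return _value;
        });

        Task.WaitAll(writer, reader);
        return reader.Result;
    }
}
```

With the volatile flag in place, FlagDemo.Run() should return 10 on every run.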

冰火雁神 2024-07-13 01:06:35

I found this article by Joydip Kanjilal very helpful:

When you mark an object or a variable as volatile, it becomes a candidate for volatile reads and writes. It should be noted that in C# all memory writes are volatile irrespective of whether you are writing data to a volatile or a non-volatile object. However, the ambiguity happens when you are reading data. When you are reading data that is non-volatile, the executing thread may or may not always get the latest value. If the object is volatile, the thread always gets the most up-to-date value

初相遇 2024-07-13 01:06:35

Looking at the official documentation page for the volatile keyword, you can see an example of typical usage.

public class Worker
{
    public void DoWork()
    {
        bool work = false;
        while (!_shouldStop)
        {
            work = !work; // simulate some work
        }
        Console.WriteLine("Worker thread: terminating gracefully.");
    }
    public void RequestStop()
    {
        _shouldStop = true;
    }
    
    private volatile bool _shouldStop;
}

With the volatile modifier added to the declaration of _shouldStop in place, you'll always get the same results. However, without that modifier on the _shouldStop member, the behavior is unpredictable.
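For completeness, a sketch of how such a worker might be driven from another thread. The Worker class is repeated in condensed form so that the block compiles on its own; WorkerDemo, the 50 ms delay and the 5 second timeout are my own assumptions, not part of the documentation sample.

```csharp
using System;
using System.Threading;

public class Worker
{
    private volatile bool _shouldStop;

    public void DoWork()
    {
        bool work = false;
        while (!_shouldStop)      // volatile read on every iteration
        {
            work = !work;         // simulate some work
        }
        Console.WriteLine("Worker thread: terminating gracefully.");
    }

    public void RequestStop() => _shouldStop = true;
}

public static class WorkerDemo
{
    // Starts the worker, asks it to stop, and reports whether it exited in time.
    public static bool Run()
    {
        var worker = new Worker();
        var thread = new Thread(worker.DoWork);
        thread.Start();

        Thread.Sleep(50);         // let the worker spin for a moment
        worker.RequestStop();     // volatile write from the calling thread

        return thread.Join(TimeSpan.FromSeconds(5));
    }
}
```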

So this is definitely not something downright crazy.

Cache coherence is what keeps the CPU caches consistent with each other.

Also, if the CPU employs a strong memory model (as x86 does):

As a result, reads and writes of volatile fields require no special instructions on the x86: Ordinary reads and writes (for example, using the MOV instruction) are sufficient.

Example from C# 5.0 specification (chapter 10.5.3)

using System;
using System.Threading;
class Test
{
    public static int result;   
    public static volatile bool finished;
    static void Thread2() {
        result = 143;    
        finished = true; 
    }
    static void Main() {

        finished = false;
        new Thread(new ThreadStart(Thread2)).Start();

        for (;;) {
            if (finished) {
                Console.WriteLine("result = {0}", result);
                return;
            }
        }
    }
}

produces the output: result = 143

If the field finished had not been declared volatile, then it would be permissible for the store to result to be visible to the main thread after the store to finished, and hence for the main thread to read the value 0 from the field result.

Volatile behavior is platform dependent, so you should consider case by case whether volatile actually satisfies your needs.

Even volatile cannot prevent all kinds of reordering (see "C# - The C# Memory Model in Theory and Practice, Part 2"):

Even though the write to A is volatile and the read from A_Won is also volatile, the fences are both one-directional, and in fact allow this reordering.

So I believe that if you want to know when to use volatile (versus lock versus Interlocked), you should become familiar with memory fences (full and half) and with what your synchronization actually requires. Then you can work out the answer for yourself.

懒猫 2024-07-13 01:06:35

The CLR likes to optimize instructions, so when you access a field in code it might not always access the current value of the field (it might be from the stack, etc). Marking a field as volatile ensures that the current value of the field is accessed by the instruction. This is useful when the value can be modified (in a non-locking scenario) by a concurrent thread in your program or some other code running in the operating system.

You obviously lose some optimization, but it does keep the code simpler.

白色秋天 2024-07-13 01:06:35

From MSDN:
The volatile modifier is usually used for a field that is accessed by multiple threads without using the lock statement to serialize access. Using the volatile modifier ensures that one thread retrieves the most up-to-date value written by another thread.

惯饮孤独 2024-07-13 01:06:35

Sometimes, the compiler will optimize a field and use a register to store it. If thread 1 does a write to the field and another thread accesses it, since the update was stored in a register (and not memory), the 2nd thread would get stale data.

You can think of the volatile keyword as saying to the compiler "I want you to store this value in memory". This guarantees that the 2nd thread retrieves the latest value.

反差帅 2024-07-13 01:06:35

If you are using .NET 1.1, the volatile keyword is needed when doing double-checked locking. Why? Because prior to .NET 2.0, the following scenario could cause a second thread to access a non-null, yet not fully constructed object:

  1. Thread 1 asks if a variable is null.
    //if(this.foo == null)
  2. Thread 1 determines the variable is null, so enters a lock.
    //lock(this.bar)
  3. Thread 1 asks AGAIN if the variable is null.
    //if(this.foo == null)
  4. Thread 1 still determines the variable is null, so it calls a constructor and assigns the value to the variable.
    //this.foo = new Foo();

Prior to .NET 2.0, this.foo could be assigned the new instance of Foo, before the constructor was finished running. In this case, a second thread could come in (during thread 1's call to Foo's constructor) and experience the following:

  1. Thread 2 asks if variable is null.
    //if(this.foo == null)
  2. Thread 2 determines the variable is NOT null, so tries to use it.
    //this.foo.MakeFoo()

Prior to .NET 2.0, you could declare this.foo as being volatile to get around this problem. Since .NET 2.0, you no longer need to use the volatile keyword to accomplish double checked locking.
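The steps above correspond roughly to the pattern below. This is a sketch rather than code from any particular library: Foo, FooHolder and the _gate field are hypothetical names. The field is marked volatile here, which according to this answer is only strictly required before .NET 2.0. Lazy<T> is shown next to it as an alternative that avoids hand-written double-checked locking altogether.

```csharp
using System;

public sealed class Foo
{
    public int Value { get; } = 42;
}

public sealed class FooHolder
{
    private readonly object _gate = new object();
    private volatile Foo _foo;
    private readonly Lazy<Foo> _lazyFoo = new Lazy<Foo>(() => new Foo());

    public Foo GetFoo()
    {
        if (_foo == null)               // first check, no lock taken
        {
            lock (_gate)
            {
                if (_foo == null)       // second check, under the lock
                {
                    _foo = new Foo();   // volatile write publishes the constructed object
                }
            }
        }
        return _foo;
    }

    // Same lazy initialization, delegated to the framework.
    public Foo GetFooLazily() => _lazyFoo.Value;
}
```

Both methods hand every caller the same Foo instance. The Lazy<T> version leaves the memory-model details to the framework.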

Wikipedia actually has a good article on Double Checked Locking, and briefly touches on this topic:
http://en.wikipedia.org/wiki/Double-checked_locking

深居我梦 2024-07-13 01:06:35

If you want to get slightly more technical about what the volatile keyword does, consider the following program (I'm using DevStudio 2005):

#include <iostream>
void main()
{
  int j = 0;
  for (int i = 0 ; i < 100 ; ++i)
  {
    j += i;
  }
  for (volatile int i = 0 ; i < 100 ; ++i)
  {
    j += i;
  }
  std::cout << j;
}

Using the standard optimised (release) compiler settings, the compiler creates the following assembler (IA32):

void main()
{
00401000  push        ecx  
  int j = 0;
00401001  xor         ecx,ecx 
  for (int i = 0 ; i < 100 ; ++i)
00401003  xor         eax,eax 
00401005  mov         edx,1 
0040100A  lea         ebx,[ebx] 
  {
    j += i;
00401010  add         ecx,eax 
00401012  add         eax,edx 
00401014  cmp         eax,64h 
00401017  jl          main+10h (401010h) 
  }
  for (volatile int i = 0 ; i < 100 ; ++i)
00401019  mov         dword ptr [esp],0 
00401020  mov         eax,dword ptr [esp] 
00401023  cmp         eax,64h 
00401026  jge         main+3Eh (40103Eh) 
00401028  jmp         main+30h (401030h) 
0040102A  lea         ebx,[ebx] 
  {
    j += i;
00401030  add         ecx,dword ptr [esp] 
00401033  add         dword ptr [esp],edx 
00401036  mov         eax,dword ptr [esp] 
00401039  cmp         eax,64h 
0040103C  jl          main+30h (401030h) 
  }
  std::cout << j;
0040103E  push        ecx  
0040103F  mov         ecx,dword ptr [__imp_std::cout (40203Ch)] 
00401045  call        dword ptr [__imp_std::basic_ostream<char,std::char_traits<char> >::operator<< (402038h)] 
}
0040104B  xor         eax,eax 
0040104D  pop         ecx  
0040104E  ret              

Looking at the output, the compiler has decided to use the ecx register to store the value of the j variable. For the non-volatile loop (the first) the compiler has assigned i to the eax register. Fairly straightforward. There are a couple of interesting bits though - the lea ebx,[ebx] instruction is effectively a multibyte nop instruction so that the loop jumps to a 16 byte aligned memory address. The other is the use of edx to increment the loop counter instead of using an inc eax instruction. The add reg,reg instruction has lower latency on a few IA32 cores compared to the inc reg instruction, but never has higher latency.

Now for the loop with the volatile loop counter. The counter is stored at [esp] and the volatile keyword tells the compiler the value should always be read from/written to memory and never assigned to a register. The compiler even goes so far as to not do a load/increment/store as three distinct steps (load eax, inc eax, save eax) when updating the counter value, instead the memory is directly modified in a single instruction (an add mem,reg). The way the code has been created ensures the value of the loop counter is always up-to-date within the context of a single CPU core. No operation on the data can result in corruption or data loss (hence not using the load/inc/store since the value can change during the inc thus being lost on the store). Since interrupts can only be serviced once the current instruction has completed, the data can never be corrupted, even with unaligned memory.

Once you introduce a second CPU to the system, the volatile keyword won't guard against the data being updated by another CPU at the same time. In the above example, you would need the data to be unaligned to get a potential corruption. The volatile keyword won't prevent potential corruption if the data cannot be handled atomically, for example, if the loop counter was of type long long (64 bits) then it would require two 32 bit operations to update the value, in the middle of which an interrupt can occur and change the data.

So, the volatile keyword is only good for aligned data which is less than or equal to the size of the native registers such that operations are always atomic.
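The same size restriction shows up in C# itself: the compiler rejects the volatile modifier on 64-bit fields such as long and double. A minimal sketch of one common workaround, using Interlocked for whole-value reads and writes (TickStore is a hypothetical name):

```csharp
using System.Threading;

public sealed class TickStore
{
    // 'private volatile long _ticks;' would not compile:
    // C# does not allow volatile on long or double fields.
    private long _ticks;

    // Writes all 64 bits as one atomic operation, even in a 32-bit process.
    public void Set(long value) => Interlocked.Exchange(ref _ticks, value);

    // Reads all 64 bits as one atomic operation.
    public long Get() => Interlocked.Read(ref _ticks);
}
```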

The volatile keyword was conceived to be used with IO operations where the IO would be constantly changing but had a constant address, such as a memory mapped UART device, and the compiler shouldn't keep reusing the first value read from the address.

If you're handling large data or have multiple CPUs then you'll need a higher level (OS) locking system to handle the data access properly.

迷雾森÷林ヴ 2024-07-13 01:06:35

I don't think there's a better person to answer this than Eric Lippert (emphasis in the original):

In C#, "volatile" means not only "make sure that the compiler and the
jitter do not perform any code reordering or register caching
optimizations on this variable". It also means "tell the processors to
do whatever it is they need to do to ensure that I am reading the
latest value, even if that means halting other processors and making
them synchronize main memory with their caches".

Actually, that last bit is a lie. The true semantics of volatile reads
and writes are considerably more complex than I've outlined here; in
fact they do not actually guarantee that every processor stops what it
is doing and updates caches to/from main memory. Rather, they provide
weaker guarantees about how memory accesses before and after reads and
writes may be observed to be ordered with respect to each other.
Certain operations such as creating a new thread, entering a lock, or
using one of the Interlocked family of methods introduce stronger
guarantees about observation of ordering. If you want more details,
read sections 3.10 and 10.5.3 of the C# 4.0 specification.

Frankly, I discourage you from ever making a volatile field. Volatile
fields are a sign that you are doing something downright crazy: you're
attempting to read and write the same value on two different threads
without putting a lock in place. Locks guarantee that memory read or
modified inside the lock is observed to be consistent, locks guarantee
that only one thread accesses a given chunk of memory at a time, and so
on. The number of situations in which a lock is too slow is very
small, and the probability that you are going to get the code wrong
because you don't understand the exact memory model is very large. I
don't attempt to write any low-lock code except for the most trivial
usages of Interlocked operations. I leave the usage of "volatile" to
real experts.
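To illustrate the two alternatives the quote recommends, here is a sketch of a shared counter updated once through Interlocked and once under a lock. CounterDemo and its parameters are made up for the example. A volatile int would not be enough here, because ++ is a read-modify-write sequence and volatile does not make it atomic.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class CounterDemo
{
    private static readonly object Gate = new object();
    private static int _interlockedCount;
    private static int _lockedCount;

    // Increments both counters from several tasks and returns the final totals.
    public static (int Interlocked, int Locked) Run(int workers, int perWorker)
    {
        _interlockedCount = 0;
        _lockedCount = 0;

        var tasks = new Task[workers];
        for (int w = 0; w < workers; w++)
        {
            tasks[w] = Task.Run(() =>
            {
                for (int i = 0; i < perWorker; i++)
                {
                    Interlocked.Increment(ref _interlockedCount);  // atomic read-modify-write
                    lock (Gate)
                    {
                        _lockedCount++;                            // safe because the lock serializes access
                    }
                }
            });
        }

        Task.WaitAll(tasks);
        return (_interlockedCount, _lockedCount);
    }
}
```

CounterDemo.Run(4, 10000) should return 40000 for both counters.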

For further reading see:
