What rules must the compiler follow when dealing with volatile memory locations?

Posted 2024-10-01 13:49:05

I know that when reading from a memory location that is written to by several threads or processes, the volatile keyword should be used for that location, as in the cases below. But I want to know more about what restrictions it actually places on the compiler: basically, what rules does the compiler have to follow when dealing with such a case, and is there any exceptional case where, despite simultaneous access to a memory location, the programmer can leave out the volatile keyword?

volatile SomeType * ptr = someAddress;
void someFunc(volatile const SomeType & input){
 //function body
}

5 Answers

昔梦 2024-10-08 13:49:05

What you know is false. Volatile is not used to synchronize memory access between threads, apply any kind of memory fences, or anything of the sort. Operations on volatile memory are not atomic, and they are not guaranteed to be in any particular order. volatile is one of the most misunderstood facilities in the entire language. "Volatile is almost useless for multi-threaded programming."

What volatile is used for is interfacing with memory-mapped hardware, signal handlers, and the setjmp/longjmp facility.

It can also be used in a similar way that const is used, and this is how Alexandrescu uses it in this article. But make no mistake. volatile doesn't make your code magically thread safe. Used in this specific way, it is simply a tool that can help the compiler tell you where you might have messed up. It is still up to you to fix your mistakes, and volatile plays no role in fixing those mistakes.

EDIT: I'll try to elaborate a little bit on what I just said.

Suppose you have a class that has a pointer to something that cannot change. You might naturally make the pointer const:

class MyGizmo
{ 
public:
  const Foo* foo_;
};

What does const really do for you here? It doesn't do anything to the memory. It's not like the write-protect tab on an old floppy disc. The memory itself is still writable. You just can't write to it through the foo_ pointer. So const is really just another way for the compiler to let you know when you might be messing up. If you were to write this code:

gizmo.foo_->bar_ = 42;

...the compiler won't allow it, because it's marked const. Obviously you can get around this by using const_cast to cast away the const-ness, but if you need to be convinced this is a bad idea then there is no help for you. :)

Alexandrescu's use of volatile is exactly the same. It doesn't do anything to make the memory somehow "thread safe" in any way whatsoever. What it does is it gives the compiler another way to let you know when you may have screwed up. You mark things that you have made truly "thread safe" (through the use of actual synchronization objects, like Mutexes or Semaphores) as being volatile. Then the compiler won't let you use them in a non-volatile context. It throws a compiler error you then have to think about and fix. You could again get around it by casting away the volatile-ness using const_cast, but this is just as Evil as casting away const-ness.
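
To make that concrete, here is a minimal sketch of the kind of code this enables. The Counter and LockedView names are invented for illustration and are not from Alexandrescu's article; the actual thread safety comes entirely from the mutex, and volatile is only there so the compiler complains if you touch the shared object without going through the lock.

#include <mutex>

// Invented names for this sketch.
class Counter
{
public:
    void increment() { ++value_; }   // deliberately NOT volatile-qualified
private:
    int value_ = 0;
};

// Holds the lock for its lifetime and casts away volatile, so the plain
// (non-volatile) member functions become callable again.
class LockedView
{
public:
    LockedView(volatile Counter& c, std::mutex& m)
        : lock_(m), c_(const_cast<Counter&>(c)) {}
    Counter* operator->() { return &c_; }
private:
    std::lock_guard<std::mutex> lock_;
    Counter& c_;
};

volatile Counter shared_counter;   // shared between threads, so marked volatile
std::mutex shared_mutex;

void worker()
{
    // shared_counter.increment();   // compile error: increment() isn't volatile
    LockedView(shared_counter, shared_mutex)->increment();   // OK: lock is held
}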

My advice to you is to completely abandon volatile as a tool for writing multithreaded applications (edit: at least until you really know what you're doing and why). It has some benefit, but not in the way that most people think, and if you use it incorrectly, you could write dangerously unsafe applications.

烟凡古楼 2024-10-08 13:49:05

In C++11 and later, there's no reason to use volatile as a poor-man's std::atomic with std::memory_order_relaxed. Just use std::atomic with relaxed. On compilers where volatile works the way you wanted, std::atomic with relaxed will compile to about the same asm which is equally fast. See When to use volatile with multi threading? (never)
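
As a minimal sketch of that suggestion (the stop_requested flag and the worker/request_stop functions are purely illustrative):

#include <atomic>

std::atomic<bool> stop_requested{false};   // instead of: volatile bool stop_requested;

void worker()
{
    while (!stop_requested.load(std::memory_order_relaxed)) {
        // ... do a chunk of work ...
        // The relaxed load compiles to an ordinary load on mainstream CPUs,
        // much like the volatile version would, but without the data race.
    }
}

void request_stop()
{
    stop_requested.store(true, std::memory_order_relaxed);
}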

This answer is about the separate question of what the rules for volatile actually are.


It's not as well defined as you probably want it to be. Most of the relevant standardese from C++98 is in section 1.9, "Program Execution":

The observable behavior of the abstract machine is its sequence of reads and writes to volatile data and calls to library I/O functions.

Accessing an object designated by a volatile lvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression might produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place.

Once the execution of a function begins, no expressions from the calling function are evaluated until execution of the called function has completed.

When the processing of the abstract machine is interrupted by receipt of a signal, the values of objects with type other than volatile sig_atomic_t are unspecified, and the value of any object not of volatile sig_atomic_t that is modified by the handler becomes undefined.

An instance of each object with automatic storage duration (3.7.2) is associated with each entry into its block. Such an object exists and retains its last-stored value during the execution of the block and while the block is suspended (by a call of a function or receipt of a signal).

The least requirements on a conforming implementation are:

  • At sequence points, volatile objects are stable in the sense that previous evaluations are complete and subsequent evaluations have not yet occurred.

  • At program termination, all data written into files shall be identical to one of the possible results that execution of the program according to the abstract semantics would have produced.

  • The input and output dynamics of interactive devices shall take place in such a fashion that prompting messages actually appear prior to a program waiting for input. What constitutes an interactive device is implementation-defined.

So what that boils down to is:

  • The compiler cannot optimize away reads or writes to volatile objects. For simple cases like the one casablanca mentioned, that works the way you might think. However, in cases like

      volatile int a;
      int b;
      b = a = 42;
    

    people can and do argue about whether the compiler has to generate code as if the last line had read

      a = 42; b = a;
    

    or if it can, as it normally would (in the absence of volatile), generate

      a = 42; b = 42;
    

    (C++0x may have addressed this point, I haven't read the whole thing.)

  • The compiler may not reorder operations on two different volatile objects that occur in separate statements (every semicolon is a sequence point) but it is totally allowed to rearrange accesses to non-volatile objects relative to volatile ones. This is one of the many reasons why you should not try to write your own spinlocks, and is the primary reason why John Dibling is warning you not to treat volatile as a panacea for multithreaded programming.

  • Speaking of threads, you will have noticed the complete absence of any mention of threads in the standards text. That is because C++98 has no concept of threads. (C++0x does, and may well specify their interaction with volatile, but I wouldn't be assuming anyone implements those rules yet if I were you.) Therefore, there is no guarantee that accesses to volatile objects from one thread are visible to another thread. This is the other major reason volatile is not especially useful for multithreaded programming.

  • There is no guarantee that volatile objects are accessed in one piece, or that modifications to volatile objects avoid touching other things right next to them in memory. This is not explicit in what I quoted but is implied by the stuff about volatile sig_atomic_t -- the sig_atomic_t part would be unnecessary otherwise. This makes volatile substantially less useful for access to I/O devices than it was probably intended to be, and compilers marketed for embedded programming often offer stronger guarantees, but it's not something you can count on.

  • Lots of people try to make specific accesses to objects have volatile semantics, e.g. doing

      T x;
      *(volatile T *)&x = foo();
    

    This is legit (because it says "object designated by a volatile lvalue" and not "object with a volatile type") but has to be done with great care, because remember what I said about the compiler being totally allowed to reorder non-volatile accesses relative to volatile ones? That goes even if it's the same object (as far as I know anyway).

  • If you are worried about compile-time reordering of accesses to more than one volatile value, you need to understand the sequence point rules, which are long and complicated and I'm not going to quote them here because this answer is already too long, but here's a good explanation which is only a little simplified. If you find yourself needing to worry about the differences in the sequence point rules between C and C++ you have already screwed up somewhere (for instance, as a rule of thumb, never overload &&).

    If you also need run-time ordering of the visibility of volatile stores as seen by volatile loads in other threads on ISAs other than x86, you'd need inline asm or intrinsics for barrier instructions... Or better, use std::atomic with a memory order other than relaxed, e.g. std::memory_order_acquire and std::memory_order_release. (Those orderings are still "free" on x86, but will use special load/store instructions or barriers on non-x86 with weakly ordered memory models.)

    std::atomic also has the huge advantage of being able to establish happens-before synchronization between threads, e.g. making it possible to release-store a data_ready flag so readers can acquire-load and then (if the flag is true) access a plain array. (MSVC historically gave volatile acquire and release semantics so it could do this. /volatile:ms enables this behaviour, /volatile:iso disables that extra ordering.)
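
Here is a minimal sketch of that data_ready pattern (the names, the array size, and the bare spin loop are illustrative only):

#include <atomic>
#include <thread>

int payload[2];                        // plain, non-atomic data
std::atomic<bool> data_ready{false};

void producer()
{
    payload[0] = 42;                   // ordinary stores to plain memory...
    payload[1] = 43;
    data_ready.store(true, std::memory_order_release);   // ...published here
}

void consumer()
{
    while (!data_ready.load(std::memory_order_acquire)) {
        // spin; a real program would wait or back off
    }
    // The acquire load synchronizes-with the release store, so these reads
    // are guaranteed to see the producer's writes to payload.
    int sum = payload[0] + payload[1];
    (void)sum;
}

int main()
{
    std::thread t(producer);
    consumer();
    t.join();
}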

雅心素梦 2024-10-08 13:49:05

A particular and very common optimization that is ruled out by volatile is to cache a value from memory into a register, and use the register for repeated access (because this is much faster than going back to memory every time).

Instead the compiler must fetch the value from memory every time (taking a hint from Zach, I should say that "every time" is bounded by sequence points).

Nor can a sequence of writes make use of a register and only write the final value back later on: every write must be pushed out to memory.

Why is this useful? On some architectures certain IO devices map their inputs or outputs to a memory location (i.e. a byte written to that location actually goes out on the serial line). If the compiler redirects some of those writes to a register that is only flushed occasionally then most of the bytes won't go onto the serial line. Not good. Using volatile prevents this situation.
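
A sketch of what that looks like in code; the register address below is made up, and on a real platform it would come from the device's datasheet.

#include <cstdint>

// 0x40001000 is a placeholder standing in for a UART transmit register.
volatile std::uint8_t* const uart_tx =
    reinterpret_cast<volatile std::uint8_t*>(0x40001000);

void send(const char* msg)
{
    while (*msg) {
        // Each iteration performs a real store to the device register.
        // Without volatile, the compiler could legally coalesce these
        // "dead" stores so that only the last byte reached the serial line.
        *uart_tx = static_cast<std::uint8_t>(*msg++);
    }
}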

一人独醉 2024-10-08 13:49:05

Declaring a variable as volatile means the compiler can't make the assumptions about its value that it otherwise could, which prevents it from applying various optimizations. Essentially it forces the compiler to re-read the value from memory on each access, even if the normal flow of code doesn't change the value. For example:

int *i = ...;
cout << *i; // line A
// ... (some code that doesn't use i)
cout << *i; // line B

In this case, the compiler would normally assume that since the value at i wasn't modified in between, it's okay to retain the value from line A (say in a register) and print the same value in B. However, if you mark i as volatile, you're telling the compiler that some external source could have possibly modified the value at i between line A and B, so the compiler must re-fetch the current value from memory.
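
For comparison, here is a small self-contained sketch of the volatile version of the same idea (shared_value and the notion that something external updates it are assumptions made for illustration):

#include <iostream>

// Imagine a signal handler, DMA engine, or another process updates this.
volatile int shared_value = 0;

void report()
{
    std::cout << shared_value << "\n";   // line A: an actual load from memory
    // ... code that doesn't mention shared_value ...
    std::cout << shared_value << "\n";   // line B: the compiler must load it
                                         // again rather than reuse line A's value
}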

送你一个梦 2024-10-08 13:49:05

The compiler is not allowed to optimize away reads of a volatile object in a loop, which it would otherwise normally do (e.g. hoisting a strlen() call out of a loop).

It's commonly used in embedded programming when reading a hardware register at a fixed address, where the value may change unexpectedly. (In contrast with "normal" memory, which doesn't change unless written to by the program itself...)
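
A typical shape for that kind of code, sketched with a made-up status register address and ready bit:

#include <cstdint>

constexpr std::uintptr_t STATUS_ADDR = 0x40002000;   // placeholder address
constexpr std::uint32_t  READY_BIT   = 1u << 0;      // placeholder bit

void wait_until_ready()
{
    volatile std::uint32_t* status =
        reinterpret_cast<volatile std::uint32_t*>(STATUS_ADDR);
    while ((*status & READY_BIT) == 0) {
        // Every iteration re-reads the register. Without volatile the
        // compiler could hoist the load out of the loop and spin forever
        // on a stale value.
    }
}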

That is its main purpose.

It could also be used to make sure one thread sees a change in a value written by another, but it in no way guarantees atomicity when reading from or writing to said object.
