Volatile variables and atomic operations on Visual C++ x86

Posted on 2024-10-17 08:25:06

Plain loads have acquire semantics on x86 and plain stores have release semantics, but the compiler can still reorder instructions. While fences and locked instructions (locked xchg, locked cmpxchg) prevent both the hardware and the compiler from reordering, plain loads and stores still need to be protected with compiler barriers. Visual C++ provides the _ReadWriteBarrier() function, which prevents the compiler from reordering, and C++ provides the volatile keyword for the same reason. I am writing all this down just to make sure that I have everything right. So, assuming everything above is true, is there any reason to mark as volatile the variables that are going to be used in functions protected with _ReadWriteBarrier()?

For example:

int load(int& var)
{
    _ReadWriteBarrier();
    int value = var;
    _ReadWriteBarrier();
    return value;
}

Is it safe to make that variable non-volatile? As far as I understand, it is, because the function is protected and the compiler cannot reorder anything inside it. On the other hand, Visual C++ provides special behavior for volatile variables (different from what the standard specifies): it makes volatile reads and writes atomic loads and stores. But my target is x86, and plain loads and stores are supposed to be atomic on x86 anyway, right?

Thanks in advance.

Comments (2)

卖梦商人 2024-10-24 08:25:06

The volatile keyword is available in C too. It is often used in embedded systems, especially when the value of a variable may change at any time without any action being taken by the code. Three common scenarios are reading from a memory-mapped peripheral register, global variables modified by an interrupt service routine, and global variables within a multi-threaded program.

So it is the last scenario where volatile could be considered to be similar to _ReadWriteBarrier.

_ReadWriteBarrier is not a function: it does not insert any additional instructions, and it does not prevent the CPU from rearranging reads and writes. It only prevents the compiler from rearranging them.

MemoryBarrier is to prevent CPU reordering!

A compiler typically rearranges instructions... C++ (before C++11) does not contain built-in support for multithreaded programs, so the compiler assumes the code is single-threaded when reordering it. With MSVC, use _ReadWriteBarrier in the code so that the compiler will not move reads and writes across it.
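The difference between a compiler-only barrier and a CPU barrier can be sketched roughly as follows. This is an illustration under assumptions, not the answerer's code: the helper names `compiler_barrier`, `full_barrier` and `publish` are made up here, and the non-MSVC branch uses the standard C++11 fences as an approximate analogue of the MSVC intrinsics.

```cpp
#include <atomic>
#if defined(_MSC_VER)
#include <intrin.h>   // _ReadWriteBarrier
#endif

// Compiler-only barrier: emits no instruction, but the optimizer may not
// move memory accesses across it. (Hypothetical helper name.)
inline void compiler_barrier()
{
#if defined(_MSC_VER)
    _ReadWriteBarrier();
#else
    // Assumed portable analogue: a fence that only constrains the compiler.
    std::atomic_signal_fence(std::memory_order_seq_cst);
#endif
}

// Full barrier: constrains both the compiler and the CPU.
inline void full_barrier()
{
    std::atomic_thread_fence(std::memory_order_seq_cst);
}

// Hypothetical writer: store the payload, then raise the flag. The compiler
// barrier keeps the two stores in program order in the generated code.
inline void publish(int& payload, volatile int& ready, int value)
{
    payload = value;
    compiler_barrier();
    ready = 1;
}
```

Whether `compiler_barrier()` alone is enough in `publish` depends on the target; the sketch leans on the question's premise that plain x86 stores are not reordered with other stores. On other architectures `full_barrier()` or a proper release store would likely be needed.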

Check this link for more detailed discussion on those topics
http://msdn.microsoft.com/en-us/library/ee418650(v=vs.85).aspx

Regarding your code snippet: you do not have to use _ReadWriteBarrier as a SYNC primitive, and the first call to _ReadWriteBarrier is not necessary.

When using _ReadWriteBarrier you do not have to use volatile.

You wrote "it makes volatile reads and writes atomic loads and stores". I don't think it is OK to say that; atomicity and volatility are different. Atomic operations are considered to be indivisible - ... http://www.yoda.arachsys.com/csharp/threads/volatility.shtml
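To make the distinction concrete, here is a small sketch contrasting a volatile counter with an atomic one. It assumes a C++11 compiler, and the names `volatile_counter`, `atomic_counter`, `bump_volatile` and `bump_atomic` are invented for illustration.

```cpp
#include <atomic>

// volatile: every access really goes to memory, but an increment is still
// three separate steps (load, add, store) that another thread can interleave.
volatile int volatile_counter = 0;

// std::atomic: fetch_add is one indivisible read-modify-write.
std::atomic<int> atomic_counter(0);

// Not atomic: two threads calling this concurrently can lose updates.
inline void bump_volatile()
{
    volatile_counter = volatile_counter + 1;
}

// Atomic: returns the value after the increment.
inline int bump_atomic()
{
    return atomic_counter.fetch_add(1) + 1;
}
```

Called from a single thread the two behave the same; the difference only shows up when several threads call them at once, where only `bump_atomic` is guaranteed not to lose increments.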

深海蓝天 2024-10-24 08:25:06

Note: I am not an expert on this topic, and some of my statements are "what I heard on the internet", but I think I can still clear up some misconceptions.

[edit] In general, I would rely on platform-specifics such as x86 atomic reads and lack of OOOX only in isolated, local optimizations that are guarded by an #ifdef checking the target platform, ideally accompanied by a portable solution in the #else path.
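A guarded fast path with a portable fallback might look roughly like this. It is only a sketch: `shared_int` and `load_shared` are hypothetical names, the portable branch assumes C++11 `std::atomic` is available, and the x86 branch assumes the variable is naturally aligned.

```cpp
#include <atomic>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>   // _ReadWriteBarrier

// x86/MSVC-only path: relies on aligned int loads being atomic on x86.
typedef int shared_int;

inline int load_shared(const shared_int& var)
{
    // The volatile access forces a real load from memory.
    int value = *static_cast<const volatile int*>(&var);
    // Later memory accesses may not be hoisted above the load by the compiler.
    _ReadWriteBarrier();
    return value;
}

#else

// Portable path: let the standard library pick the right instructions.
typedef std::atomic<int> shared_int;

inline int load_shared(const shared_int& var)
{
    return var.load(std::memory_order_acquire);
}

#endif
```

Callers declare variables as `shared_int` and read them only through `load_shared`, so the platform-specific assumption stays in one place and is easy to swap out.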

Things to look out for

  • atomicity of read / write operations
  • reordering due to compiler optimizations (this includes a different order seen by another thread due to simple register caching)
  • out-of-order execution in the CPU

Possible misconceptions

1. As far as I understand it is, because function is protected and no reordering could be done by compiler inside.
[edit] To clarify: the _ReadWriteBarrier provides protection against instruction reordering, however, you have to look beyond the scope of the function. _ReadWriteBarrier has been fixed in VS 2010 to do that, earlier versions may be broken (depending on the optimizations they actually do).

Optimization isn't limited to functions. There are multiple mechanisms (automatic inlining, link time code generation) that span functions and even compilation units (and can provide much more significant optimizations than small-scoped register caching).

2. Visual C++ [...] makes volatile reads and writes atomic loads and stores,
Where did you find that? MSDN says that, beyond what the standard requires, Visual C++ will put memory barriers around volatile reads and writes; it gives no guarantee of atomic reads.

[edit] Note that C#, Java, Delphi etc. have different memory models and may make different guarantees.

3. plain loads and stores are supposed to be atomic on x86 anyway, right?
No, they are not. Unaligned reads are not atomic. They happen to be atomic if they are well-aligned, a fact I'd not rely on unless it's isolated and easily exchanged. Otherwise your "simplification for x86" becomes a lockdown to that target.

[edit] Unaligned reads happen:

char * c = new char[sizeof(int)+1];
load(*(int *)c);      // allowed by standard to be unaligned
load(*(int *)(c+1));  // unaligned with most allocators

#pragma pack(push,1)
struct 
{
   char c;
   int  i;
} foo;
load(foo.i);         // caller said so
#pragma pack(pop)

This is of course all academic if you remember that the parameter must be aligned, and you control all the code. I wouldn't write such code anymore, because I've been bitten too often by the laziness of the past.
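If code does rely on aligned accesses, the assumption can at least be checked at the call site. The helper below is a minimal sketch with an invented name (`is_aligned_for`), assuming C++11 for `alignof`:

```cpp
#include <cstdint>

// Returns true if p is suitably aligned for an object of type T.
// (Hypothetical helper, not part of any library.)
template <typename T>
inline bool is_aligned_for(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

A debug-only `assert(is_aligned_for<int>(&var))` at the top of `load` would catch the packed-struct and offset-pointer cases shown above before they turn into torn reads.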

4. Plain load has acquire semantics on x86, plain store has release semantics
No. x86 processors do not use out-of-order execution (or rather, no visible OOOX - I think), but this doesn't stop the optimizer from reordering instructions.

5. _ReadBarrier / _WriteBarrier / _ReadWriteBarrier do all the magic
They don't - they just prevent reordering by the optimizer. MSDN finally made it a big bad warning for VS2010, but the information apparently applies to previous versions as well.


Now, to your question.

I assume the purpose of the snippet is to pass any variable N and load it (atomically?). The straightforward choice would be an interlocked read or (on Visual C++ 2005 and later) a volatile read.

Otherwise you'd need a barrier for both compiler and CPU before the read. In VC++ parlance this would be:

int load(int& var)
{
    // force Optimizer to complete all memory writes:
    // (Note that this had issues before VC++ 2010)
    _WriteBarrier();

    // force CPU to settle all pending read/writes, and not to start new ones:
    MemoryBarrier();

    // now, read.
    int value = var;
    return value;
}

Note that _WriteBarrier has a second warning in MSDN:
*In past versions of the Visual C++ compiler, the _ReadWriteBarrier and _WriteBarrier functions were enforced only locally and did not affect functions up the call tree. These functions are now enforced all the way up the call tree.*
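For comparison, the interlocked read mentioned earlier could be sketched like this. The function name `interlocked_read` is made up; the Windows branch uses `InterlockedCompareExchange`, and the other branch assumes a GCC or Clang compiler with the `__sync_val_compare_and_swap` builtin.

```cpp
#if defined(_WIN32)
#include <windows.h>   // InterlockedCompareExchange
#endif

// Interlocked read: compare-exchanging 0 with 0 leaves the stored value
// unchanged, but returns the current value as one locked operation that
// also acts as a full barrier.
inline long interlocked_read(volatile long& var)
{
#if defined(_WIN32)
    return InterlockedCompareExchange(&var, 0, 0);
#else
    return __sync_val_compare_and_swap(&var, 0L, 0L);  // GCC/Clang builtin
#endif
}
```

This is heavier than a plain load, since it issues a locked instruction on every read, but it does not depend on compiler-version-specific barrier behavior.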


I hope that is correct. stackoverflowers, please correct me if I'm wrong.
