Atomic instructions

Posted on 2024-08-11 23:40:44

What is meant by atomic instructions?

How can the following be made atomic?

TestAndSet

int TestAndSet(int *x){
   register int temp = *x;
   *x = 1;
   return temp;
}

From a software perspective, if one does not want to use non-blocking synchronization primitives, how can one ensure the atomicity of an instruction sequence? Is it possible only in hardware, or can some assembly-level directive or optimization be used?

Comments (5)

梨涡少年 2024-08-18 23:40:44

Some machine instructions are intrinsically atomic - for example, reads and writes of properly aligned values of the native processor word size are atomic on many architectures.

This means that hardware interrupts, other processors and hyper-threads cannot interrupt the load or store partway through, so nothing can read or write a partial value at the same location.

More complicated operations, such as a read and a write performed together atomically, can be achieved with explicit atomic machine instructions, e.g. LOCK CMPXCHG on x86.
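As a rough sketch of what that looks like from C (assuming a C11 compiler that ships <stdatomic.h>), the TestAndSet from the question can be written with atomic_exchange, which performs the read and the write as one indivisible step. The function name test_and_set is only illustrative, and which machine instruction gets emitted is up to the compiler; on x86 it is typically an XCHG.

```c
#include <stdatomic.h>

/* Atomic variant of the TestAndSet from the question: the old value is
   read and 1 is written as a single indivisible step. */
int test_and_set(atomic_int *x)
{
    return atomic_exchange(x, 1);
}
```

The first caller sees the old value 0 and "wins"; every later caller sees 1 until someone stores 0 again.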

Locking and other high-level constructs are built on these atomic primitives, which typically only guard a single processor word.
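To illustrate how a lock can sit on top of such a primitive, here is a minimal spinlock sketch built on the C11 atomic_flag type. The struct and function names (spinlock, spin_lock, spin_trylock, spin_unlock) are made up for this example, and a real lock would normally back off or sleep instead of spinning forever.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* A single flag word guards the critical section. */
struct spinlock {
    atomic_flag held;   /* set while some thread owns the lock */
};

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Returns true if the lock was free and is now owned by the caller. */
static inline bool spin_trylock(struct spinlock *l)
{
    /* test-and-set returns the previous state: false means we got it */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static inline void spin_lock(struct spinlock *l)
{
    while (!spin_trylock(l)) {
        /* busy-wait until the owner clears the flag */
    }
}

static inline void spin_unlock(struct spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

The whole lock state is one flag, which matches the point above that the primitive itself only guards a single word; everything else is protected by convention.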

Some clever concurrent algorithms can be built using just the reading and writing of pointers, e.g. in linked lists shared between a single reader and a single writer or, with effort, between multiple readers and writers.
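As one possible illustration of the single-reader / single-writer case, here is a sketch that uses a fixed-size ring buffer with two indices rather than a linked list (a swapped-in structure, chosen because it is shorter). It relies only on atomic loads and stores, no read-modify-write instructions. All names and the capacity are hypothetical, and it is only valid with exactly one producer thread and one consumer thread.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_CAPACITY 8   /* hypothetical size; one slot is always kept empty */

struct spsc_ring {
    int slots[RING_CAPACITY];
    atomic_size_t head;   /* next slot to read; written only by the consumer */
    atomic_size_t tail;   /* next slot to write; written only by the producer */
};

/* Producer side: returns false if the ring is full. */
bool ring_push(struct spsc_ring *r, int value)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t next = (tail + 1) % RING_CAPACITY;
    if (next == atomic_load_explicit(&r->head, memory_order_acquire))
        return false;
    r->slots[tail] = value;
    /* release: the slot write becomes visible before the new tail does */
    atomic_store_explicit(&r->tail, next, memory_order_release);
    return true;
}

/* Consumer side: returns false if the ring is empty. */
bool ring_pop(struct spsc_ring *r, int *out)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    if (head == atomic_load_explicit(&r->tail, memory_order_acquire))
        return false;
    *out = r->slots[head];
    /* release: the slot is handed back only after it has been read */
    atomic_store_explicit(&r->head, (head + 1) % RING_CAPACITY,
                          memory_order_release);
    return true;
}
```

Each index has exactly one writer, which is why plain atomic stores are enough here; with several producers or consumers a compare-and-swap style instruction would be needed.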

小…红帽 2024-08-18 23:40:44

Below are some of my notes on Atomicity that may help you understand the meaning. The notes are from the sources listed at the end and I recommend reading some of them if you need a more thorough explanation rather than point-form bullets as I have. Please point out any errors so that I may correct them.

Definition :

  • From the Greek meaning "not divisible into smaller parts"
  • An "atomic" operation is always observed to be done or not done, but
    never halfway done.
  • An atomic operation must be performed entirely or not performed at
    all.
  • In multi-threaded scenarios, a variable goes from unmutated to
    mutated directly, with no "halfway mutated" values

Example 1 : Atomic Operations

  • Consider the following integers used by different threads :

     int X = 2;
     int Y = 1;
     int Z = 0;
    
     Z = X;  //Thread 1
    
     X = Y;  //Thread 2
    
  • In the above example, two threads make use of X, Y, and Z

  • Each read and write is atomic
  • The threads will race :
    • If thread 1 wins, then Z = 2
    • If thread 2 wins, then Z = 1
    • Z will definitely be one of those two values

Example 2 : Non-Atomic Operations : ++/-- Operations

  • Consider the increment/decrement expressions :

    i++;  //increment
    i--;  //decrement
    
  • The operations translate to :

    1. Read i
    2. Increment/decrement the read value
    3. Write the new value back to i
  • The operations are each composed of 3 atomic operations, and are not atomic themselves
  • Two attempts to increment i on separate threads could interleave such that one of the increments is lost
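A small C sketch of the fix for the lost-update problem, assuming C11 atomics and POSIX threads are available: atomic_fetch_add performs the read, the add and the write-back as one step, so no increment can be lost. The thread count, iteration count and function names are arbitrary choices for the example.

```c
#include <pthread.h>
#include <stdatomic.h>

#define THREADS 4
#define INCREMENTS_PER_THREAD 100000

static atomic_long counter;   /* shared by all threads */

static void *worker(void *arg)
{
    (void)arg;
    for (int n = 0; n < INCREMENTS_PER_THREAD; n++)
        atomic_fetch_add(&counter, 1);   /* read, add, write as one step */
    return NULL;
}

/* Starts the workers, waits for them, and returns the final count. */
long run_counter_demo(void)
{
    pthread_t tids[THREADS];
    int started = 0;

    atomic_store(&counter, 0);
    for (int t = 0; t < THREADS; t++) {
        if (pthread_create(&tids[started], NULL, worker, NULL) == 0)
            started++;
    }
    for (int t = 0; t < started; t++)
        pthread_join(tids[t], NULL);
    return atomic_load(&counter);
}
```

If every thread starts, the result is always THREADS * INCREMENTS_PER_THREAD. Replacing atomic_fetch_add with a plain counter++ on a non-atomic long would be a data race and would usually come up short.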

Example 3 - Non-Atomic Operations : Values greater than 4-Bytes

  • Consider the following immutable struct :

    struct MyLong
    {
        public readonly int low;
        public readonly int high;

        public MyLong(int low, int high)
        {
            this.low = low;
            this.high = high;
        }
    }

  • We create fields with specific values of type MyLong :

    MyLong X = new MyLong(0xAAAA, 0xAAAA);   
    MyLong Y = new MyLong(0xBBBB, 0xBBBB);     
    MyLong Z = new MyLong(0xCCCC, 0xCCCC);
    
  • We modify our fields in separate threads without thread safety :

    X = Y; //Thread 1
    Y = Z; //Thread 2
    
  • In .NET, when copying a value type, the CLR doesn't call a constructor - it moves the bytes one atomic operation at a time

  • Because of this, the operations in the two threads are now four atomic operations
  • If there is no thread safety enforced, the data can be corrupted
  • Consider the following execution order of operations, where thread 1 runs X = Y and thread 2 runs Y = Z (values are shown as the high half followed by the low half) :

    X.low = Y.low;      //Thread 1 - X = 0xAAAABBBB
    Y.low = Z.low;      //Thread 2 - Y = 0xBBBBCCCC
    Y.high = Z.high;    //Thread 2 - Y = 0xCCCCCCCC
    X.high = Y.high;    //Thread 1 - X = 0xCCCCBBBB   <-- corrupt value for X
    
  • Reading and writing values wider than 32 bits from multiple threads on a 32-bit system, without some form of locking to make the operation atomic, can result in corrupt data as above
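One way to avoid the torn value, sketched in C under the assumption that the two halves fit in a single 64-bit word: pack them into one C11 atomic so they are always stored and loaded together. The names pair, atomic_pair, pair_store and pair_load are made up for the example. On 32-bit targets the compiler may have to fall back to a wider locked instruction or a library lock to provide this.

```c
#include <stdatomic.h>
#include <stdint.h>

struct pair {
    uint32_t low;
    uint32_t high;
};

/* Both halves live in one 64-bit atomic so they change together. */
typedef _Atomic uint64_t atomic_pair;

void pair_store(atomic_pair *p, struct pair v)
{
    uint64_t packed = ((uint64_t)v.high << 32) | v.low;
    atomic_store(p, packed);   /* one indivisible 64-bit store */
}

struct pair pair_load(atomic_pair *p)
{
    uint64_t packed = atomic_load(p);   /* one indivisible 64-bit load */
    struct pair v = { (uint32_t)packed, (uint32_t)(packed >> 32) };
    return v;
}
```

A reader can then only ever observe a pair that some writer actually stored, never a mix of two writes. For structures that do not fit in one atomic word, a lock around the copy is the usual alternative.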

Processor Operations

  • On all modern processors, you can assume that reads and writes of naturally aligned native types are atomic as long as :

    • 1 : The memory bus is at least as wide as the type being read or written
    • 2 : The CPU reads and writes these types in a single bus transaction, making it impossible for other threads to see them in a half-completed state
  • On x86 and x64 there is no guarantee that reads and writes larger than eight bytes are atomic

  • Processor vendors define the atomic operations for each processor in a Software Developer's Manual
  • In single-processor / single-core systems it is possible to use standard locking techniques to prevent CPU instructions from being interrupted, but this can be inefficient
  • Disabling interrupts is another more efficient solution, if possible
  • In multiprocessor / multicore systems it is still possible to use locks but merely using a single instruction or disabling interrupts does not guarantee atomic access
  • Atomicity can be achieved by ensuring that the instructions used assert the 'LOCK' signal on the bus to prevent other processors in the system from accessing the memory at the same time
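Rather than assuming which types the hardware handles atomically, a C11 program can ask the implementation. The sketch below uses the standard ATOMIC_*_LOCK_FREE macros and atomic_is_lock_free; the wrapper function names are invented for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* ATOMIC_*_LOCK_FREE is 0 (never), 1 (sometimes) or 2 (always lock-free). */
bool int_atomics_always_lock_free(void)
{
    return ATOMIC_INT_LOCK_FREE == 2;
}

/* Same question for a 64-bit type, which matters on 32-bit targets. */
bool llong_atomics_always_lock_free(void)
{
    return ATOMIC_LLONG_LOCK_FREE == 2;
}

/* Runtime query for one specific object. */
bool int_atomic_is_lock_free(void)
{
    atomic_int probe = 0;
    return atomic_is_lock_free(&probe);
}
```

When the answer is "not lock-free", the atomic operations still behave atomically, but the implementation provides that with an internal lock instead of a single hardware instruction.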

Language Differences

C#

  • C# guarantees that reads and writes of built-in value types that take up to 4 bytes (bool, char, int, float, etc.), and of references, are atomic
  • Reads and writes of value types that take more than four bytes (double, long, etc.) are not guaranteed to be atomic
  • The CLI guarantees that reads and writes of variables of value type that are the size (or smaller) of the processor's natural pointer size are atomic
    • Ex - running C# on a 64-bit OS in a 64-bit version of the CLR performs reads and writes of 64-bit doubles and long integers atomically
  • Creating atomic operations :
    • .NET provides the Interlocked class as part of the System.Threading namespace
    • The Interlocked class provides atomic operations such as increment, compare, exchange, etc.

using System.Threading;

int unsafeCount = 0;
int safeCount = 0;

unsafeCount++;                          // read-modify-write, not atomic
Interlocked.Increment(ref safeCount);   // atomic increment

C++

  • The C++ standard does not guarantee atomic behavior for ordinary (non-atomic) objects
  • All C / C++ operations on such objects are presumed non-atomic unless otherwise specified by the compiler or hardware vendor - including 32-bit integer assignment
  • Creating atomic operations :
    • The C++11 concurrency library includes the Atomic Operations Library (<atomic>)
    • The Atomic library provides atomic types as a template class, std::atomic<T>, for use with any trivially copyable type
    • Operations on atomic types are atomic and thus thread-safe

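#include <atomic>

struct AtomicCounter
{
   std::atomic<int> value{0};   // counter starts at zero

   void increment(){
       ++value;                 // atomic read-modify-write
   }

   void decrement(){
       --value;                 // atomic read-modify-write
   }

   int get(){
       return value.load();     // atomic read
   }
};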

Java

  • Java guarantees that reads and writes of primitive types that take up to 4 bytes, and of references, are atomic
  • Reads and writes of volatile longs and doubles are also guaranteed to be atomic
  • Java provides a small toolkit of classes that support lock-free thread-safe programming on single variables through java.util.concurrent.atomic
  • This provides atomic lock-free operations based on low-level atomic hardware primitives such as compare-and-swap (CAS) - also called compare and set :
    • CAS form - boolean compareAndSet(expectedValue, updateValue);
      • This method atomically sets a variable to the updateValue if it currently holds the expectedValue - reporting true on success
import java.util.concurrent.atomic.AtomicInteger;

public class Counter
{
     private final AtomicInteger value = new AtomicInteger();   // starts at 0

     public int increment(){
         return value.incrementAndGet();   // atomic increment, returns the new value
     }

     public int getValue(){
         return value.get();               // atomic read
     }
}
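The compare-and-swap pattern described above is usually used in a retry loop. Here is a minimal C11 sketch of that loop (C rather than Java, to match the question's code), using atomic_compare_exchange_weak to keep a running maximum. The function name update_max is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Raises *target to candidate if candidate is larger.
   Returns true if this call performed the update. */
bool update_max(atomic_int *target, int candidate)
{
    int seen = atomic_load(target);
    while (seen < candidate) {
        /* Succeeds only if *target still equals seen. On failure, seen is
           refreshed with the current value and the condition is re-checked. */
        if (atomic_compare_exchange_weak(target, &seen, candidate))
            return true;
    }
    return false;
}
```

The loop never holds a lock: if another thread changes the value in between, the compare-and-swap simply fails and the decision is made again against the fresh value.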

Sources
http://www.evernote.com/shard/s10/sh/c2735e95-85ae-4d8c-a615-52aadc305335/99de177ac05dc8635fb42e4e6121f1d2

吃兔兔 2024-08-18 23:40:44

Atomic comes from the Greek ἄτομος (atomos) which means "indivisible". (Caveat: I don't speak Greek, so maybe it's really something else, but most English speakers citing etymologies interpret it this way. :-)

In computing, this means that the operation, well, happens. There isn't any intermediate state that's visible before it completes. So if your CPU gets interrupted to service hardware (IRQ), or if another CPU is reading the same memory, it doesn't affect the result, and these other operations will observe it as either completed or not started.

As an example... let's say you wanted to set a variable to something, but only if it has not been set before. You might be inclined to do this:

if (foo == 0)
{
   foo = some_function();
}

But what if this is run in parallel? Thread 1 could fetch foo and see it as zero; meanwhile thread 2 comes along, does the same thing, and sets the value to something. Back in thread 1, the code still thinks foo is zero, and the variable gets assigned twice.

For cases like this, the CPU provides some instructions that can do the comparison and the conditional assignment as an atomic entity. Hence, test-and-set, compare-and-swap, and load-linked/store-conditional. You can use these to implement locks (your OS and your C library have done this). Or you can write one-off algorithms that rely on the primitives to do something. (There's cool stuff to be done here, but most mere mortals avoid this for fear of getting it wrong.)
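For the "set it only if it has not been set" example, a C11 sketch using compare-and-swap could look like the following. It assumes that 0 means "not set yet" and that the new value is computed before the call, which is a small change from the snippet above; set_foo_once is a made-up name.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int foo;   /* 0 means "not set yet" */

/* Stores value into foo only if foo is still 0.
   Returns true if this call was the one that set it. */
bool set_foo_once(int value)
{
    int expected = 0;
    /* The comparison with 0 and the store happen as one atomic step. */
    return atomic_compare_exchange_strong(&foo, &expected, value);
}
```

If two threads call set_foo_once at the same time, exactly one of them gets true back and the other leaves foo untouched.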

绅刃 2024-08-18 23:40:44

Atomicity is a key concept when you have any form of parallel processing (including different applications cooperating or sharing data) that includes shared resources.

The problem is well illustrated with an example. Let's say you have two programs that want to create a file, but only if the file doesn't already exist. Either of the two programs can create the file at any point in time.

If you do (I'll use C since it's what's in your example):

 ...
 f = fopen ("SYNCFILE","r");
 if (f == NULL) {
   f = fopen ("SYNCFILE","w");
 }
 ...

you can't be sure that the other program hasn't created the file between your open for read and your open for write.

There's no way you can do this on your own; you need help from the operating system, which usually provides synchronization primitives for this purpose, or from another mechanism that is guaranteed to be atomic (for example a relational database where the lock operation is atomic, or a lower-level mechanism like a processor's "test and set" instruction).
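On POSIX systems, one form of that operating system help is the O_EXCL flag of open, which makes "check that the file is absent" and "create it" a single step. A minimal sketch, assuming a POSIX environment; the function name create_if_absent and its return convention are invented for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if this call created the file, 0 if it already existed,
   and -1 on any other error. */
int create_if_absent(const char *path)
{
    /* With O_CREAT | O_EXCL the existence check and the creation are
       done by the kernel as one operation. */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd >= 0) {
        close(fd);
        return 1;
    }
    return (errno == EEXIST) ? 0 : -1;
}
```

If both programs call create_if_absent("SYNCFILE") at the same moment, only one of them gets 1 back. C11 also added an exclusive "x" mode to fopen (as in "wx") for the same purpose.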

沧笙踏歌 2024-08-18 23:40:44

Atomicity can only be guaranteed by the OS. The OS uses the underlying processor features to achieve this.

So creating your own testandset function is impossible. (Although I'm not sure whether one could use an inline asm snippet and use the testandset mnemonic directly; it could be that this instruction can only be executed with OS privileges.)

EDIT:
According to the comments below this post, making your own 'bittestandset' function using an ASM directive directly is possible (on Intel x86). However, whether these tricks also work on other processors is not clear.

I stand by my point: if you want to do atomic things, use the OS functions and don't do it yourself.
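For reference, the bit test-and-set mentioned in the edit can also be sketched in standard C11 without inline assembly or OS calls, assuming the compiler provides <stdatomic.h>. The function name bit_test_and_set is hypothetical, and which instruction the compiler emits for atomic_fetch_or depends on the target.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Atomically sets bit number `bit` of *word.
   Returns true if that bit was already set before the call. */
bool bit_test_and_set(atomic_uint *word, unsigned bit)
{
    unsigned mask = 1u << bit;
    unsigned old = atomic_fetch_or(word, mask);   /* OR and fetch in one step */
    return (old & mask) != 0;
}
```

Because it goes through the language's atomics library rather than a specific mnemonic, the same source should build on processors other than x86 as well.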
