How can I achieve atomic assignment on AIX/PowerPC?
I'm porting a kernel extension, written in C, to 32/64-bit AIX on multiprocessor PowerPC. I need nothing more than atomic read and atomic write operations (I have no use for fetch-and-add, compare-and-swap, etc.).
Just to clarify: to me, "atomicity" means not only "no interleaving", but also "visibility across multiple cores".
The operations operate on pointers, so operations on 'int' variables are useless to me.
If I declare the variable "volatile", the C standard says the variable can be modified by unknown factors and is therefore not subject to optimizations.
From what I have read, regular reads and writes are supposed to be non-interleaved, and the Linux kernel sources seem to agree. They contain:
__asm__ __volatile__("stw%U0%X0 %1,%0" : "=m"(v->counter) : "r"(i));
stw is "store word", which is supposedly atomic, but I don't know what the "%U0%X0" means. I do not understand how this assembly instruction imposes visibility.
When I compile my kernel extension, 'std' is used for the assignment I want, and from what I read it should also be atomic on a 64-bit machine. I have very little understanding of the specifics of PowerPC and its instruction set; however, I did not find any memory barrier instructions ("sync" or "eieio") in the assembly listing of the compiled file.
The kernel provides the fetch_and_addlp() service, which can be used to implement an atomic read (v = fetch_and_addlp(&x, 0), for example).
So my questions are:
1. Is it enough to declare the variable 'volatile' to achieve read and write atomicity in the sense of visibility and no interleaving?
2. If the answer to 1 is "no", how is such atomicity achieved?
3. What is the meaning of "%U0%X0" in the Linux PowerPC atomic implementation?
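For readers comparing options, here is a minimal sketch of what "no interleaving plus visibility" looks like when ordering is requested explicitly rather than through 'volatile'. It assumes a toolchain with C11 <stdatomic.h>, which may not be available when building an AIX kernel extension; the names config, shared_slot, publish and snapshot are hypothetical and not taken from AIX or Linux. On PowerPC, compilers typically implement release and acquire ordering with barrier instructions such as lwsync, which a plain volatile store does not generate.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical payload that the pointer refers to. */
struct config {
    int value;
};

/* The shared pointer; _Atomic makes loads and stores of it indivisible. */
static _Atomic(struct config *) shared_slot = NULL;

/* Writer side: the object must be fully initialized before this call.
 * The release store keeps those earlier writes from being reordered
 * after the pointer becomes visible to other processors. */
static void publish(struct config *c)
{
    assert(c != NULL);
    atomic_store_explicit(&shared_slot, c, memory_order_release);
}

/* Reader side: the acquire load pairs with the release store, so a
 * reader that sees the new pointer also sees the object's contents. */
static struct config *snapshot(void)
{
    return atomic_load_explicit(&shared_slot, memory_order_acquire);
}
```

A writer would fill in a struct config and call publish(&cfg); any processor that later gets that pointer back from snapshot() can dereference it safely. Whether the same guarantees can be had from the AIX kernel services is a separate matter that this sketch does not settle.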
Comments (2)
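There are idiosyncrasies in the GCC inline assembly syntax.
In the line
__asm__ __volatile__("stw%U0%X0 %1,%0" : "=m"(v->counter) : "r"(i));
"=m"(v->counter) is an output operand in memory and "r"(i) is an input operand in a register. %0 and %1 refer to the operands by position (0 -> m, 1 -> r), and the stw instruction takes two arguments. The %U0 and %X0 are not constraints but operand modifiers applied to operand 0: %U0 emits "u" when GCC picks an update-form (pre-increment or pre-decrement) address for the memory operand, and %X0 emits "x" when it picks an indexed (register + register) address. The mnemonic therefore becomes stw, stwu, stwx or stwux to match the address GCC generated, which keeps you from pairing the wrong instruction form with the operand. As it turns out, these modifiers are PowerPC-specific (I'm used to the x86-64 set :). The constraint letters themselves ("m", "r" and the machine-specific ones) are listed at: http://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints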
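To make the effect of the %U0 and %X0 suffixes concrete, here is a small sketch that mimics the expansion in plain C. GCC's PowerPC back end treats them as operand modifiers that append "u" or "x" to the mnemonic. The mem_operand struct and expand_mnemonic function are hypothetical illustrations, not part of GCC; the real substitution happens inside the compiler when it prints the assembly template.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of the address form chosen for a memory operand. */
struct mem_operand {
    bool update_form; /* base register is updated by the access (stwu-style) */
    bool indexed;     /* address is register + register (stwx-style) */
};

/* Mimics a template such as "stw%U0%X0": appends "u" for an update-form
 * address and "x" for an indexed address, in that order. */
static const char *expand_mnemonic(const char *base, struct mem_operand op,
                                   char *out, size_t out_len)
{
    assert(strlen(base) + 3 <= out_len); /* room for "u", "x" and the NUL */
    snprintf(out, out_len, "%s%s%s", base,
             op.update_form ? "u" : "",
             op.indexed ? "x" : "");
    return out;
}
```

With a plain register + displacement address the result is stw; an indexed address gives stwx, an update-form address gives stwu, and both together give stwux. None of these variants adds a memory barrier, so the template affects only the instruction encoding, not visibility on other processors.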
I have managed to answer questions 1 and 2, but not 3: