Implementation of atomic read and write operations in the Linux kernel

Posted on 2025-01-05 17:14:50

Recently I've peeked into the Linux kernel implementation of an atomic read and write and a few questions came up.

First the relevant code from the ia64 architecture:

typedef struct {
    int counter;
} atomic_t;

#define atomic_read(v)      (*(volatile int *)&(v)->counter)
#define atomic64_read(v)    (*(volatile long *)&(v)->counter)

#define atomic_set(v,i)     (((v)->counter) = (i))
#define atomic64_set(v,i)   (((v)->counter) = (i))

Both the read and the write macros simply access the variable directly. Unless there is another trick somewhere, I do not understand what guarantees that the operation is atomic at the assembly level. I guess the obvious answer is that such an operation translates to a single machine instruction, but even so, how is that guaranteed once the different levels of memory cache (or other optimizations) are taken into account?
The read macros use a cast to a volatile type. Does anyone know how this affects atomicity here? (Note that it is not used in the write operation.)

停滞 2025-01-12 17:14:50

I think you are misreading the (admittedly vague) use of the words "atomic" and "volatile" here. "Atomic" only means that the word is read or written atomically: in one step, so that the memory location always holds the result of one write or another and never a mix of the two. The volatile keyword tells the compiler never to assume what that location holds based on an earlier read or write (basically, never to optimize away the read).

What the words "atomic" and "volatile" do NOT mean here is any form of memory synchronization. Neither implies ANY read/write barriers or fences. Nothing is guaranteed about how this access is ordered relative to other memory accesses. These macros guarantee only that the single access is indivisible; the hardware can still reorder or buffer the surrounding accesses as it sees fit.
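To make the missing guarantee concrete, here is a minimal sketch of the classic "store some data, then set a flag" pattern. It uses C11 <stdatomic.h> release/acquire operations as a stand-in for the ordering that the plain macros do not provide. This is not how the kernel spells it, and the names payload, ready, publish and try_consume are made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

static int payload;        /* ordinary data (hypothetical name) */
static atomic_int ready;   /* flag guarding payload; starts at zero */

/* Writer: store the data, then raise the flag with release ordering so the
 * payload store cannot become visible after the flag store. */
static void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Reader: returns 1 and fills *out if the flag was seen set, otherwise 0.
 * The acquire load keeps the payload read from moving ahead of the flag read. */
static int try_consume(int *out)
{
    assert(out != NULL);
    if (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        return 0;
    *out = payload;
    return 1;
}
```

If both sides used only a plain store and the quoted atomic_read, each access would still be indivisible, but nothing would stop a reader from seeing the flag set before the payload.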

Now as to why a plain read or write is enough: every architecture has its own memory model. Many architectures guarantee that reads and writes are atomic for data of a certain size that is aligned to a certain boundary, and the exact rules vary from CPU to CPU. The Linux kernel contains per-architecture definitions that let it avoid special atomic instructions (CMPXCHG, basically) on platforms that guarantee atomic reads/writes (sometimes only in practice, even where the spec does not actually promise it).
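A small user-space sketch of what "aligned, word-sized" means in practice. It re-declares the quoted atomic_t and macros outside the kernel and adds a check that the counter sits on its natural boundary, which is the usual condition under which an architecture performs the access as a single load or store. The helper names is_naturally_aligned and set_then_read are hypothetical, and the check illustrates the precondition rather than proving atomicity.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the definition quoted in the question. */
typedef struct {
    int counter;
} atomic_t;

#define atomic_read(v)    (*(volatile int *)&(v)->counter)
#define atomic_set(v, i)  (((v)->counter) = (i))

/* Hypothetical helper: 1 if the counter's address is a multiple of its size. */
static int is_naturally_aligned(const atomic_t *v)
{
    return ((uintptr_t)&v->counter % sizeof(int)) == 0;
}

/* Hypothetical helper: store a value, then read it back through the macros. */
static int set_then_read(atomic_t *v, int value)
{
    assert(is_naturally_aligned(v));
    atomic_set(v, value);
    return atomic_read(v);
}
```

The compiler aligns an int member naturally by default, so the assertion holds for any ordinary atomic_t; it could only fail for a pointer into a packed or deliberately misaligned buffer.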

As for the volatile: in general there is no need for it unless you are accessing memory-mapped I/O, but it all depends on when, where, and why the atomic_read and atomic_set macros are called. Some compilers go beyond the C standard and attach ordering semantics to volatile accesses (MSVC does under its /volatile:ms mode); GCC does not, in general, emit hardware fences for them. Declaring the variable itself volatile would exempt every read and write of it from most compiler optimizations; casting at the point of use instead creates a "virtual" volatile variable, so only that particular read is off-limits for optimization and for reordering against other volatile accesses.
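What the cast buys can be seen in a polling loop. In this sketch, wait_nonzero (a hypothetical helper, as is its max_spins parameter) re-reads the counter on every iteration because each atomic_read goes through a volatile lvalue. With a plain v->counter the compiler would be entitled to load the value once and reuse it. The cast constrains only the compiler and adds no hardware barrier.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the definition quoted in the question. */
typedef struct {
    int counter;
} atomic_t;

#define atomic_read(v)  (*(volatile int *)&(v)->counter)

/* Hypothetical helper: poll the counter up to max_spins times.
 * Returns 1 as soon as a nonzero value is observed, 0 if none was seen. */
static int wait_nonzero(atomic_t *v, long max_spins)
{
    assert(v != NULL);
    for (long i = 0; i < max_spins; i++) {
        /* The volatile access forces a fresh load on each pass. */
        if (atomic_read(v) != 0)
            return 1;
    }
    return 0;
}
```

In a real program another thread or an interrupt handler would set the counter while the loop runs; the bounded spin count here only keeps the sketch from hanging when nobody does.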
