Does the CPU assign values to memory atomically?

Posted on 2024-09-02 19:56:19

A quick question I've been wondering about for some time: does the CPU assign values atomically, or bit by bit (for example, for a 32-bit integer)?
If it is bit by bit, could another thread accessing the same location see only "part" of the value being assigned?

Consider this:
I have two threads and one shared "unsigned int" variable (call it "g_uiVal").
Both threads loop.
One prints "g_uiVal" with printf("%u\n", g_uiVal).
The other just increments the number.
Will the printing thread ever print something that is not a value "g_uiVal" actually held, such as a partially written value?

In code:

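#include <stdio.h>

unsigned int g_uiVal;   /* shared by both threads, no synchronization */

void thread_writer(void)
{
    while (1)
        g_uiVal++;      /* unsynchronized read-modify-write */
}

void thread_reader(void)
{
    while (1)
        printf("%u\n", g_uiVal);
}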

Comments (7)

月朦胧 2024-09-09 19:56:20

Depends on the bus widths of the CPU and memory. In a PC context, with anything other than a really ancient CPU, naturally aligned accesses of up to 32 bits are atomic; 64-bit accesses may or may not be. In the embedded space, many (most?) CPUs are 32 bits wide with no provision for anything wider, so an access to an int64_t is guaranteed to be non-atomic.
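Whether a given width is handled atomically can be queried from the toolchain rather than guessed. The following is a minimal sketch, assuming a C11 compiler that ships <stdatomic.h>; that header is not mentioned in the answer, and the helper names are hypothetical. The lock-free macros report, per type, whether atomic operations compile to plain hardware instructions.

```c
#include <assert.h>
#include <stdatomic.h>

/* Each macro expands to 0 (never lock-free), 1 (sometimes) or 2 (always). */
int int_lock_free_level(void)
{
    return ATOMIC_INT_LOCK_FREE;      /* int is 32 bits on most current targets */
}

int llong_lock_free_level(void)
{
    return ATOMIC_LLONG_LOCK_FREE;    /* long long is at least 64 bits */
}

/* Hypothetical helper: turn a level into readable text. */
const char *lock_free_text(int level)
{
    static const char *const text[] = { "never", "sometimes", "always" };

    assert(level >= 0 && level <= 2);
    return text[level];
}
```

Printing lock_free_text(llong_lock_free_level()) on a 32-bit embedded target is a quick way to see whether 64-bit atomics there fall back to locks.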

瑾兮 2024-09-09 19:56:20

I believe the only correct answer is "it depends". On what, you may ask?

Well, for starters, on which CPU. Some CPUs write word-width values atomically, but only when the value is aligned. It really is not something you can guarantee at the C language level.

Many compilers offer "intrinsics" that emit correct atomic operations. These are extensions that act like functions but emit the correct code for your target architecture to get the needed atomic operations. For example: http://gcc.gnu.org/onlinedocs/gcc/Atomic-Builtins.html
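As a rough illustration of such an intrinsic, here is a minimal sketch using __sync_fetch_and_add, one of the GCC builtins documented at the link above (Clang accepts it too). The two-writer setup, the iteration count, and the helper name run_two_writers are assumptions made for this example, and POSIX threads are assumed to be available.

```c
#include <assert.h>
#include <pthread.h>

/* Shared counter, named after the variable in the question. */
static unsigned int g_uiVal;

/* Assumption for this sketch: an arbitrary iteration count. */
enum { INCREMENTS_PER_THREAD = 100000 };

static void *thread_writer(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS_PER_THREAD; i++)
        __sync_fetch_and_add(&g_uiVal, 1u);   /* atomic read-modify-write */
    return NULL;
}

/* Hypothetical helper: run two writers and return the final count. */
unsigned int run_two_writers(void)
{
    pthread_t a, b;
    int rc;

    g_uiVal = 0;
    rc = pthread_create(&a, NULL, thread_writer, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, thread_writer, NULL);
    assert(rc == 0);
    (void)rc;   /* avoids an unused-variable warning when NDEBUG disables assert */

    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return g_uiVal;   /* both writers have been joined, so a plain read is safe */
}
```

With a plain g_uiVal++ in place of the builtin, increments from the two writers can be lost and the total may come out below 200000.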

花辞树 2024-09-09 19:56:20

You said "bit-by-bit" in your question. I don't think any architecture operates one bit at a time, except for some specialized serial protocol buses. Standard memory reads and writes are done with 8, 16, 32, or 64 bits of granularity. So it is POSSIBLE that the operation in your example is atomic.

However, the answer is heavily platform dependent.

  • It depends on the CPU's capabilities. Can the hardware do an atomic 32-bit operation? Here's a hint: if the variable you are working on is larger than the native register size (e.g. a 64-bit int on a 32-bit system), a plain access is definitely NOT atomic.
  • It depends on how the compiler generates the machine code. It could have turned your 32-bit variable access into four 8-bit memory reads.
  • It gets tricky if the address you are accessing is not aligned on the machine's natural word boundary. The access can then straddle two words, and you can hit a cache miss or page fault partway through.

It is VERY POSSIBLE that you would see a corrupt or unexpected value with the code example you posted.

Your platform probably provides some way of doing atomic operations. On Windows, it is the Interlocked family of functions. On Linux/Unix, look at the atomic_t type.
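Where a C11 compiler is available, <stdatomic.h> is a portable alternative to the platform-specific facilities named above; that is an assumption of this sketch, not something the answer mentions. The code below reworks the question's example with an atomic_uint. The bounded iteration count, the run_writer_and_reader helper, and the use of POSIX threads are also assumptions for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_uint g_uiVal;                        /* atomic version of the shared variable */
static const unsigned int LAST_VALUE = 100000u;    /* arbitrary bound so the sketch terminates */

static void *thread_writer(void *arg)
{
    (void)arg;
    for (unsigned int i = 0; i < LAST_VALUE; i++)
        atomic_fetch_add(&g_uiVal, 1u);
    return NULL;
}

/* Every load returns a whole value. With one writer that only increments,
   the values the reader observes can therefore never go backwards. */
static void *thread_reader(void *arg)
{
    unsigned int prev = 0;

    (void)arg;
    for (;;) {
        unsigned int cur = atomic_load(&g_uiVal);
        assert(cur >= prev);
        prev = cur;
        if (cur == LAST_VALUE)
            break;
    }
    return NULL;
}

/* Hypothetical helper: run both threads and return the final value. */
unsigned int run_writer_and_reader(void)
{
    pthread_t w, r;
    int rc;

    atomic_store(&g_uiVal, 0u);
    rc = pthread_create(&r, NULL, thread_reader, NULL);
    assert(rc == 0);
    rc = pthread_create(&w, NULL, thread_writer, NULL);
    assert(rc == 0);
    (void)rc;   /* avoids an unused-variable warning when NDEBUG disables assert */

    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return atomic_load(&g_uiVal);
}
```

The reader may still skip many intermediate values; atomicity only rules out torn or invented ones.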

薯片软お妹 2024-09-09 19:56:20

To add to what has been said so far, another potential concern is caching. CPUs tend to work with a local (on-die) memory cache, which may or may not be flushed back to main memory immediately. If the box has more than one CPU, another CPU may not see a change for some time after the modifying CPU made it, unless some synchronization instruction tells all CPUs to synchronize their on-die caches. As you can imagine, such synchronization can slow processing down considerably.

可可 2024-09-09 19:56:20

Don't forget that the compiler assumes single-threaded execution when optimizing, so the repeated reads of g_uiVal could be collapsed into one, or this whole thing could just go away.
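One way to keep the optimizer from making that assumption is to make the shared variable atomic, so every loop iteration has to perform a real load. The following is a minimal sketch, assuming C11 <stdatomic.h> and POSIX threads; the flag g_stop and the helper run_stop_flag_demo are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stop flag. With a plain bool, the compiler would be allowed
   to read it once and turn the loop below into an infinite loop. */
static atomic_bool g_stop;

static void *spin_until_stopped(void *arg)
{
    (void)arg;
    while (!atomic_load(&g_stop))
        ;   /* the atomic load is repeated on every iteration */
    return NULL;
}

/* Hypothetical helper: returns true once the spinning thread has exited. */
bool run_stop_flag_demo(void)
{
    pthread_t t;
    int rc;

    atomic_store(&g_stop, false);
    rc = pthread_create(&t, NULL, spin_until_stopped, NULL);
    assert(rc == 0);
    (void)rc;   /* avoids an unused-variable warning when NDEBUG disables assert */

    atomic_store(&g_stop, true);   /* request the stop from this thread */
    pthread_join(t, NULL);
    return true;
}
```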

起风了 2024-09-09 19:56:20

POSIX defines the special type sig_atomic_t, which guarantees that writes to it are atomic with respect to signals, which will also make it atomic from the point of view of other threads, as you want. There is no specifically defined atomic cross-thread type like this, since thread communication is expected to be mediated by mutexes or other synchronization primitives.
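A minimal sketch of the mutex-mediated approach, assuming POSIX threads. The names g_total, g_lock, and run_locked_counter, as well as the iteration count, are made up for this example. The counter is deliberately 64 bits wide, a size that may not be atomic by itself on a 32-bit target.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long long g_total;   /* only touched while g_lock is held */

enum { INCREMENTS_PER_THREAD = 100000 };   /* arbitrary count for the sketch */

static void *thread_writer(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS_PER_THREAD; i++) {
        pthread_mutex_lock(&g_lock);
        g_total++;
        pthread_mutex_unlock(&g_lock);
    }
    return NULL;
}

/* Readers take the same lock, so they never observe a half-written value. */
unsigned long long read_total(void)
{
    unsigned long long copy;

    pthread_mutex_lock(&g_lock);
    copy = g_total;
    pthread_mutex_unlock(&g_lock);
    return copy;
}

/* Hypothetical helper: run two writers and return the final total. */
unsigned long long run_locked_counter(void)
{
    pthread_t a, b;
    int rc;

    pthread_mutex_lock(&g_lock);
    g_total = 0;
    pthread_mutex_unlock(&g_lock);

    rc = pthread_create(&a, NULL, thread_writer, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, thread_writer, NULL);
    assert(rc == 0);
    (void)rc;   /* avoids an unused-variable warning when NDEBUG disables assert */

    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return read_total();
}
```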

谜兔 2024-09-09 19:56:20

On modern microprocessors (ignoring microcontrollers), a 32-bit assignment is atomic, not bit-by-bit.

However, moving somewhat beyond the question: because the example has no synchronization, the printing thread could still print something unexpected, due to instruction reordering and to multiple cores each holding their own copy of g_uiVal in their caches.
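The reordering concern can be sketched with C11 release/acquire ordering, assuming <stdatomic.h> and POSIX threads are available. The names g_payload, g_ready, and run_message_passing are hypothetical. A release store publishes the ordinary write that precedes it, and an acquire load that observes the flag is then guaranteed to see that write.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static unsigned int g_payload;   /* ordinary, non-atomic data */
static atomic_bool g_ready;      /* flag that publishes g_payload */

static void *producer(void *arg)
{
    (void)arg;
    g_payload = 42u;   /* 1. write the data */
    atomic_store_explicit(&g_ready, true, memory_order_release);   /* 2. publish it */
    return NULL;
}

static void *consumer(void *arg)
{
    unsigned int *seen = arg;

    /* The acquire load pairs with the release store in producer(). */
    while (!atomic_load_explicit(&g_ready, memory_order_acquire))
        ;
    *seen = g_payload;   /* ordered after the producer's write to g_payload */
    return NULL;
}

/* Hypothetical helper: returns the payload value the consumer observed. */
unsigned int run_message_passing(void)
{
    pthread_t p, c;
    unsigned int seen = 0;
    int rc;

    g_payload = 0;
    atomic_store(&g_ready, false);

    rc = pthread_create(&c, NULL, consumer, &seen);
    assert(rc == 0);
    rc = pthread_create(&p, NULL, producer, NULL);
    assert(rc == 0);
    (void)rc;   /* avoids an unused-variable warning when NDEBUG disables assert */

    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

If both operations used memory_order_relaxed instead, the flag itself would still be read and written atomically, but the consumer would no longer be guaranteed to see 42 in g_payload.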
