How can I atomically read multiple variables at once in C?

Posted 2025-02-10 06:48:09 | Words: 839 | Views: 1 | Comments: 0

I am trying to read three variables a, b, c atomically at once. The pattern looks something like the code below.

_Atomic uint32_t a, b, c;

void thread_high_priority(void)
{
  atomic_fetch_sub_explicit(&a, 1, memory_order_relaxed);
  atomic_fetch_add_explicit(&b, 1, memory_order_relaxed);
  atomic_fetch_sub_explicit(&c, 1, memory_order_relaxed);
}

void thread_low_priority(void)
{
  uint32_t _a = a;
  uint32_t _b = b;
  uint32_t _c = c;
}

thread_high_priority runs at high priority and thread_low_priority runs at low priority. thread_high_priority can interrupt the execution of thread_low_priority, but not the other way around. That is, thread_high_priority always runs to completion without being interrupted.

The constraint is that thread_high_priority is time-critical. Therefore, I don't want to block on a mutex, as that is time-consuming and can even cause deadlock. Is there a way to make sure all three variables are read at once, without an interruption in between?

Edit: The platform is an ARMv7-M architecture running in a bare-metal environment.

Comments (2)

青柠芒果 2025-02-17 06:48:09

You can solve this problem with a level of indirection.

As long as there is only one writer, you could do it like this:

  • Put the set of data items in a struct
  • Allocate several such structs
  • Write non-atomically to the members of a struct which the readers are not using
  • Atomically change a pointer to which struct the reader should use

The reader should read the pointer then access the data in the respective struct.

If another interrupt can occur while the main context is still reading, you also need to record which struct the reader is currently using, so that the writer can check this before filling in a struct. Accessing this second pointer atomically is easier if there is only one reader.

To smooth things out you can allocate three or more structs, and treat them as a ring buffer.
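The steps above could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: it assumes a single core, a single writer that always runs to completion, and a single reader. It uses buffer indices rather than pointers, and the names struct counters, bufs, published, reading, writer_update and reader_snapshot, as well as the initial values, are all hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical snapshot type grouping the three values from the question. */
struct counters {
    uint32_t a, b, c;
};

/* Three buffers: one published, one possibly claimed by the reader, one free. */
enum { NBUF = 3 };

static struct counters bufs[NBUF] = { { 10, 0, 10 } }; /* example initial values */
static atomic_uint published = 0; /* index of the newest complete snapshot */
static atomic_uint reading = 0;   /* index the single reader may be using */

/* Writer: assumed to run in the high-priority context, never preempted
 * by the reader. */
void writer_update(void)
{
    unsigned cur = atomic_load(&published);
    unsigned busy = atomic_load(&reading);

    /* Pick a buffer that is neither published nor claimed by the reader.
     * With three buffers one is always free, so next stays below NBUF. */
    unsigned next = 0;
    while (next == cur || next == busy)
        next++;

    /* Plain (non-atomic) writes: nobody else is using bufs[next] right now. */
    bufs[next].a = bufs[cur].a - 1;
    bufs[next].b = bufs[cur].b + 1;
    bufs[next].c = bufs[cur].c - 1;

    /* Publish the finished snapshot with a single atomic store. */
    atomic_store_explicit(&published, next, memory_order_release);
}

/* Reader: assumed to run in the low-priority context; may be interrupted
 * by the writer at any point. */
struct counters reader_snapshot(void)
{
    unsigned idx;
    do {
        idx = atomic_load(&published);
        /* Claim the buffer so that later writer runs leave it alone. */
        atomic_store(&reading, idx);
        /* Re-check: the writer may have run between the load and the claim. */
    } while (idx != atomic_load(&published));

    return bufs[idx]; /* consistent copy of all three members */
}
```

The writer never waits or retries, so its cost stays bounded. Only the reader loops, and only when it was interrupted between loading the index and claiming it.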

隔纱相望 2025-02-17 06:48:09

I also came up with another solution, based on a seqlock.
After realizing that what I was trying to achieve is essentially tear detection, I rewrote the code using a seqlock template. I still define the three variables a, b, c as _Atomic uint32_t, since I also want to modify them in thread_low_priority using atomic_fetch_*.

On the ARMv7-M architecture, atomic RMW operations are implemented with ldrex/strex, and the compiler emits a loop that retries until strex succeeds. In my case that could be a problem, because thread_high_priority needs to be fast and run without interruption. I currently don't know whether there is a case where strex always fails in the thread_high_priority context, which could cause a deadlock.
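To make the retry loop mentioned above concrete, here is a minimal sketch of what an atomic fetch-and-subtract amounts to conceptually, written as an explicit compare-exchange loop in portable C11. The helper name fetch_sub_retry is hypothetical, and the actual ldrex/strex sequence the compiler emits will differ in detail.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: an atomic subtract written as an explicit retry loop.
 * Returns the value the object held before the subtraction. */
uint32_t fetch_sub_retry(_Atomic uint32_t *obj, uint32_t n)
{
    uint32_t old = atomic_load_explicit(obj, memory_order_relaxed);

    /* If the exchange fails, `old` is refreshed with the current value
     * and the loop tries again with the new expected value. */
    while (!atomic_compare_exchange_weak_explicit(obj, &old, old - n,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed)) {
        /* retry */
    }
    return old;
}
```

The loop only repeats when something else touched the object (or the exchange failed spuriously) between the load and the store, which is why the seqlock code below uses plain atomic loads and stores in the time-critical writer instead.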

#include <stdatomic.h>
#include <stdint.h>

_Atomic uint32_t a, b, c;
atomic_uint seqcount = 0;

void thread_high_priority(void)
{
  uint32_t _a, _b, _c;

  unsigned orig_cnt = atomic_load_explicit(&seqcount, memory_order_relaxed);

  /* Make the counter odd to mark an update in progress. */
  atomic_store_explicit(&seqcount, orig_cnt + 1, memory_order_relaxed);
  atomic_thread_fence(memory_order_release);

  /* Separate atomic loads and stores instead of RMW operations. */
  _a = atomic_load_explicit(&a, memory_order_relaxed);
  _b = atomic_load_explicit(&b, memory_order_relaxed);
  _c = atomic_load_explicit(&c, memory_order_relaxed);
  atomic_store_explicit(&a, _a - 1, memory_order_relaxed);
  atomic_store_explicit(&b, _b + 1, memory_order_relaxed);
  atomic_store_explicit(&c, _c - 1, memory_order_relaxed);

  /* Make the counter even again to mark the update as complete. */
  atomic_store_explicit(&seqcount, orig_cnt + 2, memory_order_release);
}

void thread_low_priority(void)
{
  uint32_t _a, _b, _c;

  unsigned c0, c1;
  do {
    c0 = atomic_load_explicit(&seqcount, memory_order_acquire);

    _a = atomic_load_explicit(&a, memory_order_relaxed);
    _b = atomic_load_explicit(&b, memory_order_relaxed);
    _c = atomic_load_explicit(&c, memory_order_relaxed);

    c1 = atomic_load_explicit(&seqcount, memory_order_acquire);
    /* Retry if an update was in progress (odd count) or the count changed. */
  } while (c0 & 1 || c0 != c1);
}

Edit: After checking the compiler output again, I slightly modified the code in thread_high_priority. It was compiled with ARM GCC 10.3.1 (2021.10 none) using the flags -O1 -mcpu=cortex-m3 -std=gnu18 -mthumb.

In my original code, dmb ish is issued before the store as shown below.

atomic_store_explicit(&seqcount, orig_cnt + 1, memory_order_release);
--->
        adds    r1, r2, #1
        dmb     ish
        str     r1, [r3]

After I separated the memory barrier from the store, dmb ish is issued after the store, so that the update of seqcount becomes visible before a, b, c are updated.

atomic_store_explicit(&seqcount, orig_cnt + 1, memory_order_relaxed);
atomic_thread_fence(memory_order_release);
-->
        adds    r1, r2, #1
        str     r1, [r3]
        dmb     ish
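One detail in the reader loop may be worth double-checking. In the C11 memory model an acquire load only orders the accesses that come after it, so the relaxed loads of a, b, c are not formally ordered before the second load of seqcount. A commonly used formulation places an acquire fence between the data loads and the re-read of the counter, mirroring the release fence in the writer. The following is a minimal sketch of that variant under the same single-writer assumption. The names seq_write, seq_read and struct snapshot, and the initial values, are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint32_t a = 10, b = 0, c = 10; /* example initial values */
static atomic_uint seqcount = 0;

/* Hypothetical return type holding one consistent set of values. */
struct snapshot {
    uint32_t a, b, c;
};

/* Writer: assumed to be the only writer and never preempted by the reader. */
void seq_write(void)
{
    unsigned s = atomic_load_explicit(&seqcount, memory_order_relaxed);

    /* Odd count marks an update in progress. */
    atomic_store_explicit(&seqcount, s + 1, memory_order_relaxed);
    /* Order the counter store before the data stores that follow. */
    atomic_thread_fence(memory_order_release);

    atomic_store_explicit(&a, atomic_load_explicit(&a, memory_order_relaxed) - 1,
                          memory_order_relaxed);
    atomic_store_explicit(&b, atomic_load_explicit(&b, memory_order_relaxed) + 1,
                          memory_order_relaxed);
    atomic_store_explicit(&c, atomic_load_explicit(&c, memory_order_relaxed) - 1,
                          memory_order_relaxed);

    /* Even count marks the update as complete. */
    atomic_store_explicit(&seqcount, s + 2, memory_order_release);
}

/* Reader: retries until it sees the same even count before and after. */
struct snapshot seq_read(void)
{
    struct snapshot out;
    unsigned s0, s1;
    do {
        s0 = atomic_load_explicit(&seqcount, memory_order_acquire);

        out.a = atomic_load_explicit(&a, memory_order_relaxed);
        out.b = atomic_load_explicit(&b, memory_order_relaxed);
        out.c = atomic_load_explicit(&c, memory_order_relaxed);

        /* Order the data loads before the re-read of the counter. */
        atomic_thread_fence(memory_order_acquire);
        s1 = atomic_load_explicit(&seqcount, memory_order_relaxed);
    } while ((s0 & 1u) || s0 != s1);
    return out;
}
```

On a single-core Cortex-M3 the difference is unlikely to be observable, but the fence-based form states the intended ordering explicitly and does not depend on how a particular compiler lowers the acquire load.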