How to atomically read multiple variables at once in C?

Posted on 2025-02-10 06:48:09

I am trying to read three variables a, b, c atomically at once. The pattern looks something like the code below.

_Atomic uint32_t a, b, c;

void thread_high_priority(void)
{
  atomic_fetch_sub_explicit(&a, 1, memory_order_relaxed);
  atomic_fetch_add_explicit(&b, 1, memory_order_relaxed);
  atomic_fetch_sub_explicit(&c, 1, memory_order_relaxed);
}

void thread_low_priority(void)
{
  uint32_t _a = a;
  uint32_t _b = b;
  uint32_t _c = c;
}

thread_high_priority is a thread running at high priority, and thread_low_priority is a thread running at low priority. thread_high_priority can interrupt the execution of thread_low_priority, but not the other way around. That is, thread_high_priority always runs to completion without being interrupted.

The constraint is that thread_high_priority is time-critical. Therefore, I don't want to block on a mutex, as that is time-consuming and could even cause a deadlock. Is there a way to make sure all three variables are read at once, without an update landing in between the reads?

Edit: The platform is the ARMv7-M architecture, running in a bare-metal environment.

青柠芒果 2025-02-17 06:48:09

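You can solve this with a level of indirection.

As long as there is only one writer, you could do it like this:

Put the set of data items in a struct.
Allocate several such structs.
Write non-atomically to the members of a struct that the readers are not using.
Atomically change a pointer to the struct that the reader should use.

The reader should read the pointer, then access the data in the respective struct.

If another interrupt can occur while the main context is still reading, then you need to keep a pointer to the struct the reader is currently using, and the writer can check it before filling in a struct. Accessing this second pointer atomically is easier if there is only one reader.

To smooth things out, you can allocate three or more structs and treat them as a ring buffer.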
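The scheme described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the answerer's code: a single core, one writer that always runs to completion, and one reader. The names snapshot, bufs, published, in_use, writer_update and reader_snapshot are all hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical; they do not appear in the original post. */
struct snapshot { uint32_t a, b, c; };

#define NBUF 3  /* three buffers: one published, one possibly being read, one free */

static struct snapshot bufs[NBUF];
static _Atomic(struct snapshot *) published = &bufs[0]; /* buffer readers should use */
static _Atomic(struct snapshot *) in_use = NULL;        /* buffer the reader is copying */

/* Writer (high priority). Assumed to be the only writer and to run
 * to completion without being preempted by the reader. */
void writer_update(void)
{
    struct snapshot *cur  = atomic_load_explicit(&published, memory_order_relaxed);
    struct snapshot *busy = atomic_load_explicit(&in_use, memory_order_relaxed);
    struct snapshot *next = NULL;

    /* Pick a buffer that is neither published nor claimed by the reader. */
    for (int i = 0; i < NBUF; i++) {
        if (&bufs[i] != cur && &bufs[i] != busy) {
            next = &bufs[i];
            break;
        }
    }

    /* Plain (non-atomic) writes: nobody else is looking at this buffer. */
    next->a = cur->a - 1;
    next->b = cur->b + 1;
    next->c = cur->c - 1;

    /* Publish the completed buffer with a single atomic pointer store. */
    atomic_store_explicit(&published, next, memory_order_release);
}

/* Reader (low priority). May be interrupted by the writer at any point. */
struct snapshot reader_snapshot(void)
{
    struct snapshot *p;
    struct snapshot out;

    do {
        p = atomic_load_explicit(&published, memory_order_acquire);
        atomic_store_explicit(&in_use, p, memory_order_seq_cst);
        /* The writer may have run between the load and the claim,
         * so confirm that p is still the published buffer. */
    } while (p != atomic_load_explicit(&published, memory_order_acquire));

    out = *p;  /* the writer skips this buffer while it is claimed */
    atomic_store_explicit(&in_use, NULL, memory_order_release);
    return out;
}
```

The writer never blocks and never loops on the reader: with three buffers there is always one that is neither published nor claimed. Only the reader retries, and only if an update lands between its pointer load and its claim.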

隔纱相望 2025-02-17 06:48:09

I also came up with another solution, based on a seqlock.
After realizing that what I was trying to achieve is essentially tear detection, I rewrote the code using a seqlock template. I still define my three variables a, b, c as _Atomic uint32_t, since I also want to modify them in thread_low_priority using atomic_fetch_*.

On the ARMv7-M architecture, RMW atomic operations are implemented using ldrex/strex. The compiler emits a loop that checks whether the strex succeeded. In my case this could be a problem, because thread_high_priority needs to be fast and run without interruption. I currently don't know whether there is a case where strex always fails in the thread_high_priority context, which could cause a deadlock.

#include <stdatomic.h>
#include <stdint.h>

_Atomic uint32_t a, b, c;
atomic_uint seqcount = 0;

void thread_high_priority(void)
{
  uint32_t _a, _b, _c;

  unsigned int orig_cnt = atomic_load_explicit(&seqcount, memory_order_relaxed);

  /* Odd count: an update is in progress. */
  atomic_store_explicit(&seqcount, orig_cnt + 1, memory_order_relaxed);
  /* Order the counter store before the data stores below. */
  atomic_thread_fence(memory_order_release);

  /* Separate loads and stores avoid the ldrex/strex retry loop of an RMW. */
  _a = atomic_load_explicit(&a, memory_order_relaxed);
  _b = atomic_load_explicit(&b, memory_order_relaxed);
  _c = atomic_load_explicit(&c, memory_order_relaxed);
  atomic_store_explicit(&a, _a - 1, memory_order_relaxed);
  atomic_store_explicit(&b, _b + 1, memory_order_relaxed);
  atomic_store_explicit(&c, _c - 1, memory_order_relaxed);

  /* Even count: the update is complete. */
  atomic_store_explicit(&seqcount, orig_cnt + 2, memory_order_release);
}

void thread_low_priority(void)
{
  uint32_t _a, _b, _c;

  unsigned int c0, c1;
  do {
    c0 = atomic_load_explicit(&seqcount, memory_order_acquire);

    _a = atomic_load_explicit(&a, memory_order_relaxed);
    _b = atomic_load_explicit(&b, memory_order_relaxed);
    _c = atomic_load_explicit(&c, memory_order_relaxed);

    c1 = atomic_load_explicit(&seqcount, memory_order_acquire);
    /* Retry if an update was in progress (odd) or finished in between. */
  } while (c0 & 1 || c0 != c1);
}

Edit: After checking the compiler output again, I slightly modified the code in thread_high_priority. It was compiled using ARM gcc 10.3.1 (2021.10 none) with the flags -O1 -mcpu=cortex-m3 -std=gnu18 -mthumb.

In my original code, dmb ish is issued before the store, as shown below.

atomic_store_explicit(&seqcount, orig_cnt + 1, memory_order_release);
--->
        adds    r1, r2, #1
        dmb     ish
        str     r1, [r3]

After I separated the memory barrier from the store, dmb ish is issued after the store, so that the update of seqcount is visible before a, b, c are updated.

atomic_store_explicit(&seqcount, orig_cnt + 1, memory_order_relaxed);
atomic_thread_fence(memory_order_release);
-->
        adds    r1, r2, #1
        str     r1, [r3]
        dmb     ish
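
For comparison, the textbook form of a seqlock reader places an acquire fence between the data loads and the second counter load. Under the C11 memory model, an acquire load by itself only keeps later accesses from moving before it, not earlier loads from moving after it. The sketch below shows that fence placement next to the same writer pattern as in the answer. seq_write, seq_read and struct abc are hypothetical names, and a single writer is assumed.

```c
#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint32_t a, b, c;
static atomic_uint seqcount;

struct abc { uint32_t a, b, c; };  /* hypothetical helper type for the result */

/* Writer: assumed to be the only writer and never preempted by the reader. */
void seq_write(void)
{
    unsigned int s = atomic_load_explicit(&seqcount, memory_order_relaxed);

    /* Odd count marks an update in progress. */
    atomic_store_explicit(&seqcount, s + 1, memory_order_relaxed);
    /* Release fence: the counter store is ordered before the data stores. */
    atomic_thread_fence(memory_order_release);

    uint32_t va = atomic_load_explicit(&a, memory_order_relaxed);
    uint32_t vb = atomic_load_explicit(&b, memory_order_relaxed);
    uint32_t vc = atomic_load_explicit(&c, memory_order_relaxed);
    atomic_store_explicit(&a, va - 1, memory_order_relaxed);
    atomic_store_explicit(&b, vb + 1, memory_order_relaxed);
    atomic_store_explicit(&c, vc - 1, memory_order_relaxed);

    /* Even count marks the data as stable again. */
    atomic_store_explicit(&seqcount, s + 2, memory_order_release);
}

/* Reader: retries until it sees the same even count before and after. */
struct abc seq_read(void)
{
    struct abc out;
    unsigned int s0, s1;

    do {
        s0 = atomic_load_explicit(&seqcount, memory_order_acquire);

        out.a = atomic_load_explicit(&a, memory_order_relaxed);
        out.b = atomic_load_explicit(&b, memory_order_relaxed);
        out.c = atomic_load_explicit(&c, memory_order_relaxed);

        /* Acquire fence: the data loads are ordered before the re-check. */
        atomic_thread_fence(memory_order_acquire);
        s1 = atomic_load_explicit(&seqcount, memory_order_relaxed);
    } while ((s0 & 1u) || s0 != s1);

    return out;
}
```

As in the answer, only the low-priority reader ever loops. The writer performs a fixed number of plain loads and stores, so its execution time does not depend on what the reader is doing.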
