Is a semaphore needed when reading from a global struct?

Posted 2024-07-09 03:28:58

A fairly basic question, but I don't see it asked anywhere.

Let's say we have a global struct (in C) like so:

struct foo {
  int written_frequently1;
  int read_only;
  int written_frequently2;
};

It seems clear to me that if we have lots of threads reading and writing, we need a semaphore (or other lock) on the written_frequently members, even for reading, since we can't be 100% sure that assignments to this struct will be atomic.

If we want lots of threads to read the read_only member, and none to write it, do we need a semaphore on the struct access just for reading?

(I'm inclined to say no, because the fact that the locations immediately before and after are constantly changed shouldn't affect the read_only member, and multiple threads reading the value shouldn't interfere with each other. But I'm not sure.)


[Edit: I realize now I should have asked this question much better, in order to clarify very specifically what I meant. Naturally, I didn't really grok all of the issues involved when I first asked the question. Of course, if I comprehensively edit the question now, I will ruin all of these great answers. What I meant is more like:

struct bar {
  char written_frequently1[LONGISH_LEN];
  char read_only[LONGISH_LEN];
  char written_frequently2[LONGISH_LEN];
};

The major issue I asked about is, since this data is part of a struct, is it at all influenced by the other struct members, and might it influence them in return?

The fact that the members were ints, and therefore writes are likely atomic, is really just a red herring in this case.]

Comments (10)

等风来 2024-07-16 03:28:58

You need a mutex to guarantee that an operation is atomic. So in this particular case, you may not need a mutex at all. Specifically, if each thread writes to one element and the write is atomic and the new value is independent of the current value of any element (including itself), there is no problem.

Example: each of several threads updates a "last_updated_by" variable that simply records the last thread that updated it. Clearly, as long as the variable itself is updated atomically, no errors will occur.
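
A minimal sketch of that case, assuming a C11 compiler with <stdatomic.h>. The names last_updated_by, record_update, and last_updater are made up for illustration and are not from the question:

```c
#include <stdatomic.h>

/* Hypothetical shared variable: id of the thread that last updated the data.
   Static storage, so it starts out as 0. */
static atomic_int last_updated_by;

/* Each writer overwrites the value without looking at the old one,
   so a single atomic store is enough and no mutex is involved. */
void record_update(int thread_id)
{
    atomic_store(&last_updated_by, thread_id);
}

/* A reader always sees some complete value that a writer stored,
   never a mix of two writes. */
int last_updater(void)
{
    return atomic_load(&last_updated_by);
}
```

This only works because the new value does not depend on the old one. Something like a counter increment would need atomic_fetch_add or a lock instead.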


However, you do need a mutex to guarantee consistency if a thread reads or writes more than one element at a time, particularly because you mention locking an element rather than the entire structure.

Example: a thread updates the "day", "month" and "year" elements of a structure. This must happen atomically; otherwise another thread could read the structure after "month" increments but before "day" wraps to 1, and see a date such as February 31. Note that you must honor the mutex when reading as well; otherwise you may read an erroneous, half-updated value.

粉红×色少女 2024-07-16 03:28:58

If the read_only member is actually read only, then there is no danger of the data being changed and therefore no need for synchronization. This could be data that is set up before the threads are started.

You will want synchronization for any data that can be written, regardless of the frequency.

浅唱々樱花落 2024-07-16 03:28:58

"Read only" is a bit misleading, since the variable is written to at least once when it's initialized. In that case you still need a memory barrier between the initial write and subsequent reads if they're in different threads, or else they could see the uninitialized value.
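
One way to get that barrier, sketched with C11 atomics. The initialized flag and the two function names are assumptions added for illustration; only struct foo comes from the question:

```c
#include <stdatomic.h>
#include <stdbool.h>

struct foo {
    int written_frequently1;
    int read_only;
    int written_frequently2;
};

static struct foo shared;
static atomic_bool initialized;   /* hypothetical "published" flag, starts false */

/* Called once by the initializing thread. */
void publish_read_only(int value)
{
    shared.read_only = value;
    /* Release: the write above becomes visible no later than the flag. */
    atomic_store_explicit(&initialized, true, memory_order_release);
}

/* Called by readers; returns false until the initial write is visible. */
bool try_get_read_only(int *out)
{
    /* Acquire: pairs with the release store in publish_read_only(). */
    if (!atomic_load_explicit(&initialized, memory_order_acquire))
        return false;
    *out = shared.read_only;
    return true;
}
```

If the value is written before the reader threads are even created, pthread_create already provides the needed ordering and the flag is unnecessary.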

缪败 2024-07-16 03:28:58

Readers need mutexes, too!

There seems to be a common misconception that mutexes are for writers only, and that readers don't need them. This is wrong, and this misconception is responsible for bugs that are extremely difficult to diagnose.

Here's why, in the form of an example.

Imagine a clock that updates every second with the code:

if (++seconds > 59) {          // Was the time hh:mm:59?
    seconds = 0;               // Wrap seconds..
    if (++minutes > 59) {      // ..and increment minutes.  Was it hh:59:59?
        minutes = 0;           // Wrap minutes..
        if (++hours > 23)      // ..and increment hours.  Was it 23:59:59?
            hours = 0;         // Wrap hours.
    }
}

If the code is not protected by a mutex, another thread can read the hours, minutes, and seconds variables while an update is in progress. Following the code above:

[Start just before midnight] 23:59:59
[WRITER increments seconds]  23:59:60
[WRITER wraps seconds]       23:59:00
[WRITER increments minutes]  23:60:00
[WRITER wraps minutes]       23:00:00
[WRITER increments hours]    24:00:00
[WRITER wraps hours]         00:00:00

The time is invalid from the first increment until the final operation six steps later. If a reader checks the clock during this period, it will see a value that may be not only incorrect but illegal. And since your code is likely to depend on the clock without displaying the time directly, this is a classic source of "ricochet" errors that are notoriously difficult to track down.

The fix is simple.

Surround the clock-update code with a mutex, and create a reader function that also locks the mutex while it executes. Now the reader will wait until the update is complete, and the writer won't change the values mid-read.
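
A rough sketch of that fix using POSIX threads. The struct clock_time type and the names tick, read_clock, and clock_lock are invented here for illustration; the starting time is the one from the trace above:

```c
#include <pthread.h>

struct clock_time { int hours, minutes, seconds; };

/* Shared clock, starting just before midnight as in the trace above. */
static struct clock_time now = { 23, 59, 59 };
static pthread_mutex_t clock_lock = PTHREAD_MUTEX_INITIALIZER;

/* Writer: the whole update runs under the lock, so no reader can
   observe 23:59:60, 23:60:00 or 24:00:00. */
void tick(void)
{
    pthread_mutex_lock(&clock_lock);
    if (++now.seconds > 59) {
        now.seconds = 0;
        if (++now.minutes > 59) {
            now.minutes = 0;
            if (++now.hours > 23)
                now.hours = 0;
        }
    }
    pthread_mutex_unlock(&clock_lock);
}

/* Reader: takes the same lock and returns a consistent copy. */
struct clock_time read_clock(void)
{
    pthread_mutex_lock(&clock_lock);
    struct clock_time snapshot = now;
    pthread_mutex_unlock(&clock_lock);
    return snapshot;
}
```

Returning a copy rather than a pointer matters: the caller can keep using the snapshot after the lock is released without racing against the next tick.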

爱殇璃 2024-07-16 03:28:58

No.

In general you need semaphores to prevent concurrent access to resources (an int in this case). However, since the read_only member is read only, it won't change between/during accesses. Note that it doesn't even have to be an atomic read — if nothing changes, you're always safe.

How are you setting read_only initially?

风铃鹿 2024-07-16 03:28:58

If all the threads are only reading, you don't need a semaphore.

入画浅相思 2024-07-16 03:28:58

You might enjoy reading any one of these papers on practical lock free programming, or just dissecting and understanding the provided snippets.

童话 2024-07-16 03:28:58

I would hide each field behind a function call. The accessors for the written fields would take a semaphore. The accessor for the read-only field just returns the value.
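
Roughly what that could look like. This sketch uses a pthread mutex in place of a semaphore, and the accessor names, the g_foo instance, and the initial value 42 are all assumptions for illustration:

```c
#include <pthread.h>

struct foo {
    int written_frequently1;
    int read_only;
    int written_frequently2;
};

/* read_only gets its value (42, an arbitrary example) at static
   initialization, before any thread exists. */
static struct foo g_foo = { 0, 42, 0 };
static pthread_mutex_t g_foo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Written field: both the setter and the getter take the lock. */
void set_written_frequently1(int value)
{
    pthread_mutex_lock(&g_foo_lock);
    g_foo.written_frequently1 = value;
    pthread_mutex_unlock(&g_foo_lock);
}

int get_written_frequently1(void)
{
    pthread_mutex_lock(&g_foo_lock);
    int value = g_foo.written_frequently1;
    pthread_mutex_unlock(&g_foo_lock);
    return value;
}

/* Read-only field: never written after initialization, so no lock. */
int get_read_only(void)
{
    return g_foo.read_only;
}
```

The point of the function layer is that callers cannot forget the lock, and the locking policy can change later without touching call sites.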

只有影子陪我不离不弃 2024-07-16 03:28:58

Adding to previous answers:

  1. In this case the natural synchronization paradigm is mutual exclusion, not semaphores.
  2. I agree that you don't need any mutex on readonly variables.
  3. If the read-write part of the structure has consistency constraints, in general you will need one mutex for all of them, in order to keep the operations atomic.

咆哮 2024-07-16 03:28:58

Many thanks to all the great answerers (and for all the great answers).

To sum up:

If there is a read-only member of a struct (in our case, if the value is set once, long before any thread might want to read it), then threads reading this member do not need locks, mutexes, semaphores, or any other concurrency protection.

This is true even if the other members are written to frequently. The fact that the different variables are all part of the same struct makes no difference.
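
For the struct bar variant from the edit, a small layout check can make the "same struct makes no difference" point concrete. It assumes C11 (for _Static_assert) and picks an arbitrary LONGISH_LEN of 64, since the question does not give a value; scribble_neighbours is a made-up helper:

```c
#include <stddef.h>
#include <string.h>

#define LONGISH_LEN 64   /* assumed value; the question leaves it unspecified */

struct bar {
    char written_frequently1[LONGISH_LEN];
    char read_only[LONGISH_LEN];
    char written_frequently2[LONGISH_LEN];
};

/* The three arrays occupy disjoint byte ranges inside the struct. */
_Static_assert(offsetof(struct bar, read_only) >=
               offsetof(struct bar, written_frequently1) + LONGISH_LEN,
               "read_only starts after written_frequently1 ends");
_Static_assert(offsetof(struct bar, written_frequently2) >=
               offsetof(struct bar, read_only) + LONGISH_LEN,
               "written_frequently2 starts after read_only ends");

/* Overwrite both neighbours; no byte of read_only is touched. */
void scribble_neighbours(struct bar *b, char fill)
{
    memset(b->written_frequently1, fill, sizeof b->written_frequently1);
    memset(b->written_frequently2, fill, sizeof b->written_frequently2);
}
```

This only demonstrates the layout, not threading. In C11 terms, ordinary (non-bit-field) members are separate memory locations, which is why writers to the neighbours do not race with readers of read_only. Adjacent bit-fields are the exception, and the written members still need their own protection among themselves.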
