Is the C# "lock" construct rendered obsolete by Interlocked.CompareExchange<T>?

Posted on 2024-08-05 09:30:25

Summary:

It seems to me that:

  1. wrapping fields representing a logical state into a single immutable consumable object
  2. updating the object's authoritative reference with a call to Interlocked.CompareExchange<T>
  3. and handling update failures appropriately

provides a kind of concurrency that renders the "lock" construct not only unnecessary, but a truly misleading construct that dodges certain realities about concurrency and introduces a host of new problems as a result.

Problem Discussion:

First, let's consider the main problems with using a lock:

  1. Locks cause a performance hit, and must be used in tandem for reading and writing.
  2. Locks block thread execution, hindering concurrency and risking deadlocks.

Consider the ridiculous behavior inspired by the "lock". When the need arises to update a logical set of resources concurrently, we "lock" the set of resources, and we do so via a loosely associated, but dedicated locking object, which otherwise serves no purpose (red flag #1).

We then use the "lock" pattern to mark-off a region of code where a logically consistent state change on a SET of data fields occurs, and yet we shoot ourselves in the foot by mixing the fields with unrelated fields in the same object, while leaving them all mutable and then forcing ourselves into a corner (red flag #2) where we have to also use locks when reading these various fields, so we don't catch them in an inconsistent state.

Clearly, there's a serious problem with that design. It's somewhat unstable, because it requires careful management of the lock objects (locking order, nested locks, coordination among threads, blocking/waiting on a resource in use by another thread that's waiting for you to do something, etc.), which depends on the context. We also hear people talk about how avoiding deadlock is "hard", when it's actually very straightforward: don't steal the shoes of a person you plan on asking to run a race for you!

Solution:

Stop using "lock" altogether. Properly roll your fields into an incorruptible/immutable object representing a consistent state or schema. Perhaps it's simply a pair of dictionaries for converting to and from display-names and internal-identifiers, or maybe it's a head node of a queue containing a value and a link to the next object; whatever it is, wrap it into its own object and seal it for consistency.
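As a rough illustration of the "pair of dictionaries" case, here is a minimal sketch. The names NameMap, IdByName, NameById and With are hypothetical, and it assumes the System.Collections.Immutable package is available (it ships with modern .NET; on .NET Framework it is a separate NuGet package).

```csharp
using System.Collections.Immutable;

// Hypothetical example type: an immutable snapshot of a two-way name/id mapping.
// Both dictionaries are set once in the constructor and never change afterwards.
public sealed class NameMap
{
    public static readonly NameMap Empty = new NameMap(
        ImmutableDictionary<string, int>.Empty,
        ImmutableDictionary<int, string>.Empty);

    public ImmutableDictionary<string, int> IdByName { get; }
    public ImmutableDictionary<int, string> NameById { get; }

    private NameMap(ImmutableDictionary<string, int> idByName,
                    ImmutableDictionary<int, string> nameById)
    {
        IdByName = idByName;
        NameById = nameById;
    }

    // Returns a new snapshot containing the extra pair; this instance is untouched.
    // Add throws ArgumentException if a key is already mapped to a different value,
    // so a half-updated pair of dictionaries can never be observed.
    public NameMap With(string name, int id)
    {
        return new NameMap(IdByName.Add(name, id), NameById.Add(id, name));
    }
}
```

A reader that holds a NameMap reference always sees the two dictionaries in agreement, because no code path can change one without creating a whole new object.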

Recognize write or update failure as a possibility, detect it when it occurs, and make a contextually informed decision to retry immediately (or later) or do something else instead of blocking indefinitely.

While blocking seems like a simple way to queue a task that seems like it must be done, not all threads are so dedicated and self-serving that they can afford to do such a thing at the risk of compromising the entire system. Not only is it lazy to serialize things with a "lock", but as a side effect of trying to pretend a write shouldn't fail, you block/freeze your thread, so it sits there unresponsive and useless, forsaking all other responsibilities in its stubborn wait to accomplish what it set out to do some time earlier, ignorant of the fact that assisting others is sometimes necessary for fulfilling its own responsibilities.

Race conditions are normal when independent, spontaneous actions are occurring simultaneously, but unlike uncontrolled Ethernet collisions, as programmers we have total control over our "system" (i.e. deterministic digital hardware) and its inputs (no matter how random, and how random can a zero or one really be?) and outputs, and the memory that stores our system's state, so livelock should be a non-issue; furthermore, we have atomic operations with memory barriers that resolve the fact that there may be many processors operating concurrently.

To summarize:

  1. Grab the current state object, consume its data, and construct a new state.
  2. Realize that other active threads will be doing the very same thing, and may beat you to it, but all observe an authoritative reference point representing the "current" state.
  3. Use Interlocked.CompareExchange to simultaneously see if the state object you based your work on is still the most current state, and replace it with your new one, otherwise fail (because another thread finished first) and take appropriate corrective action.
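The three steps above can be sketched roughly as follows. This is only a minimal sketch: Tally, SharedTally and Increment are hypothetical names chosen for illustration, and the "corrective action" shown here is simply to retry immediately, which is one of several possible choices.

```csharp
using System.Threading;

// Hypothetical immutable state object: never modified after construction.
public sealed class Tally
{
    public int Count { get; }
    public Tally(int count) { Count = count; }
}

public sealed class SharedTally
{
    // The authoritative reference to the "current" state.
    private Tally current = new Tally(0);

    public Tally Current => Volatile.Read(ref current);

    public void Increment()
    {
        while (true)
        {
            Tally seen = Volatile.Read(ref current);   // 1. grab the current state
            Tally next = new Tally(seen.Count + 1);    // 1. construct a new state from it
            // 3. swap in 'next' only if 'current' is still the object we read
            Tally prior = Interlocked.CompareExchange(ref current, next, seen);
            if (ReferenceEquals(prior, seen)) return;  // our update was published
            // 2. another thread got there first: redo the work against its state
        }
    }
}
```

Interlocked.CompareExchange returns the value that was in the location before the call, so comparing that return value with the snapshot is how the caller learns whether the swap happened.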

The most important part is how you handle the failure and get back on your horse. This is where we avoid livelocks, thinking too much and not doing enough or doing the right thing. I would say that locks create the illusion that you'll never fall off your horse, despite riding in a stampede, and while a thread daydreams in such a fantasy land, the rest of the system can fall apart and crash and burn.


So, is there something the "lock" construct can do that can't be achieved (better, in a less unstable fashion) with a lock-free implementation using CompareExchange and immutable logical state objects?

All of this is a realization I've come to on my own after dealing with locks intensely, but after some searching, in another thread, "Is lock free multithreaded programming making anything easier?", someone mentions that lock-free programming is going to be very important when we face highly parallel systems with hundreds of processors, where we cannot afford to use highly contended locks.

Comments (6)

长途伴 2024-08-12 09:30:25

Your compare-exchange-suggestion has one big drawback - it is not fair because it favors short tasks. If there are many short tasks in a system, chances for a long task to ever complete may get very low.

七月上 2024-08-12 09:30:25

There are four conditions for a race to occur.

  1. The first condition is that there are memory locations that are accessible from more than one thread. Typically, these locations are global/static variables, or are heap memory reachable from global/static variables.
  2. The second condition is that there is a property (often called an invariant), which is associated with these shared memory locations that must be true, or valid, for the program to function correctly. Typically, the property needs to hold true before an update occurs for the update to be correct.
  3. The third condition is that the invariant property does not hold during some part of the actual update. (It is transiently invalid or false during some portion of the processing).
  4. The fourth and final condition that must occur for a race to happen is that another thread accesses the memory while the invariant is broken, thereby causing inconsistent or incorrect behavior.

    1. If you don't have a shared memory location which is accessible from multiple threads, or you can write your code to either eliminate that shared memory variable, or restrict access to it to only one thread, then there is no possibility of a race condition, and you don't need to worry about anything. Otherwise, the lock statement, or some other synchronization routine is absolutely necessary and cannot be safely ignored.

    2. If there is no invariant (let's say all you do is write to this shared memory location and nothing in the thread operation reads its value) then again, there is no problem.

    3. If the invariant is never invalid, again no problem. (Say the shared memory is a datetime field storing the datetime of the last time the code ran; then it can't be invalid unless a thread fails to write it at all...)

    4. To eliminate condition 4, you have to restrict write access to the block of code that accesses the shared memory from more than one thread at a time, using lock or some comparable synchronization methodology.

The "concurrency hit" is in this case not only unavoidable but absolutely necessary. Intelligent analysis of what exactly is the shared memory, and what exactly is your critical "invariant", allows you to code the system to minimize this concurrency "hit" (i.e., maximize concurrency safely).

终陌 2024-08-12 09:30:25

I'd like to know how you would perform this task using your lock-free programming style. You have a number of worker threads all periodically hitting a shared list of tasks for the next job to perform. Currently, they lock the list, find the item at the head, remove it, and unlock the list. Please take into account all error conditions and possible data races so that no two threads can end up working on the same task, and no task is accidentally skipped.

I suspect that the code to do this may suffer from a problem of over-complexity and have a possibility of poor performance in the case of high contention.
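For reference, one possible shape of such a task list in the style the question proposes is sketched below. It is a simplification under stated assumptions: WorkList, Add and TryTake are hypothetical names, and the list hands out the most recently added item first (a stack), because a lock-free first-in-first-out queue is considerably more involved. The framework's ConcurrentQueue<T> already provides a thread-safe FIFO if ordering matters.

```csharp
using System.Threading;

// Hypothetical work list: an immutable singly linked list whose head is swapped with CAS.
public sealed class WorkList<T>
{
    // Nodes are never modified or reused after construction.
    private sealed class Node
    {
        public readonly T Item;
        public readonly Node Next;
        public Node(T item, Node next) { Item = item; Next = next; }
    }

    private Node head; // null means the list is empty

    public void Add(T item)
    {
        while (true)
        {
            Node seen = Volatile.Read(ref head);
            Node next = new Node(item, seen);
            // Publish the new head only if nobody changed it since we read it.
            if (Interlocked.CompareExchange(ref head, next, seen) == seen) return;
        }
    }

    public bool TryTake(out T item)
    {
        while (true)
        {
            Node seen = Volatile.Read(ref head);
            if (seen == null) { item = default(T); return false; }
            // Only one thread can win the swap for a given head node,
            // so each item is handed out at most once and none is skipped.
            if (Interlocked.CompareExchange(ref head, seen.Next, seen) == seen)
            {
                item = seen.Item;
                return true;
            }
        }
    }
}
```

Because every Add allocates a fresh node and the garbage collector keeps a node alive while any thread still references it, a removed node cannot reappear as the head, which sidesteps the classic ABA problem in this particular sketch. It does nothing about the thrashing under high contention that this comment worries about.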

青衫负雪 2024-08-12 09:30:25

The big advantage of a lock over a CAS operation like Interlocked.CompareExchange is that you can modify multiple memory locations within a lock and all the modifications will be visible to other threads / processes at the same time.

With CAS, only a single variable is atomically updated. Lockfree code is usually significantly more complex because not only can you only present the update of a single variable (or two adjacent variables with CAS2) to other threads at a time, you also have to be able to handle "fail" conditions when CAS doesn't succeed. Plus you need to handle ABA issues and other possible complications.
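To make the multi-location point concrete, here is a minimal sketch of a lock guarding two fields at once. The class Accounts and its members are hypothetical, invented only to show the pattern: the invariant is that the two balances always add up to the same total.

```csharp
// Hypothetical example: two separate fields updated together under one lock.
public sealed class Accounts
{
    private readonly object gate = new object();
    private int checking = 100;
    private int savings = 0;

    public void Transfer(int amount)
    {
        lock (gate)
        {
            // Between these two statements the invariant is broken,
            // but no other thread can observe that while we hold the lock.
            checking -= amount;
            savings += amount;
        }
    }

    public int Total()
    {
        // Readers take the same lock so they never see a half-finished transfer.
        lock (gate) { return checking + savings; }
    }
}
```

Doing the same with a single CompareExchange would require both balances to live inside one immutable object, which is exactly the restructuring the question proposes.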

There are a variety of methods like low locking, fine grain locking, striped locks, reader-writer locks, etc. that can make simple locking code much more multicore friendly.

That said, there are plenty of interesting uses for both locking and lockfree code. However, unless you REALLY know what you're doing, creating your own lockfree code is not for a beginner. Use either lockfree code or algorithms that have been well proven, and test them thoroughly, because the edge conditions that cause failure in many lockfree attempts are very hard to find.

童话 2024-08-12 09:30:25

I would say it is no more obsolete than saying, in general, that pessimistic concurrency is obsolete given optimistic concurrency, or that pattern A is obsolete because of pattern B. I think it's about context. Lock-free is powerful, but there may not be a point in applying it unilaterally because not every problem is perfectly suited to this. There are trade-offs. That said, it would be good to have a general purpose lockless, optimistic approach where it hasn't been realized traditionally. In short, yes, lock can do something that can't be achieved with the other approach: represent a potentially simpler solution. Then again, it may be that both have the same result, if certain things don't matter. I suppose I'm equivocating just a bit...

紫轩蝶泪 2024-08-12 09:30:25

In theory, if there is a fixed amount of work to be done, a program which uses Interlocked.CompareExchange will manage to do it all without locking. Unfortunately, in the presence of high contention, a read/compute-new/compareExchange loop can end up thrashing so badly that 100 processors each trying to perform one update to a common data item may end up taking longer--in real time--than would a single processor performing 100 updates in sequence. Parallelism wouldn't improve performance--it would kill it. Using a lock to guard the resource would mean that only one CPU at a time could update it, but would improve performance to match the single-CPU case.

The one real advantage of lock-free programming is that system functionality will not be adversely affected if a thread gets waylaid for an arbitrary amount of time. One can maintain that advantage while avoiding the performance pitfalls of purely-CompareExchange-based programming by using a combination of locks and timeouts. The basic idea is that in the presence of contention, the resource switches to lock-based synchronization, but if a thread holds a lock for too long, a new lock object will be created and the earlier lock will be ignored. This will mean that if that former thread was still trying to do a CompareExchange cycle, it will fail (and have to start all over again), but later threads will not be blocked nor will correctness be sacrificed.
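The full scheme described here (replacing a stale lock object and invalidating the slow thread's work) is beyond a short example, but the timeout half of it can be sketched with Monitor.TryEnter. TimedGate and TryRun are hypothetical names, and what the caller does when the lock is not obtained in time is left open on purpose.

```csharp
using System;
using System.Threading;

// Hypothetical helper: run work under a lock, but give up after a timeout
// instead of blocking indefinitely.
public sealed class TimedGate
{
    private readonly object gate = new object();

    // Returns true if 'action' ran under the lock, false if the lock
    // could not be acquired within 'timeout'.
    public bool TryRun(Action action, TimeSpan timeout)
    {
        bool lockTaken = false;
        try
        {
            Monitor.TryEnter(gate, timeout, ref lockTaken);
            if (!lockTaken) return false; // caller decides: retry, fall back, or do other work
            action();
            return true;
        }
        finally
        {
            // Release only if we actually acquired the lock.
            if (lockTaken) Monitor.Exit(gate);
        }
    }
}
```

This keeps a waiting thread from freezing forever, but on its own it does not protect against a lock holder that never finishes; that is the part the lock-replacement idea in this comment is meant to address.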

Note that the code required to arbitrate all of the above will be complicated and tricky, but if one wants to have a system be robust in the presence of certain fault conditions, such code may be required.
