Safety of unlocked access to stl vector::size

Posted 2024-08-02 05:22:21

I have several writers (threads) and one reader on an STL vector.

Normal writes and reads are mutex protected, but I would like to avoid contention on a loop I have, and I was wondering whether vector::size would be safe enough. I suppose it depends on the implementation, but since the vector's dynamically allocated memory normally holds only the stored items, the memory where the size is kept shouldn't be invalidated during reallocations.

I don't mind having false positives: after seeing size > 0 I'll actually lock and check again, so as long as reading size() while another thread writes never segfaults, it should be safe enough for me.
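A minimal sketch of the pattern being described, with made-up names (shared_vec, shared_mutex_, produce, try_consume): the reader peeks at size() without the lock and only takes the lock when the peek suggests there is work. Whether that unlocked peek is actually safe is what the answers below discuss.

```cpp
#include <mutex>
#include <vector>

std::vector<int> shared_vec;    // hypothetical shared container
std::mutex       shared_mutex_; // guards shared_vec

// Writer: always takes the lock before modifying the vector.
void produce(int value) {
    std::lock_guard<std::mutex> lk(shared_mutex_);
    shared_vec.push_back(value);
}

// Reader loop body: unlocked "peek" at size(), then lock and re-check.
bool try_consume(int& out) {
    if (shared_vec.size() == 0)      // unlocked read: the questionable part
        return false;
    std::lock_guard<std::mutex> lk(shared_mutex_);
    if (shared_vec.empty())          // re-check under the lock
        return false;
    out = shared_vec.back();
    shared_vec.pop_back();
    return true;
}
```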


2 Answers

空城之時有危險 2024-08-09 05:22:21

I don't know off-hand of an implementation in which concurrent reads and writes to an integer segfault (although the C++03 standard does not prohibit that, and I don't know whether POSIX does). If the vector uses pImpl, and doesn't store the size in the vector object itself, you could maybe have problems where you try to read the size from a pImpl object which has been freed in another thread. For example, GCC on my machine does use a pImpl (and doesn't store the size directly - it's calculated as the difference between begin() and end(), so there are obvious opportunities there for race conditions during modification); a simplified illustration follows.
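A toy model of that layout, not the real libstdc++ code, just to show why a size derived from two pointers can race:

```cpp
#include <cstddef>

// Simplified model of a vector whose size() is computed from two pointers
// rather than stored directly.
struct ToyVec {
    int* start  = nullptr;   // beginning of the allocation
    int* finish = nullptr;   // one past the last element
    std::size_t size() const { return finish - start; }  // two separate reads
};

// If a writer is mid-reallocation, an unlocked reader may load `finish` from
// the new buffer and `start` from the old one (or vice versa), so the
// subtraction can yield an arbitrary, meaningless value.
```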

Even if it doesn't crash, though, it might very well give a meaningless or wrong answer. If you don't lock then the value you read could for example be:

  • read non-atomically, meaning you get the most significant half of one value and the least significant half of another. In practice, reading size_t is probably atomic on most implementations, since there are good reasons for size_t to be the natural word size of the architecture. But if it happens, this could read a value as 0 when neither the "before" nor the "after" was 0. Consider for example the transition 0x00FF -> 0x0100. If you get the bottom half of the "after" and the top half of the "before", you've read 0.

  • arbitrarily stale. Without locking (or some other memory barrier), you could get a value out of a cache. If that cache is not shared with other CPUs/cores, and if your architecture does not have so-called 'coherent caches', then a different CPU or core running a different thread could have changed the size six weeks ago, and you will never see the new value. Furthermore, different addresses might be different amounts stale - without memory barriers, if another thread has done a push_back you could conceivably "see" the new value at the end of your vector but not "see" the increased size. (A sketch after this list shows how an atomic counter sidesteps both problems.)
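Both failure modes above are what a dedicated atomic counter rules out. A minimal sketch, assuming C++11 std::atomic is available (this answer predates C++11, which it calls C++0x below); vec_, mtx_, and count_ are made-up names:

```cpp
#include <atomic>
#include <cstddef>
#include <mutex>
#include <vector>

std::vector<int>         vec_;        // shared vector, guarded by mtx_
std::mutex               mtx_;
std::atomic<std::size_t> count_{0};   // mirrors vec_.size(), safe to read unlocked

void writer_push(int v) {
    std::lock_guard<std::mutex> lk(mtx_);
    vec_.push_back(v);
    count_.store(vec_.size(), std::memory_order_release);  // publish new size
}

bool reader_poll(int& out) {
    if (count_.load(std::memory_order_acquire) == 0)  // atomic: never torn,
        return false;                                  // never indefinitely stale
    std::lock_guard<std::mutex> lk(mtx_);
    if (vec_.empty())                                  // re-check under the lock
        return false;
    out = vec_.back();
    vec_.pop_back();
    count_.store(vec_.size(), std::memory_order_release);
    return true;
}
```

The unlocked load of count_ is only a hint; the actual vector access still happens under the mutex, which matches the "lock and check again" plan in the question.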

A lot of these problems are hidden on common architectures. For instance, x86 has coherent caches, and MSVC guarantees full memory barriers when accessing volatile objects. ARM doesn't guarantee coherent caches, but in practice multi-core ARM isn't that common, so double-checked locking normally works there too.

These guarantees solve some difficulties and allow some optimisations, which is why they're made in the first place, but they're not universal. Obviously you can't write multi-threaded code at all without making some assumptions beyond the C++ standard, but the more vendor-specific guarantees you rely on, the less portable your code is. It's not possible to answer your question other than with reference to a particular implementation.

If you're writing portable code, then really you should think of all memory reads and writes as potentially being to your thread's own private memory cache. Memory barriers (including locks) are a means to "publish" your writes and/or "import" writes from other threads. The analogy with version-control systems (or your favourite other example of local copies of anything) is clear, with the difference that things might be published/imported at any time, even if you don't ask them to be. And of course there's no merging or conflict detection, unless the industry has finally implemented transactional memory while I wasn't looking ;-)
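To make the publish/import analogy concrete, here is a minimal sketch using a C++11 release/acquire pair (the names payload and ready are invented for illustration): the plain store to payload is "published" by the release store to ready, and the reader "imports" it with the acquire load.

```cpp
#include <atomic>

int               payload = 0;       // ordinary, non-atomic data
std::atomic<bool> ready{false};

void publisher() {
    payload = 42;                                   // write the data...
    ready.store(true, std::memory_order_release);   // ...then publish it
}

void consumer() {
    if (ready.load(std::memory_order_acquire)) {    // import the writer's writes
        // The acquire load synchronizes-with the release store above,
        // so payload is guaranteed to be 42 here.
        int v = payload;
        (void)v;
    }
}
```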

In my opinion, multi-threaded code should first avoid shared memory, then lock if absolutely necessary, then profile, then worry about contention and lock-free algorithms. Once you get to the last stage, you need to research and follow well-tested principles and patterns for your particular compiler and architecture. C++0x will help somewhat by standardising some things you can rely on, and some of Herb Sutter's "Effective Concurrency" series goes into detail about how to make use of these guarantees. One of the articles has an implementation of a lock-free multi-writer single-reader queue, which may or may not be adaptable for your purposes.

夜还是长夜 2024-08-09 05:22:21

It sounds like you have a producer/consumer problem and you're polling the size of the vector as a flag to consume. It would be better to use a semaphore to control your reader trying to do work, and a mutex to control access to the vector. When the vector is empty, have the reader block on the semaphore until a writer puts something in the vector and increments the semaphore. Then both reader and writers use the mutex to modify the vector itself. Never poll if you can block.
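A minimal sketch of that producer/consumer arrangement, assuming C++20's std::counting_semaphore is available (standard C++ had no semaphore before C++20; a condition variable is the usual substitute on older compilers). The names queue_, mtx_, and items_ are illustrative:

```cpp
#include <mutex>
#include <semaphore>
#include <vector>

std::vector<int>          queue_;     // hypothetical shared vector
std::mutex                mtx_;       // protects queue_
std::counting_semaphore<> items_{0};  // counts elements available to consume

// Writer: modify the vector under the mutex, then increment the semaphore.
void produce(int value) {
    {
        std::lock_guard<std::mutex> lk(mtx_);
        queue_.push_back(value);
    }
    items_.release();   // wake the reader instead of letting it poll
}

// Reader: block on the semaphore, then take one element under the mutex.
int consume() {
    items_.acquire();   // blocks while the vector is empty
    std::lock_guard<std::mutex> lk(mtx_);
    int value = queue_.back();
    queue_.pop_back();
    return value;
}
```

Blocking on the semaphore burns no CPU while the vector is empty, which is the point of "never poll if you can block".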
