Suspicion of a multithreading race condition in C++: virtual calls with a vtable implementation

Posted 2024-09-08 12:43:23

I have a suspicion that there might be a race condition in a certain C++ multithreading situation involving virtual method calls in a vtable dynamic dispatching implementation (for which a vtable pointer is stored as a hidden member in the object with virtual methods). I would like to confirm whether or not this is actually an issue, and I am specifying boost's threading library so we can assume some frame of reference.

Suppose an object "O" has a boost::mutex member on which the entirety of its constructor/destructor and methods are scope-locked (similar to the Monitor concurrency pattern). A thread "A" constructs an object "O" on the heap without external synchronization (i.e. WITHOUT a shared mutex enclosing the "new" operation, through which it could synchronize with other threads; note, though, that there is still the "internal, Monitor" mutex locking the scope of its constructor). Thread A then passes a pointer to the "O" instance (which it just constructed) to another thread "B", by means of a synchronized mechanism--for instance, a synchronized readers-writers queue (note: only the pointer to the object is being passed--not the object itself). After construction, neither thread "A" nor any other thread performs any write operation on the "O" instance which "A" constructed.

The thread "B" reads the pointer value of the object "O" from the synchronized queue, after which it immediately leaves the critical section guarding the queue. Then the thread "B" performs a virtual method call on the object "O." Here is where I think an issue may arise.

Now, my understanding of virtual method calls in a [quite probable] vtable implementation of dynamic dispatching is that the calling thread "B" must dereference the pointer to "O" in order to obtain the vtable pointer stored as a hidden member of the object, and that this happens BEFORE the method body is entered (naturally, because the method body to execute cannot be safely and accurately determined until the vtable pointer stored in the object itself has been read). Assuming the aforementioned statements are possibly true for such an implementation, is this not a race condition?

Since the vtable pointer is retrieved by thread "B" (by dereferencing the pointer to the object "O" located in the heap) prior to any memory-visibility-guaranteeing operation taking place (i.e. acquiring the member variable mutex in the object "O"), it is not certain that "B" will perceive the vtable pointer value that "A" originally wrote during the construction of "O", correct? (i.e. it may instead perceive a garbage value, resulting in undefined behavior, correct?)

If the above is a valid possibility, does this not imply that making virtual method calls on exclusively internally synchronized objects that are shared between threads is undefined behavior?

And--likewise--since the standard is agnostic to a vtable implementation, how could one ever guarantee that the vtable pointer is safely visible to other threads prior to a virtual call? I suppose one could externally synchronize ("externally" as in, for instance, "surrounding with a shared mutex lock()/unlock() block") the constructor call and then at least the initial virtual method call in each of the threads, but this seems like some awfully discordant programming.

So, if my suspicions are true, then a possibly more elegant solution would be to use inlined, non-virtual member functions which lock the member mutex and then subsequently forward to a virtual call. But--even then--could we guarantee that the constructor initialized the vtable pointer within the confines of the lock() and unlock() guarding the constructor body itself?
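The forwarding idea in the previous paragraph could look roughly like the sketch below. It is only an illustration under stated assumptions: it uses std::mutex from C++11 in place of boost::mutex so that it is self-contained, the mutex has to live in the base class so the non-virtual wrapper can reach it, and the names LockedInterface, call, doCall, and Counter are hypothetical.

```cpp
#include <mutex>

// Hypothetical base class: the public entry point is non-virtual, takes the
// lock first, and only then performs the virtual dispatch.
class LockedInterface
{
public:
    virtual ~LockedInterface() {}

    void call()
    {
        std::lock_guard<std::mutex> lock(mutex_); // acquire the member mutex first
        doCall();                                 // the virtual dispatch happens under the lock
    }

private:
    virtual void doCall() = 0;
    std::mutex mutex_;
};

// Hypothetical derived class, used only to exercise the wrapper.
class Counter : public LockedInterface
{
public:
    Counter() : count_(0) {}

    // Unsynchronized read: only meaningful once no other thread is in call().
    int count() const { return count_; }

private:
    virtual void doCall() { ++count_; }
    int count_;
};
```

Callers would write `counter.call()` and never invoke `doCall()` directly. This sketch does not by itself settle the follow-up question about where the constructor writes the vtable pointer.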

If someone could help me clear this up and confirm/deny my suspicions, I would be very grateful.

EDIT: code demonstrating the above

#include <boost/thread/mutex.hpp>
#include <boost/thread/locks.hpp>

class Interface
{
    public:
    virtual ~Interface() {}
    virtual void dynamicCall() = 0;
};

class Monitor : public Interface
{
    boost::mutex mutex;
    public:
    Monitor()
    {
        boost::unique_lock<boost::mutex> lock(mutex);
        // initialize
    }
    virtual ~Monitor()
    {
        boost::unique_lock<boost::mutex> lock(mutex);
        // destroy
    }
    virtual void dynamicCall()
    {
        boost::unique_lock<boost::mutex> lock(mutex);
        // do w/e
    }
};

// for simplicity, the numbers following each statement specify the order of execution, and these two functions are assumed
// void passMonitorToSharedQueue( Interface * monitor )
//        Thread A passes the 'monitor' pointer value to a 
//        synchronized queue, pushes it on the queue, and then 
//        notifies Thread B that a new entry exists
// Interface * getMonitorFromSharedQueue()
//        Thread B blocks until Thread A notifies Thread B
//        that a new 'Interface *' can be retrieved, at which
//        point it retrieves and returns it
void threadBFunc()
{
    // 'if' is a reserved keyword in C++, so the pointer is named 'iface' here
    Interface * iface = getMonitorFromSharedQueue(); // (1)
    iface->dynamicCall(); // (4) (ISSUE HERE?)
}
void threadAFunc()
{
    Interface * monitor = new Monitor; // (2)
    passMonitorToSharedQueue(monitor); // (3)
}

At point (4), I'm under the impression that the vtable pointer value which "Thread A" wrote to memory may not be visible to "Thread B", as I don't see any reason to assume that the compiler will generate code such that the vtable pointer is written within the constructor's locked mutex block.

For instance, consider the situation of multicore systems where each core has a dedicated cache. According to this article, caches are commonly aggressively optimized and--despite forcing cache coherence--do not enforce a strict ordering on cache coherence if there are no synchronization primitives involved.

Perhaps I am misunderstanding the implications of the article, but wouldn't that mean that "A"'s write of the vtable pointer to the constructed object (and there is no indication that this write occurs within the constructor's locked mutex block) may not be perceived by "B" before "B" reads the vtable pointer? If A and B are executed on different cores ("A" on core0 and "B" on core1), the cache coherence mechanism may re-order the update of the vtable pointer value in core1's cache (the update that would make it consistent with the value of the vtable pointer in core0's cache, which "A" wrote) such that it occurs after "B"'s read ... if I'm interpreting the article correctly.


我不是你的备胎 2024-09-15 12:43:23

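I don't fully understand the question, but I think there are two possibilities:

A) "O" is fully constructed (the constructor has returned) before it is passed through the synchronized queue to "B". In this case there is no problem, because the object is fully constructed, including the vtable pointer. The memory at that location will hold the vtable pointer, since everything is within one process.

B) "O" is not yet fully constructed, but, for example, you pass this from the constructor into the synchronized queue. In this case the vtable pointer must still have been set before the constructor body runs in thread "A", because it is valid to call virtual functions from a constructor; such a call just invokes the current class's version of the method, not the most derived one. So I would not expect to see a race condition in this case either. If you really are passing this to another thread from within the constructor, you may want to reconsider your approach, since making calls on an object that is not fully constructed does seem dangerous.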
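The rule relied on in case B, that a virtual call made from a constructor resolves to the version of the class currently being constructed, can be checked with a small sketch. The class names Base and Derived and the member seenInCtor are hypothetical and exist only for this illustration.

```cpp
#include <string>

struct Base
{
    std::string seenInCtor;

    // While Base's constructor runs, the dynamic type of the object is Base,
    // so this call resolves to Base::name() even when a Derived is being built.
    Base() { seenInCtor = name(); }

    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base
{
    virtual std::string name() const { return "Derived"; }
};
```

After construction has finished, a call through a `Base&` that refers to a `Derived` reaches `Derived::name()` as usual.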


江城子 2024-09-15 12:43:23


If I try to understand your essay, I believe you are asking this:-

A thread "A" constructs an object "O" on the heap without external synchronization

// global namespace
SomeClass* pClass = new SomeClass;

At the same time you are saying that thread-'A' passes the above instance to thread-'B'. Does this mean that the instance of SomeClass is fully constructed, or are you trying to pass the this pointer from the ctor of SomeClass to thread-'B'? If the latter, you are definitely in trouble w.r.t. virtual functions. But this has nothing to do with race conditions.

If you are accessing the global instance variable in thread-'B' without thread-'A' passing it, then there is a possibility of race conditions. Most compilers lay out the 'new' expression roughly like this:

pClass = // Step 3
operator new(sizeof(SomeClass)); // Step 1
new (pClass ) SomeClass; // Step 2

If only Step-1 is complete, or if only Step-1 and Step-2 are complete then accessing pClass is undefined.
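The three steps above can be written out by hand as in the sketch below, which may make the ordering easier to see. It is an illustration only: the body of SomeClass (a single value member) and the helper names constructInSteps and destroy are assumptions, and a real compiler is not obliged to emit exactly this sequence.

```cpp
#include <new>

struct SomeClass
{
    int value;
    SomeClass() : value(42) {}
};

SomeClass* pClass = 0; // global pointer, as in the answer

void constructInSteps()
{
    void* raw = operator new(sizeof(SomeClass)); // Step 1: allocate raw storage
    SomeClass* obj = new (raw) SomeClass;        // Step 2: run the constructor in that storage
    pClass = obj;                                // Step 3: publish the pointer
}

void destroy()
{
    pClass->~SomeClass();    // undo Step 2
    operator delete(pClass); // undo Step 1
    pClass = 0;
}
```

A thread that reads the global `pClass` with no synchronization has no guarantee about which of these steps it observes, which is the hazard the answer describes.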

HTH


情域 2024-09-15 12:43:23

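In a shared-memory multiprocessor system with implicit caching, you need a memory barrier to make changes to main memory visible to other caches. In general, you can assume that acquiring or releasing any OS synchronization primitive (and anything built on top of one) involves a full memory barrier, so that all writes that occur before you acquire (or release) the primitive are visible to all processors after you acquire (or release) it.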
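Applied to the code in the question, the barrier argument looks roughly like the sketch below. It is written under assumptions: C++11 std::mutex, std::condition_variable, and std::thread stand in for the Boost equivalents, SharedQueue is a hypothetical minimal queue, and dynamicCall returns an int (the question's version returns void) so that the result can be checked. In the C++11 memory model, unlocking a mutex synchronizes with the next lock of the same mutex, so everything thread A wrote before push(), including whatever the constructor stored in the object, is visible to thread B once pop() returns.

```cpp
#include <condition_variable>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>

class Interface
{
public:
    virtual ~Interface() {}
    virtual int dynamicCall() = 0;
};

class Monitor : public Interface
{
    std::mutex mutex_;
public:
    virtual int dynamicCall()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return 42;
    }
};

// Hypothetical minimal queue: push() and pop() lock the same mutex.
class SharedQueue
{
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<Interface*> items_;
public:
    void push(Interface* p)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(p);
        } // unlocking here is a release operation
        ready_.notify_one();
    }

    Interface* pop()
    {
        std::unique_lock<std::mutex> lock(mutex_); // locking is an acquire operation
        ready_.wait(lock, [this] { return !items_.empty(); });
        Interface* p = items_.front();
        items_.pop();
        return p;
    }
};

int runHandoff()
{
    SharedQueue queue;
    int result = 0;
    std::thread b([&] {
        std::unique_ptr<Interface> p(queue.pop()); // (1)
        result = p->dynamicCall();                 // (4)
    });
    std::thread a([&] {
        queue.push(new Monitor);                   // (2) then (3)
    });
    a.join();
    b.join();
    return result;
}
```

The `new Monitor` expression is fully evaluated before `push()` is entered, so the object is complete before the queue's mutex is released.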

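For your specific problem, you have a memory barrier inside Monitor::Monitor(), so by the time it returns the vtable pointer will have been initialized at least to Monitor's vtable. There might be a problem if you derived from Monitor, but in the code you posted you don't, so it is not an issue.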

If you really want to be sure you get the correct vtable after calling getMonitorFromSharedQueue(), you should have a read barrier before calling dynamicCall().
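An explicit barrier of this kind only matters when the pointer is handed over without a lock. One way to express the pairing in C++11 is a release store matched by an acquire load, as in the sketch below. Everything in it is an assumption made for illustration: the types Shape and Triangle, the global slot published, and the function publishAndRead are hypothetical and not part of the question's code.

```cpp
#include <atomic>
#include <thread>

struct Shape
{
    virtual ~Shape() {}
    virtual int sides() const = 0;
};

struct Triangle : Shape
{
    virtual int sides() const { return 3; }
};

std::atomic<Shape*> published(nullptr); // hypothetical hand-off slot

int publishAndRead()
{
    int result = 0;
    std::thread reader([&] {
        Shape* p;
        // Acquire load: pairs with the release store in the writer, so all of
        // the writer's earlier writes are visible once a non-null value is seen.
        while ((p = published.load(std::memory_order_acquire)) == nullptr)
            std::this_thread::yield();
        result = p->sides(); // virtual call on the fully visible object
    });
    std::thread writer([] {
        Shape* p = new Triangle;                       // construct completely first
        published.store(p, std::memory_order_release); // then publish the pointer
    });
    writer.join();
    reader.join();
    delete published.exchange(nullptr);
    return result;
}
```

With a mutex-protected queue, as in the question, the lock and unlock operations already provide this pairing.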