使用 C/C++ 乐观读取和锁定 STM（软件事务内存）

发布于 2024-12-20 03:30:25 字数 1904 浏览 1 评论 0原文

我一直在对 STM（软件事务内存）实现进行一些研究，特别是关于利用锁且不依赖于垃圾收集器的存在的算法，以保持与 C/C++ 等非托管语言的兼容性。我读过 Herlihy 和 Shavit 的 “多处理器编程的艺术”，并阅读了 Shavit 的几篇描述他的“事务锁定” 和 "事务锁定II” STM 实现。他们的基本方法是利用存储全局版本时钟值的哈希表和锁来确定另一个线程的写入是否已触及内存位置。据我了解该算法，当执行写入事务时，版本时钟被读取并存储在线程本地内存中，并且还在线程本地内存中创建读取集和写入集。然后执行以下步骤：

读取的任何地址的值都存储在读取集中。这意味着事务检查正在读取的任何位置都没有被锁定，并且它们等于或小于本地存储的版本时钟值。
写入的任何地址的值以及要写入这些位置的值都存储在写入集中。
一旦整个写事务完成（这可能包括读取和写入多个位置），事务将尝试使用哈希表中的锁来锁定要写入的每个地址，该哈希表针对该地址进行哈希处理' 价值。
当所有写集地址被锁定时，全局版本时钟自动递增，并且新的递增值被本地存储。
写入事务再次检查以确保读取集中的值尚未使用新版本号更新或未被另一个线程锁定。
写事务使用从步骤 #4 存储的新值更新该内存位置的版本标记，并将写集中的值提交到内存。
内存位置上的锁将被释放

。步骤失败（即步骤#1、#3 和#5），然后写入事务将中止。

读取事务的过程要简单得多。根据Shavit的论文，我们简单地

读取并本地存储全局版本时钟值
检查以确保内存位置没有大于当前存储的全局版本时钟值的时钟值，并确保内存位置当前不存在锁定
执行读取操作
重复步骤 #2 进行验证

如果步骤 #2 或步骤 #4 失败，则读取事务将中止。

但我似乎无法解决的问题是，当您尝试读取位于堆中的对象内的内存位置，并且另一个线程在指针上调用 delete 时会发生什么到那个物体？在 Shavit 的论文中，他们详细解释了如何不能对已回收或释放的内存位置进行写入，但似乎在读取事务内部，没有什么可以阻止可能的计时场景，使您能够从已被另一个线程释放的对象内部的内存位置读取。作为示例，请考虑以下代码：

Thread A 在原子读取事务内部执行以下操作：linked_list_node* next_node = node->next;

Thread B 执行以下操作： delete node;

由于 next_node 是线程局部变量，因此它不是事务对象。尽管实际上需要两次单独的读取，但解引用操作需要为其分配node->next的值。在这些读取之间，可以在 node 上调用 delete，以便从成员 next 读取实际上是从一段内存中读取已经被释放了。由于读取是乐观的，因此在线程 A 中不会检测到线程 B 中 node 所指向的内存的释放。这不会导致可能的崩溃或分段错误吗？如果确实如此，如何在不锁定读取的内存位置的情况下避免这种情况（教科书和论文都指出这是不必要的）？

原文

I've been doing some research on STM (software transactional memory) implementations, specifically on algorithms that utilize locks and are not dependent on the presence of a garbage collector in order to maintain compatibility with non-managed languages like C/C++. I've read the STM chapter in Herlihy and Shavit's "The Art of Multiprocessor Programming", as well as read a couple of Shavit's papers that describe his "Transactional Locking" and "Transactional Locking II" STM implementations. Their basic approach is to utilize a hash-table that stores the values of a global version-clock and a lock to determine if a memory location has been touched by another thread's write. As I understand the algorithm, when a writing transaction is performed, the version-clock is read and stored in thread-local memory, and a read-set and write-set are also created in thread-local memory. Then the following steps are performed:

The values of any addresses read are stored in the read-set. This means that the transaction checks that any locations being read are not locked, and they are equal to or less than the locally stored version clock value.
The values of any addresses written are stored in the write-set, along with the values that are to be written to those locations.
Once the entire write-transaction is complete (and this can include reading and writing to a number of locations), the transaction attempts to lock each address that is to be written to using the lock in the hash-table that is hashed against the address' value.
When all the write-set addresses are locked, the global version clock is atomically incremented and the new incremented value is locally stored.
The write-transaction checks again to make sure that the values in the read-set have not been updated with a new version-number or are not locked by another thread.
The write-transaction updates the version-stamp for that memory location with the new value it stored from step #4, and commits the values in the write-set to memory
The locks on the memory locations are released

If any of the above check-steps fail (i.e., steps #1, #3, and #5), then the write-transaction is aborted.

The process for a read-transaction is a lot simpler. According to Shavit's papers, we simply

Read and locally store the global version-clock value
Check to make sure the memory locations do not have a clock value greater than the currently stored global version-clock value and also make sure the memory locations are not currently locked
Perform the read operations
Repeat step #2 for validation

If either step #2 or #4 fail, then the read-transaction is aborted.

The question that I can't seem to solve in my mind though is what happens when you attempt to read a memory location inside an object that is located in the heap, and another thread calls delete on a pointer to that object? In Shavit's papers, they go into detail to explain how there can be no writes to a memory location that has been recycled or freed, but it seems that inside of a read-transaction, there is nothing preventing a possible timing scenario that would allow you to read from a memory location inside of an object that is has been freed by another thread. As an example, consider the following code:

Thread A executes the following inside of an atomic read-transaction: linked_list_node* next_node = node->next;

Thread B executes the following: delete node;

Since next_node is a thread-local variable, it's not a transactional object. The dereferencing operation required to assign it the value of node->next though actually requires two separate reads. In between those reads, delete could be called on node, so that the read from the member next is actually reading from a segment of memory that has already been freed. Since the reads are optimistic, the freeing of the memory pointed to by node in Thread B won't be detected in Thread A. Won't that cause a possible crash or segmentation fault? If it does, how could that be avoided without locking the memory locations for a read as well (something that both the text-book as well as the papers denotes is unnecessary)?