二维数组的并发向量

发布于 2024-12-13 20:25:21 字数 507 浏览 2 评论 0 原文

我目前正在尝试使用 tbb::concurrent_vector 表示 2D 数组。这个二维数组将被许多不同的线程访问,这就是为什么我希望它能够最有效地处理并行访问。

我想出了两个解决方案:

  • 使用tbb::concurrent_vector; > 来存储它。

  • 将所有内容存储在 tbb::concurrent_vector 中,并使用 x * width + y

    访问元素

我更喜欢第二个,因为我不想锁定一整行来访问一个元素(因为我假设要访问元素 array[x][y],tbb 实现将锁定第 x 行,然后锁定第 y 元素) 。

我想知道哪种解决方案对您来说更好。

I am currently trying to represent a 2D array using tbb::concurrent_vector<T>. This 2d array will be accessed by a lot of different threads and thats why I want it to handle parallel accesses the most efficiently possible.

I came up with 2 solutions:

  • Use a tbb::concurrent_vector<tbb::concurrent_vector<T> > to store it.

  • Store everything in a tbb::concurrent_vector<T> and access elements w/ x * width + y

I had a preference for the second one because I dont want to lock an entire row to access one element (since I assume that to access the element array[x][y], the tbb implementation will lock the xth row and then the yth element).

I would like to know which solution seems better to you.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

别闹i 2024-12-20 20:25:21

首先,我认为关于 tbb::concurrent_vector 可能存在一些混淆。此容器类似于 std::vector ,但具有线程安全的大小调整功能,但由于内部存储布局,元素访问速度较慢。

您可以在此处阅读更多相关信息。

在您的情况下,由于您提出的第二个解决方案(具有 x * width + y 索引的一维数组),我假设您的算法不涉及数组的密集多线程调整大小。因此,与 std::vector 等单线程容器相比,您不会从 tbb::concurrent_vector 中受益。

我猜您假设 tbb::concurrent_vector 保证线程安全的元素访问,但事实并非如此 - 引用自 tbb::concurrent_vector::operator[] 文档:

获取对给定索引处元素的引用。

此方法对于并发读取是线程安全的,并且在增长向量时也是如此,只要调用线程已检查该索引

如果您不调整数组大小,则您只对第一部分感兴趣:线程安全并发读取。但是 std::vector 甚至原始 C 数组也可以给你同样的结果。另一方面,两者都不提供线程安全的任意访问(读/写元素)。

您必须使用锁来实现它,或者找到另一个为您执行此操作的库(可能是来自 STLPort 的 std::vector ,但我不确定)。尽管这效率很低,因为每次访问 2D 数组中的元素都会涉及线程同步开销。虽然我不知道您到底想实现什么,但同步很可能比您的实际计算时间更长。

现在回答你的问题,即使在单线程设置中,使用一维数组来表示 ND 数组总是更好,因为计算索引 (x * width + y) 比真正的 ND 数组所需的额外内存访问更快。
对于并发向量来说,情况更是如此,因为在最好的情况下(没有冲突的行访问),您的锁开销会增加两倍,而在存在冲突的情况下,锁开销会更多。

因此,在您提出的两种解决方案中,我会毫不犹豫地选择第二个:具有足够的元素访问锁定的一维数组(不必要的 tbb::concurrent_vector )。

根据您的算法和不同线程的访问模式,另一种方法 - 用于图像编辑软件(gimp、photoshop...) - 是基于图块的:

template<typename T> struct Tile {
    int offsetX, int offsetY;
    int width, height;
    Not_ThreadSafe_Vector<T> data;
};
ThreadSafe_Vector< Tile<T> > data

其中 Not_ThreadSafe_Vector 可以是任何容器,无需锁定元素访问,例如 std::vector ; ThreadSafe_Vector 是一个具有线程安全读/写元素访问的容器(不是 tbb::concurrent_vector!)。
这样,如果您的访问模式具有某些局部性(线程更有可能访问靠近其先前访问的元素而不是远处的元素),则可以使每个线程处理来自单个图块的大部分数据。时间,并且只有在切换到另一个图块时才会有同步开销。

First, I think there might be some confusion concerning tbb::concurrent_vector. This container is similar to std::vector but with thread-safe resizing but slower element access due to the internal storage layout.

You can read more about it here.

In your case, because of the second solution you proposed (1D array with x * width + y indexing), I'm assuming your algorithm doesn't involves intensive multi-threaded resizing of the array. Therefore, you wouldn't benefit from tbb::concurrent_vector compared to a single threaded container like std::vector.

I guess you were assuming that tbb::concurrent_vector guaranteed thread-safe element access, but it does not - quote from tbb::concurrent_vector::operator[] doc:

Get reference to element at given index.

This method is thread-safe for concurrent reads, and also while growing the vector, as long as the calling thread has checked that index

If you are not resizing the array, you are only interested in the first part: thread-safe concurrent reads. But std::vector or even raw C array gives you the same. On the other hand, neither offers thread-safe arbitrary access (read/write elements).

You would have to implement it using locks, or find another library that does this for you (maybe std::vector from STLPort but I'm not sure). Although this would be quite inefficient, since each access of an element from your 2D array would involve a thread synchronization overhead. While I don't know what exactly you are trying to implement, It's quite possible that the synchronization would take longer than your actual computations.

Now to answer your question, even in a single threaded setting, it is always better to use 1D array for representing ND arrays, because computing the index (x * width + y) is faster than an additional memory accesses needed by the true ND array.
For concurrent vectors, this is even more true because you would have twice the lock overhead in a best-case scenario (no conflicting row access), and a lot more in case there are conflicts.

So out of the two solutions you proposed, I wouldn't hesitate and go for the second one: a 1D array (not necessary tbb::concurrent_vector) with adequate locking for element access.

Depending on your algorithm and the access pattern by the different threads, another approach - used in image editing software (gimp, photoshop...) - is tile based:

template<typename T> struct Tile {
    int offsetX, int offsetY;
    int width, height;
    Not_ThreadSafe_Vector<T> data;
};
ThreadSafe_Vector< Tile<T> > data

Where Not_ThreadSafe_Vector could be any container without locking on element access, e.g. std::vector ; and ThreadSafe_Vector is a container with thread safe read/write element access (not tbb::concurrent_vector!).
This way, if you have some locality in your access pattern (a thread is more likely to access an element near to its previous access than far away), then each thread can be made to work on the data from a single tile most of the time, and you only have synchronization overhead when switching to another tile.

躲猫猫 2024-12-20 20:25:21

我检查了 tbb::concurrent_vector 或 std::vector 不允许创建原子类型的向量。这是因为原子类型不可复制。

I checked that tbb::concurrent_vector or std::vector does not allow creation of a vector of atomic types. This is because atomic types are not copyable.

叫思念不要吵 2024-12-20 20:25:21

tbb::concurrent_vector 对于访问元素(读、写、获取地址)和增长向量来说是完全线程安全的。 此处查看英特尔员工 Arch D. Robison 的回复。

但是,清除或销毁向量的操作不是线程安全的(请参阅 $6.2.1 Intel TBB 教程 pdf 文档)。也就是说,如果 tbb::concurrent_vector 上正在进行其他操作,则不要调用 clear()

正如 Antoine 提到的,由于效率和性能方面的原因,将 ND 阵列作为一维阵列处理始终是首选。表现。

tbb::concurrent_vector is completely thread safe for accessing elements (read, write, take the address) and growing the vector. Check response from Intel employee Arch D. Robison here.

However, operations for clearing or destroying a vector are not thread safe (refer to $6.2.1 Intel TBB Tutorial pdf document). That is, don't invoke clear() if there are other operations underway on the tbb::concurrent_vector.

As Antoine mentioned, handling the ND array as a 1D array is always to be preferred due to efficiency & performance.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文