Segmentation fault with pthreads_mutex

Posted 2024-10-19 23:13:04


I am implementing a particle interaction simulator in pthreads, and I keep getting segmentation faults in my pthreads code. The fault occurs in the following loop, which each thread executes at the end of each timestep in my thread_routine:

    for (int i = first; i < last; i++)
    {
            get_id(particles[i], box_id);
            pthread_mutex_lock(&locks[box_id.x + box_no * box_id.y]);
            //cout << box_id.x << "," << box_id.y << "," << thread_id << "l" << endl;
            box[box_id.x][box_id.y].push_back(&particles[i]);
            //cout << box_id.x << box_id.y << endl;
            pthread_mutex_unlock(&locks[box_id.x + box_no * box_id.y]);
    }

The strange thing is that if I uncomment one (it doesn't matter which one) or both of the couts, the program runs as expected with no errors and gives correct output (but this obviously kills performance, and isn't an elegant solution).

box is a globally declared

    vector< vector< vector<particle_t*> > > box;

which represents a decomposition of my (square) domain into boxes.

When the loop starts, box[i][j].size() has been set to zero for all i, j, and the loop is supposed to put the particles back into the box structure (the get_id function gives correct results; I've checked).

The array locks is declared globally as

    pthread_mutex_t *locks;

and it is allocated and the mutexes initialized by thread 0 before the other threads are created:

locks = (pthread_mutex_t *) malloc( box_no*box_no * sizeof( pthread_mutex_t ) );

for (int i = 0; i < box_no*box_no; i++)
{
    pthread_mutex_init(&locks[i],NULL);
}

Do you have any idea what could cause this? The code also runs if the number of processors is set to 1, and it seems like the more processors I run on, the earlier the seg fault occurs (it has run through the entire simulation once on two processors, but that seems to be an exception).

Thanks

Comments (2)

眼泪也成诗 2024-10-26 23:13:04


This is only an educated guess, but based on the problem going away if you use one lock for all the boxes: push_back has to allocate memory, which it does via the std::allocator template. I don't think allocator is guaranteed to be thread-safe and I don't think it's guaranteed to be partitioned, one for each vector, either. (The underlying operator new is thread-safe, but allocator usually does block-slicing tricks to amortize operator new's cost.)

Is it practical for you to use reserve to preallocate space for all your vectors ahead of time, using some conservative estimate of how many particles are going to wind up in each box? That's the first thing I'd try.
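A rough sketch of that idea, done once by thread 0 before the workers are created (n_particles and the factor of 4 are hypothetical names/estimates, not anything from the question):

    // Preallocate each box's vector so push_back never has to reallocate
    // (and never touches the allocator) inside the parallel loop.
    // A conservative over-estimate of particles per box:
    int est_per_box = 4 * n_particles / (box_no * box_no) + 1;
    for (int i = 0; i < box_no; i++)
        for (int j = 0; j < box_no; j++)
            box[i][j].reserve(est_per_box);

Since clear() normally leaves a vector's capacity in place, doing this once before the first timestep should cover all later timesteps too.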

The other thing I'd try is using one lock for all the boxes, which we know works, but moving the lock/unlock operations outside the for loop so that each thread gets to stash all its items at once. That might actually be faster than what you're trying to do -- less lock thrashing.
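Hoisted out of the loop, that would look roughly like this (a sketch; global_lock is a hypothetical single pthread_mutex_t, initialized by thread 0 the same way as locks):

    // One lock for the whole box structure, taken once per thread per
    // timestep instead of once per particle.
    pthread_mutex_lock(&global_lock);
    for (int i = first; i < last; i++)
    {
        get_id(particles[i], box_id);
        box[box_id.x][box_id.y].push_back(&particles[i]);
    }
    pthread_mutex_unlock(&global_lock);

Threads then serialize on the insertion phase, but each one stores its whole batch under a single lock acquisition.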

兔姬 2024-10-26 23:13:04


Are the box and box[i] vectors initialized properly? You only say the innermost set of vectors are set. Otherwise it looks like box_id's x or y component is wrong and running off the end of one of your arrays.

What part of the loop is it crashing on?
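A quick way to check both points (a sketch; assert comes from <cassert>, and box_no is the global from the question):

    #include <cassert>

    // Size the outer two dimensions up front, before any thread indexes
    // into the grid:
    box.assign(box_no, vector< vector<particle_t*> >(box_no));

    // ...and bounds-check every id before using it as an index:
    get_id(particles[i], box_id);
    assert(0 <= box_id.x && box_id.x < box_no);
    assert(0 <= box_id.y && box_id.y < box_no);

If an assert fires, get_id is producing an out-of-range id; if the program only crashes without the assign, the outer vectors were never sized.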
