pthread synchronization problem

Posted 2024-11-04 07:51:23 · 1620 characters · 0 views · 0 comments

I am facing a synchronization issue with pthreads. threadWaitFunction1 is a thread wait function.
I expect line 247 (flag = 1) to be executed only after lines 243-246 have finished.
Strangely, it sometimes jumps directly to 247 before 243-246 have finished.

Please help me.

Thanks in advance.

236   struct timespec timeToWait;
237   static void* threadWaitFunction1(void *timeToWaitPtr)
238   {
239       cout << "Setting flag =0 inside threadWaitFunction1\n";
240       
241       cout << "Inside threadWaitFunction\n";
242       struct timespec *ptr = (struct timespec*) timeToWaitPtr;
243       pthread_mutex_lock(&timerMutex);
          flag = 0;
244       pthread_cond_timedwait(&timerCond, &timerMutex, ptr);
          flag=1;
245       pthread_mutex_unlock(&timerMutex);
246       cout << "Setting flag =1 inside threadWaitFunction1\n";
247       
248
249    }

The thread which creates and calls the above thread is:

263  static void timer_trackStartTime ()
264  {
265       struct timeval now;
266       pthread_t thread;
267       
268       printf("Inside trackStartTime: flag = %d\n",flag);
269 
270      /* Setting timer expiration */
271       timeToWait.tv_sec = lt_leak_start_sec;  // First expiry after 1 sec
272       timeToWait.tv_nsec = lt_leak_start_nsec;
273       pthread_create(&thread, NULL, threadWaitFunction1, &timeToWait);
274       pthread_join(thread, NULL);
275       //pthread_kill(thread, SIGKILL); // Destroying the thread to ensure no leaks
276 
.
.
283       }

Even if I protect the whole function with pthread_mutex_lock, the same problem persists. How can I ensure orderly execution? Can anyone help?

EDIT: now.tv_sec and now.tv_nsec removed from the code.
*EDIT: Changed the flags inside the mutex (still does not work)*

Comments (5)

她如夕阳 2024-11-11 07:51:24

So it is not really the execution ordering (which is most probably correct) but the timing that bothers you. By "it jumps directly to 247 before 243-246 has finished" you mean "I observed it executing 247 before the time it should have waited at 244 had passed". Right?

Then I suspect the problem is a spurious wakeup: a thread might get woken up even though no other thread signalled the condition variable. The specification of pthread_cond_timedwait() says that "Spurious wakeups from the pthread_cond_timedwait() or pthread_cond_wait() functions may occur."

Usually, a condition variable is associated with a certain event in the application, and a thread waiting on a condition variable in fact waits for a signal by another thread that the event of interest has happened. If you have no event and just want to wait for a certain amount of time, indeed other ways, such as usleep() or timers, are more appropriate, except if you also need a pthread cancellation point.
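As a minimal sketch of the usual pattern described above, the wait can be tied to a predicate that is re-checked after every wakeup. The names evMutex, evCond, eventHappened, signalEvent and waitForEvent are hypothetical and do not come from the question; the deadline is assumed to be an absolute time.

```cpp
#include <pthread.h>
#include <time.h>
#include <errno.h>

// Hypothetical shared state for this sketch (not taken from the question).
static pthread_mutex_t evMutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  evCond  = PTHREAD_COND_INITIALIZER;
static bool eventHappened = false;

// Called by the thread that produces the event.
void signalEvent()
{
    pthread_mutex_lock(&evMutex);
    eventHappened = true;              // change the predicate under the mutex
    pthread_cond_signal(&evCond);
    pthread_mutex_unlock(&evMutex);
}

// Waits until the event is signalled or the absolute deadline passes.
// Returns true if the event happened, false on timeout or error.
bool waitForEvent(const struct timespec *deadline)
{
    int ret = 0;
    pthread_mutex_lock(&evMutex);
    // A spurious wakeup returns 0 with the predicate still false,
    // so the loop simply waits again.
    while (!eventHappened && ret == 0)
        ret = pthread_cond_timedwait(&evCond, &evMutex, deadline);
    bool happened = eventHappened;
    pthread_mutex_unlock(&evMutex);
    return happened;
}
```

Because the loop condition is the predicate rather than the return value alone, a wakeup without a real event cannot be mistaken for the event.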

ADDED: Since you seem satisfied with usleep() and only asked why pthread_cond_timedwait() did not work to your expectations, I decided not to post the code. If you need it, you may use the answer of @Hasturkun.


ADDED-2: The output in the comments below (obtained after Hasturkun's solution was applied) suggests that the waiting thread does not exit the loop, which likely means that pthread_cond_timedwait() returns something other than ETIMEDOUT. Have you seen the comment by @nos on your post (I corrected the number of nanoseconds to subtract):

Make sure (now.tv_usec * 1000) + lt_leak_start_nsec; doesn't overflow. You can only set tv_nsec to max 999999999, if the expression is larger than that, you should subtract 1000000000 from tv_nsec, and increment tv_sec by 1. If your timeToWaitPtr contains an invalid tv_nsec (larger than 999999999), pthread_cond_timedwait will fail (you should check its return value too.) – nos Apr 28 at 19:04

In this case, pthread_cond_timedwait() will repeatedly return EINVAL and will never get out of the loop. It is better to adjust the timeout before entering the wait loop, though it can also be done in response to EINVAL.
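The adjustment described in the comment by nos could look roughly like the following sketch. normalizeTimespec is a hypothetical helper name, not something from the question; it folds excess nanoseconds into the seconds field before the value is handed to pthread_cond_timedwait().

```cpp
#include <time.h>

// Hypothetical helper: carry excess nanoseconds into tv_sec so that
// 0 <= tv_nsec <= 999999999, the range pthread_cond_timedwait() accepts.
struct timespec normalizeTimespec(struct timespec ts)
{
    const long NSEC_PER_SEC = 1000000000L;
    ts.tv_sec  += ts.tv_nsec / NSEC_PER_SEC;
    ts.tv_nsec %= NSEC_PER_SEC;
    if (ts.tv_nsec < 0) {              // also handle a negative remainder
        ts.tv_nsec += NSEC_PER_SEC;
        ts.tv_sec  -= 1;
    }
    return ts;
}
```

Calling this once before entering the wait loop avoids the repeated EINVAL case entirely.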


ADDED-3: Now that you have changed the code in your question to pass the timeout without adding it to the current time, it has another problem. As stated in the spec, the timeout for pthread_cond_timedwait() is an absolute time, not a relative one; so when you pass something like 3 sec as the timeout, it is interpreted as "3 seconds after the reference point of the system time". That moment almost certainly passed long ago, and so pthread_cond_timedwait() returns immediately.
I recommend reading the specification thoroughly (including the rationale) to get a better understanding of how this function is supposed to be used.
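A rough sketch of turning a relative timeout into the absolute deadline that pthread_cond_timedwait() expects is shown here. It assumes a condition variable with default attributes (which measures against CLOCK_REALTIME) and a relative tv_nsec below one second; waitMutex, waitCond, deadlineFromNow and waitRelative are hypothetical names.

```cpp
#include <pthread.h>
#include <time.h>
#include <errno.h>

// Hypothetical objects for this sketch (not the question's timerMutex/timerCond).
static pthread_mutex_t waitMutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  waitCond  = PTHREAD_COND_INITIALIZER;

// Builds an absolute deadline: current time plus the relative timeout.
// Assumes 0 <= nsec < 1000000000, so a single carry is enough.
struct timespec deadlineFromNow(time_t sec, long nsec)
{
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    ts.tv_sec  += sec;
    ts.tv_nsec += nsec;
    if (ts.tv_nsec >= 1000000000L) {
        ts.tv_nsec -= 1000000000L;
        ts.tv_sec  += 1;
    }
    return ts;
}

// Blocks for roughly the given relative time and returns the last wait result:
// ETIMEDOUT on normal expiry, or another error code such as EINVAL.
int waitRelative(time_t sec, long nsec)
{
    struct timespec deadline = deadlineFromNow(sec, nsec);
    int ret = 0;
    pthread_mutex_lock(&waitMutex);
    while (ret == 0)   // 0 means a signal or a spurious wakeup: wait again
        ret = pthread_cond_timedwait(&waitCond, &waitMutex, &deadline);
    pthread_mutex_unlock(&waitMutex);
    return ret;
}
```

Looping only while the result is 0 means an invalid timespec ends the wait with EINVAL instead of spinning forever.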

丢了幸福的猪 2024-11-11 07:51:24

Paul E. McKenney has written a book titled "Is Parallel Programming Hard, And, If So, What Can You Do About It?", which has really good information (and some nice pictures) on memory barriers.

Back to your question: flag isn't protected by anything. While you may think that pthread_mutex_lock() and pthread_mutex_unlock() provide some strong ordering and visibility guarantees, the only guarantees they provide are for accesses inside the critical region and for the mutex itself.

What's more, on some architectures pthread_mutex_lock() uses an acquire barrier and pthread_mutex_unlock() uses a release barrier, which means that accesses before and after the mutex protected region can spill into the mutex protected region. Strong ordering guarantees are provided between a CPU releasing a mutex and another CPU acquiring the same mutex, but pretty much everything else doesn't need (and maybe doesn't get) such strong guarantees.

Edit:

Apparently I was wrong with respect to pthreads; they seem to require full memory barriers (if you interpret "synchronize memory with respect to other threads" as requiring that). More about this, and some information on the guarantees provided by real-world implementations, can be found in "Reordering Constraints for Pthread-Style Locks" by Hans Boehm.

I'm also still wondering about NPTL on IA64 1, 2.

九歌凝 2024-11-11 07:51:24

As stated by Alexey Kukanov, the problem is likely a spurious wakeup. Your code may be corrected to loop until the timeout occurs. Note that I also moved the flag assignments under the mutex.

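static void* threadWaitFunction1(void *timeToWaitPtr)
{
    struct timespec *ptr = (struct timespec*) timeToWaitPtr;

    pthread_mutex_lock(&timerMutex);
    cout << "Setting flag =0 inside threadWaitFunction1\n";
    flag = 0;
    cout << "Inside threadWaitFunction\n";
    // Keep waiting across spurious wakeups until the timeout expires.
    // Note: *ptr must be a valid absolute time; an invalid timespec makes
    // the call return EINVAL every time and this loop would never end.
    while (pthread_cond_timedwait(&timerCond, &timerMutex, ptr) != ETIMEDOUT)
        ;
    cout << "Setting flag =1 inside threadWaitFunction1\n";
    flag = 1;
    pthread_mutex_unlock(&timerMutex);
    return NULL;  // a thread start routine must return a void*
}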

To be on the safe side, you should also check the flag under the same mutex to establish ordering.
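Reading the flag under the same mutex might look like the following sketch. The timerMutex and flag definitions here are stand-ins, since the question does not show the real ones, and setFlag and readFlag are hypothetical helper names.

```cpp
#include <pthread.h>

// Stand-ins for the question's globals; the real definitions are not shown.
static pthread_mutex_t timerMutex = PTHREAD_MUTEX_INITIALIZER;
static int flag = 0;

// Writer side: modify the flag only while holding the mutex.
void setFlag(int value)
{
    pthread_mutex_lock(&timerMutex);
    flag = value;
    pthread_mutex_unlock(&timerMutex);
}

// Reader side: take the same mutex, so the read is ordered after any
// write made by a thread that previously released the mutex.
int readFlag()
{
    pthread_mutex_lock(&timerMutex);
    int value = flag;
    pthread_mutex_unlock(&timerMutex);
    return value;
}
```

With both sides going through the same lock, no separate memory barrier is needed for the flag.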

2024-11-11 07:51:24

This could be because the compiler has optimized things and moved the assignment to your flag before the mutex lock. Order of execution is not normally guaranteed; the only requirement on the compiler is that the visible behaviour of your program does not change due to optimizations. If you want to guarantee it, you use a memory barrier to make sure that the instructions are executed in the order in which you wrote them.

Here is a very interesting, though rather technical and long, article on how memory barriers work and what they do and don't do. It's written for Linux, but the basic principles remain the same.

EDIT:

According to the link I gave earlier, the lock is an implicit memory barrier, so no explicit memory barrier is needed.

淡忘如思 2024-11-11 07:51:24

Just for everyone's info:

What I could not achieve using pthread_cond_timedwait(&timerCond, &timerMutex, ptr); I have achieved using usleep() (the timespec-based call is nanosleep(), where the wait period is given in seconds and nanoseconds; usleep() itself takes a count of microseconds), and my purpose is solved.

So what is pthread_cond_timedwait(&timerCond, &timerMutex, ptr); meant for? I am surprised, as this API is expected to make the calling thread wait for the condition to be satisfied, but it seems that the processor jumps to the next instruction as an optimisation measure and does not wait for the condition to be satisfied.

But my question remains the same: why does pthread_cond_timedwait(&timerCond, &timerMutex, ptr); not make the calling thread wait?

It seems I wasted a day on this API: pthread_cond_timedwait()
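For a plain relative delay with no event involved, a sketch based on nanosleep() is shown here. nanosleep() takes a relative timespec, unlike pthread_cond_timedwait(), which takes an absolute one. sleepRelative is a hypothetical helper name.

```cpp
#include <time.h>
#include <errno.h>

// Sleeps for the given relative interval and resumes with the remaining
// time if a signal interrupts the sleep.
// Returns 0 on success, -1 on error (errno is set by nanosleep()).
int sleepRelative(time_t sec, long nsec)
{
    struct timespec req;
    struct timespec rem;
    req.tv_sec  = sec;
    req.tv_nsec = nsec;               // must be in the range 0..999999999
    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;                // e.g. EINVAL for an out-of-range value
        req = rem;                    // interrupted: sleep for what is left
    }
    return 0;
}
```

This never returns early without cause, which is the behaviour the question expected from the timed wait.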
