How can a thread use less than 100% of wall time?
When profiling an application (using dotTrace), I noticed a very strange thing. I used the "wall time" measurement, which should in theory mean that all threads run for the same amount of time.
But this wasn't true: some threads (actually the ones I was most interested in) showed a total time roughly half that of the others. For example, profiling ran for 230 seconds and most threads report 230 seconds spent in the thread, but 5 threads show only 100-110 seconds. These are not threadpool threads, and they were definitely created and started before profiling began.
What is going on here?
Update: I'll add more info that may or may not be relevant. The application in question (it is a game server) has about 20-30 constantly running threads. Most threads follow a simple pattern: they check an incoming queue for work and do the work if there is any. The code for the thread function looks something like this:
    while (true) {
        if (TryDequeueWork()) { // the queue is not empty
            DoWork(); // process whatever was at the head of the queue
        } else {
            m_WaitHandle.WaitOne(MaxTimeout); // m_WaitHandle is signaled when work is added to the queue
        }
    }
The threads that display the weird times follow the same pattern, except that they serve multiple queues:
    while (true) {
        bool hasAnyWork = false;
        foreach (var queue in m_Queues) {
            if (queue.TryDequeueWork()) {
                hasAnyWork = true;
                DoWork();
            }
        }
        if (!hasAnyWork) {
            m_WaitHandle.WaitOne(MaxTimeout);
        }
    }
The weird threads don't do any IO except maybe logging. The other, non-weird threads do logging too. Time spent waiting on a WaitHandle is reported in the profiler; in fact, some of the non-weird threads spend almost all of their time waiting (as they never have any work).
The application was running on an 8-core virtual machine (VPS hosting). I don't know what physical processors are used there.
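One way to cross-check the profiler's per-thread totals is to time the loop from inside the thread. The following is only a minimal sketch of the multi-queue pattern above, not the server's actual code: the class name `MultiQueueWorker`, the `Enqueue` and `Stop` methods, the `Busy` and `Waiting` properties, the use of `ConcurrentQueue<Action>` as the queue type, and the 100 ms `MaxTimeout` are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Threading;

// Hypothetical worker that serves several queues and records how its
// wall-clock time splits between polling/working and waiting.
public sealed class MultiQueueWorker
{
    // Assumed timeout; the original MaxTimeout value is not given.
    private static readonly TimeSpan MaxTimeout = TimeSpan.FromMilliseconds(100);

    private readonly ConcurrentQueue<Action>[] m_Queues;
    private readonly AutoResetEvent m_WaitHandle = new AutoResetEvent(false);
    private readonly CancellationTokenSource m_Stop = new CancellationTokenSource();

    public TimeSpan Busy { get; private set; }     // polling queues and running work items
    public TimeSpan Waiting { get; private set; }  // blocked in WaitOne

    public MultiQueueWorker(int queueCount)
    {
        m_Queues = new ConcurrentQueue<Action>[queueCount];
        for (int i = 0; i < queueCount; i++)
            m_Queues[i] = new ConcurrentQueue<Action>();
    }

    public void Enqueue(int queueIndex, Action work)
    {
        m_Queues[queueIndex].Enqueue(work);
        m_WaitHandle.Set(); // wake the worker if it is waiting
    }

    public void Stop()
    {
        m_Stop.Cancel();
        m_WaitHandle.Set();
    }

    public void Run()
    {
        var busy = new Stopwatch();
        var waiting = new Stopwatch();
        while (!m_Stop.IsCancellationRequested)
        {
            bool hasAnyWork = false;
            busy.Start(); // Start resumes timing without resetting the total
            foreach (var queue in m_Queues)
            {
                if (queue.TryDequeue(out Action work))
                {
                    hasAnyWork = true;
                    work();
                }
            }
            busy.Stop();
            if (!hasAnyWork)
            {
                waiting.Start();
                m_WaitHandle.WaitOne(MaxTimeout);
                waiting.Stop();
            }
        }
        Busy = busy.Elapsed;
        Waiting = waiting.Elapsed;
    }
}

public static class Program
{
    public static void Main()
    {
        var worker = new MultiQueueWorker(queueCount: 3);
        var thread = new Thread(worker.Run);
        var wall = Stopwatch.StartNew();
        thread.Start();
        for (int i = 0; i < 30; i++)
        {
            worker.Enqueue(i % 3, () => Thread.SpinWait(100000)); // simulated work
            Thread.Sleep(10);
        }
        worker.Stop();
        thread.Join();
        wall.Stop();
        Console.WriteLine(
            $"wall={wall.Elapsed.TotalMilliseconds:F0} ms, " +
            $"busy={worker.Busy.TotalMilliseconds:F0} ms, " +
            $"waiting={worker.Waiting.TotalMilliseconds:F0} ms");
    }
}
```

If `Busy + Waiting` for a thread comes out close to the wall time of the run while the profiler still reports about half of it, the gap would seem to lie in how the profiler attributes time rather than in the loop itself. The measured times vary from run to run, so no expected output is shown.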
Comments (2)
Did they finish before the profiler finished, perhaps?
You can only get 100% wall time if
Neither is very likely; few problems scale well. Amdahl's law is relevant.
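For reference, Amdahl's law in its usual form is sketched below. The symbols are the standard ones and do not come from the answer: $p$ is the fraction of the work that can run in parallel and $n$ is the number of processors. The value $p = 0.5$ in the worked line is purely an assumed figure for illustration; $n = 8$ matches the 8-core virtual machine mentioned in the question.

```latex
% Amdahl's law: upper bound on speedup with n processors
% when a fraction p of the work is parallelizable.
S(n) = \frac{1}{(1 - p) + \dfrac{p}{n}}

% Worked example with an assumed p = 0.5 and n = 8:
S(8) = \frac{1}{0.5 + \dfrac{0.5}{8}} = \frac{1}{0.5625} \approx 1.78
```

Under that assumption, eight cores give less than a 2x speedup, so on average the cores spend well under 100% of the wall time doing useful work.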