How do I locate idle time (and network I/O time, etc.) in XPerf?
Let's say I have a contrived program:
#include <Windows.h>

void useless_function()
{
    Sleep(5000);  // blocks for 5 seconds of wall-clock time
}

void useful_function()
{
    // ... do some work
    useless_function();
    // ... do some more work
}

int main()
{
    useful_function();
    return 0;
}
Objective: I want the profiler to tell me that useful_function() is needlessly calling useless_function(), which waits for no obvious reason. Under XPerf, this doesn't show up in any of the graphs I have, because the call to WaitForMultipleObjects() seems to be accounted to Idle.exe instead of my own program.
And here's the xperf command line that I currently run:
xperf -on Latency -stackwalk Profile
Any ideas?
(This is not restricted to wait functions. The above might have been solved by placing breakpoints at NtWaitForMultipleObjects. Ideally there would be a way to see the stack samples that take up a lot of wall-clock time, as opposed to only CPU time.)
Comments (3)
I think what you are looking for is the Wait Analysis with Ready Thread functionality in Xperf. It captures every context switch and gives you the call stack of the thread once it wakes up from sleep (or from any other blocking operation). In your case, you would see the stack just after the call to Sleep(5000), as well as the time spent sleeping.
The functionality is a bit obscure to use, but fortunately it is well described here:
Use Xperf's Wait Analysis for Application-Performance Troubleshooting
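To make the distinction concrete, here is a minimal sketch of why a blocked thread is invisible to CPU sampling and only shows up in context-switch data. It is not taken from the linked article. The Elapsed struct and measure_sleep() are hypothetical names invented for this illustration, and std::this_thread::sleep_for is used as a portable stand-in for Sleep(). It also assumes std::clock() reports processor time, which holds on POSIX systems but not with MSVC, where clock() tracks wall-clock time and GetProcessTimes() would be the call to use.

```cpp
#include <chrono>
#include <ctime>
#include <thread>

// Hypothetical helper type for this sketch: elapsed wall-clock and CPU seconds.
struct Elapsed {
    double wall_seconds;
    double cpu_seconds;
};

// Sleeps for `duration` and reports how much wall-clock time and CPU time passed.
// Assumption: std::clock() measures processor time (true on POSIX; MSVC's
// clock() measures wall-clock time, so use GetProcessTimes() on Windows).
inline Elapsed measure_sleep(std::chrono::milliseconds duration)
{
    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();

    // Portable stand-in for Sleep(): the thread is blocked and uses no CPU.
    std::this_thread::sleep_for(duration);

    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    Elapsed result;
    result.wall_seconds =
        std::chrono::duration<double>(wall_end - wall_start).count();
    result.cpu_seconds =
        static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
    return result;
}
```

Calling measure_sleep(std::chrono::milliseconds(200)) from your own main() should report a wall time of at least the requested duration and a CPU time close to zero. A profiler that samples only while the thread is running on a CPU has nothing to attribute to the sleep, which is the gap wait analysis fills.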
Wait Analysis is the way to do this. You should:
Then you use CPU Usage (Precise) in WPA (or xperfview, but that's ancient) to look at the context switches and find where your TimeSinceLast is high on a thread that shouldn't be going idle. You'll typically want the columns in CPU Usage (Precise) in this sort of order:
For details see these particular articles from my blog:
- https://randomascii.wordpress.com/2014/08/19/etw-training-videos-available-now/
- https://randomascii.wordpress.com/2012/06/19/wpaxperf-trace-analysis-reimagined/
This "profiler" will tell you: just randomly pause the program a few times and look at the stack. If "do some work" takes 5 seconds and "do some more work" takes 5 seconds, then 33% of the time the stack will show main() calling useful_function(), which calls useless_function(), which calls Sleep(). So roughly 33% of your stack samples will show exactly that. Any line of code that costs some fraction of wall-clock time will appear on roughly that fraction of samples.
On the rest of the samples you will see it doing the other things.
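The arithmetic behind that 33% can be sketched as follows. This is only an illustration under stated assumptions: the function names expected_fraction() and sampled_fraction() are made up for this example, the program is modeled as a sequence of phases with fixed durations in seconds, and the pauses are assumed to land at uniformly random instants.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Exact share of wall-clock time spent in phase `index`,
// given each phase's duration in seconds.
inline double expected_fraction(const std::vector<double>& phase_seconds,
                                std::size_t index)
{
    double total = 0.0;
    for (double s : phase_seconds) total += s;
    return phase_seconds.at(index) / total;
}

// Simulates pausing the program at `samples` uniformly random instants and
// returns the share of pauses that land inside phase `index`.
inline double sampled_fraction(const std::vector<double>& phase_seconds,
                               std::size_t index, int samples, unsigned seed)
{
    double total = 0.0;
    double begin = 0.0;  // start time of the phase of interest
    for (std::size_t i = 0; i < phase_seconds.size(); ++i) {
        if (i < index) begin += phase_seconds[i];
        total += phase_seconds[i];
    }
    const double end = begin + phase_seconds.at(index);

    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> pause_time(0.0, total);
    int hits = 0;
    for (int i = 0; i < samples; ++i) {
        const double t = pause_time(rng);
        if (t >= begin && t < end) ++hits;
    }
    return static_cast<double>(hits) / samples;
}
```

With phases of 5, 5 and 5 seconds (work, Sleep, more work), expected_fraction() for the middle phase is 5/15, about 33%, and sampled_fraction() approaches that value as the number of pauses grows. With only a handful of manual pauses the estimate is coarse, but a phase that takes a third of the run is still very likely to be caught at least once.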
There are automated profilers that do the same thing in a prettier way, such as Zoom and LTProf, although they don't actually show you the samples.
I looked at the xperf doc, trying to figure out whether you could get stack samples on wall-clock time and get percentages at line-level resolution. It seems you have to be on Windows 7 or Vista. They only bother with functions, not lines, and line-level detail matters if you have realistically big functions. I couldn't figure out how to get access to the individual samples, which I think is important for seeing why the program is spending its time.