增加Python中cProfiler的深度以报告更多功能?
我正在尝试分析一个调用其他函数的函数。我按如下方式调用探查器:
from mymodule import foo
def start():
# ...
foo()
import cProfile as profile
profile.run('start()', output_file)
p = pstats.Stats(output_file)
print "name: "
print p.sort_stats('name')
print "all stats: "
p.print_stats()
print "cumulative (top 10): "
p.sort_stats('cumulative').print_stats(10)
我发现探查器说所有时间都花在 mymodule 的函数“foo()”上,而不是将其分解为子函数 foo() 调用,这正是我想要看到的。如何让探查器报告这些函数的性能?
谢谢。
I'm trying to profile a function that calls other functions. I call the profiler as follows:
from mymodule import foo
def start():
# ...
foo()
import cProfile as profile
profile.run('start()', output_file)
p = pstats.Stats(output_file)
print "name: "
print p.sort_stats('name')
print "all stats: "
p.print_stats()
print "cumulative (top 10): "
p.sort_stats('cumulative').print_stats(10)
I find that the profiler says all the time was spend in function "foo()" of mymodule, instead of brekaing it down into the subfunctions foo() calls, which is what I want to see. How can I make the profiler report the performance of these functions?
thanks.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
您需要
p.print_callees()
来获取方法调用的层次结构细分。输出是非常不言自明的:在左侧列中,您可以找到感兴趣的函数,例如foo()
,然后转到右侧列,显示所有调用的子函数及其作用域总时间和累积时间。还包括这些子调用的细分等。You need
p.print_callees()
to get hierarchical breakdown of method calls. The output is quite self explanatory: On the left column you can find your function of interest e.g.foo()
, then going to the right side column shows all called sub-functions and their scoped total and cumulative times. Breakdowns for these sub-calls are also included etc.首先,我想说我无法复制提问者的问题。探查器(在 py2.7 中)肯定会深入到被调用的函数和方法中。 (py3.6 的文档看起来相同,但我还没有在 py3 上进行测试。)我的猜测是,通过将其限制为前 10 个返回(按累积时间排序),其中前 N 个是非常高级的函数,称为最短的时间,由
foo()
调用的函数就从列表的底部消失了。我决定使用一些大数字进行测试。这是我的测试代码:
以及包含文件(与 Asker 的非常相似):
注意最后一行。我添加了一个按
时间
排序的视图,这是函数“排除调用子功能的时间”。我发现这个视图更有用,因为它倾向于支持正在执行实际工作并且可能需要优化的函数。以下是提问者所处理的结果的一部分(
累积
排序):查看此显示中的前 3 个函数如何仅被调用一次。让我们看看按
时间
排序的视图:现在排名第一的条目有意义了。显然,通过重复乘法将某个值提高到 200 次方是一种“天真的”策略。让我们替换它:
结果:
这是一个很好的时间改进。现在
str.format()
是我们最严重的罪犯。我在 bar() 中添加了一行来打印数字的长度,因为我的第一次尝试(只是计算数字而不对其进行任何操作)得到了优化,并且我尝试避免这种情况(打印数字(它变得非常大、速度非常快)看起来可能会阻塞 I/O,所以我在打印数字的长度方面做出了妥协。嘿,这是以 10 为底的日志。让我们尝试一下:结果:
嗯,即使没有
str.format()
,仍然在bar()
上花费了相当多的时间。让我们去掉那个打印:结果:
现在看起来执行实际工作的东西是最繁忙的功能,所以我认为我们已经完成了优化。
希望有帮助!
First, I want to say that I was unable to replicate the Asker's issue. The profiler (in py2.7) definitely descends into the called functions and methods. (The docs for py3.6 look identical, but I haven't tested on py3.) My guess is that by limiting it to the top 10 returns, sorted by cumulative time, the first N of those were very high-level functions called a minimum of time, and the functions called by
foo()
dropped off the bottom of the list.I decided to play with some big numbers for testing. Here's my test code:
And the including file (very similiar to Asker's):
Notice the last line. I added a view sorted by
time
, which is the total time spent in the function "excluding time made in calls to sub-functions". I find this view much more useful, as it tends to favor the functions that are doing actual work, and may be in need of optimization.Here's the part of the results that the Asker was working from (
cumulative
-sorted):See how the top 3 functions in this display were only called once. Let's look at the
time
-sorted view:Now the number one entry makes sense. Obviously raising something to the 200th power by repeated multiplication is a "naive" strategy. Let's replace it:
And the results:
That's a nice improvement in time. Now
str.format()
is our worst offender. I added the line inbar()
to print the length of the number, because my first attempt (just computing the number and doing nothing with it) got optimized away, and my attempt to avoid that (printing the number, which got really big really fast) seemed like it might be blocking on I/O, so I compromised on printing the length of the number. Hey, that's the base-10 log. Let's try that:And the results:
Hmm, still a fair amount of time spent in
bar()
, even without thestr.format()
. Let's get rid of that print:And the results:
Now it looks like the stuff doing the actual work is the busiest function, so I think we're done optimizing.
Hope that helps!
也许您遇到了类似的问题,所以我将在这里描述我的问题。我的分析代码如下所示:
最终的分析输出仅包含
events()
调用。我花了不少时间才意识到我的循环分析是空的。当然,客户端代码多次调用 foobar(),但有意义的分析结果已被最后一次空循环调用覆盖。Maybe you faced with a similar problem, so I'm going to describe here my issue. My profiling code looked like this:
And the final profiling output contained only
events()
call. And I spent not so little time to realise that I had empty loop profiling. Of course, there was more than one call offoobar()
from a client code, but meaningful profiling results had been overwritten by last one call with empty loop.