A question about CPU time and wall time in Jupyter Notebook
As a physics student, I have a question about program run time.
I'm doing a small project on algorithm optimization. My toy code is written in a Jupyter notebook, and when I use %%time and %%timeit in a cell to measure the CPU time and wall time of my code, I get the following results:
CPU times: total: 2min 16s
Wall time: 24.2s
The huge gap between these two times confuses me. My questions are:
- Which one should I use as the standard for measuring the speed of the algorithm? (I know that timing Python code is not an ideal way to measure the speed of an algorithm.)
- Why is the gap between the two times so large?
I'm really looking forward to any insights on this issue.
Comments (2)
First, what is wall time? It is the total elapsed time needed to run the cell.
Second, CPU time is the time spent by the CPU, summed over all the cores in use.
For example, suppose a cell takes 1 second of wall time. If it keeps 2 cores busy for that entire time, the CPU time will be 2 s.
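The two clocks can also be read directly from the Python standard library. The sketch below is only an illustration: `measure`, `sleeper`, and `busy` are hypothetical helper names, not part of the original code. It assumes `time.perf_counter()` as the wall clock and `time.process_time()` as the CPU clock of the current process.

```python
import time


def measure(func):
    """Return (wall_seconds, cpu_seconds) for a single call of func."""
    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # CPU time of this process, summed over its threads
    func()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


def sleeper():
    # Waiting: wall time passes, but almost no CPU time is consumed.
    time.sleep(0.3)


def busy():
    # Pure-Python arithmetic: one core stays busy, so CPU time is close to wall time.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


sleep_wall, sleep_cpu = measure(sleeper)
busy_wall, busy_cpu = measure(busy)
print(f"sleep: wall={sleep_wall:.3f}s cpu={sleep_cpu:.3f}s")
print(f"busy:  wall={busy_wall:.3f}s cpu={busy_cpu:.3f}s")
```

Neither toy function uses more than one core, so this sketch will not show CPU time above wall time. That only happens when the work is spread over several cores, for example by a multithreaded numerical library.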
As for your second question, your results suggest that your calculation uses an average of about 5.6 cores (2 min 16 s = 136 s of CPU time divided by 24.2 s of wall time). You are probably using some module that parallelizes with numba, as that is the most common case in physics.
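The estimate of 5.6 cores is just the ratio of the two reported times. As a quick check, using the numbers from the question (the variable names are illustrative):

```python
# Values reported by %%time in the question.
cpu_seconds = 2 * 60 + 16    # "CPU times: total: 2min 16s"
wall_seconds = 24.2          # "Wall time: 24.2s"

# Average number of cores kept busy over the run.
average_cores = cpu_seconds / wall_seconds
print(f"average cores in use: {average_cores:.1f}")
```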
Coming back to your first question, I would advise you to run your cell through the profiler using
%prun
This way you will see where your program spends the most time and which parts need to be optimized.
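Outside a notebook, a similar report can be produced with the standard-library `cProfile` and `pstats` modules, which, as far as I know, is what `%prun` builds on. The functions `slow_part`, `fast_part`, and `algorithm` below are hypothetical stand-ins for your own code, so treat this as a sketch rather than a recipe.

```python
import cProfile
import io
import pstats


def slow_part(n):
    # Deliberately slow: a Python-level loop over n values.
    return sum(i * i for i in range(n))


def fast_part(n):
    # Closed-form expression, effectively instantaneous.
    return n * (n - 1) // 2


def algorithm():
    return slow_part(200_000) + fast_part(200_000)


profiler = cProfile.Profile()
profiler.enable()
result = algorithm()
profiler.disable()

# Collect the report in a string, sorted by cumulative time per function.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

In the printed table, `slow_part` should account for nearly all of the cumulative time, which points to it as the part worth optimizing.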
1 - CPU time can be subdivided into several components (for example user time and system time). If you don't know exactly how the CPU time is broken down, you should use wall time, also known as elapsed real time.
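2 - There can be many reasons for a gap between the two. Wall time is simply finish time minus start time, so it also includes idle time, time spent waiting for system resources to become available, and so on. CPU time counts only the time the processor actually spent executing the code, summed over all cores, which is why it can exceed wall time when the work runs in parallel.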
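If wall time is the quantity you settle on, the standard-library `timeit` module is one way to measure it repeatably outside a notebook. A minimal sketch, assuming a hypothetical `toy_algorithm` in place of your own function:

```python
import timeit


def toy_algorithm(n=10_000):
    # Hypothetical stand-in for the code being optimized.
    return sorted(range(n, 0, -1))


calls_per_run = 20
# Five independent runs of 20 calls each; timeit uses a wall-clock timer by default.
runs = timeit.repeat(toy_algorithm, number=calls_per_run, repeat=5)

# The minimum is usually the least disturbed by other activity on the machine.
best_per_call = min(runs) / calls_per_run
print(f"best of {len(runs)} runs: {best_per_call * 1e6:.1f} microseconds per call")
```

Taking the minimum over several runs, rather than the mean, is a common choice because slower runs mostly reflect interference from other processes and not the algorithm itself.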