nvidia-smi only provides a few metrics for measuring GPU utilization. Most importantly, utilization.gpu represents the percentage of time over the past sample period during which one or more kernels were executing on the GPU. Thus, it seems that a value of 100% does not necessarily indicate "full" GPU usage, since a single small kernel running continuously would already be enough to report it.
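To make the concern concrete, here is a toy model of that definition. It is only a sketch: the kernel timings, the per-kernel SM counts, and the 80-SM device are made-up numbers, and `sm_weighted_utilization` is a hypothetical "how full is the GPU" measure for comparison, not something nvidia-smi reports or the way the driver computes anything.

```python
from typing import List, Tuple

# One hypothetical kernel launch: (start_ms, end_ms, sms_used).
Kernel = Tuple[float, float, int]


def time_utilization(kernels: List[Kernel], period_ms: float) -> float:
    """Percent of the sample period during which at least one kernel ran.

    This mirrors the wording of the utilization.gpu definition: overlapping
    kernel intervals are merged and only the covered time is counted.
    """
    busy = 0.0
    cur_start = cur_end = None
    for start, end in sorted((s, e) for s, e, _ in kernels):
        if cur_end is None or start > cur_end:
            # Gap before this kernel: close the previous busy interval.
            if cur_end is not None:
                busy += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or back-to-back kernel: extend the busy interval.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        busy += cur_end - cur_start
    return 100.0 * busy / period_ms


def sm_weighted_utilization(kernels: List[Kernel], period_ms: float,
                            total_sms: int) -> float:
    """Percent of available SM-time occupied (illustrative measure only)."""
    sm_time = sum((end - start) * sms for start, end, sms in kernels)
    return 100.0 * sm_time / (period_ms * total_sms)


# Assumed workload: two tiny kernels back to back, filling a 1000 ms period
# on an assumed 80-SM device.
example_kernels = [(0, 500, 1), (500, 1000, 2)]
time_pct = time_utilization(example_kernels, period_ms=1000)
sm_pct = sm_weighted_utilization(example_kernels, period_ms=1000, total_sms=80)
print(f"time-based: {time_pct:.1f}%  SM-weighted: {sm_pct:.3f}%")
```

Under these assumptions the time-based figure is 100% while fewer than 2% of the SM-time is actually occupied, which is the gap I would like a tool to expose.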
Alternatively, Nsight Compute provides many detailed metrics, but I found that it runs very slowly even on small neural networks, so it does not seem suited to this use case. Another option seems to be DLProf, but this again provides only rather coarse-grained metrics such as "GPU utilization" and "Tensor Core Efficiency", whose definitions I could not find.
Therefore, is there another tool (or parameter) which provides detailed GPU usage metrics?