Timing CUDA operations
I need to time a CUDA kernel execution. The Best Practices Guide says that we can use either events or standard timing functions like clock() in Windows. My problem is that using these two functions gives me totally different results.
In fact, the result given by events seems to be huge compared to the actual speed in practice.
What I actually need all this for is to be able to predict the running time of a computation by first running a reduced version of it on a smaller data set. Unfortunately, the results of this benchmark are totally unrealistic, being either too optimistic (clock()) or waaaay too pessimistic (events).
You could do something along the lines of:
or:
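A minimal sketch of the two usual options, host wall-clock timing around an explicit synchronize and CUDA event timing, might look like the following. This is not the answerer's original code: the kernel busyKernel and the helpers timeWithHostClockMs and timeWithEventsMs are hypothetical names made up for illustration, and error checking is left out for brevity.

```cuda
#include <chrono>
#include <cuda_runtime.h>

// Hypothetical kernel, present only so there is something to time.
__global__ void busyKernel(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = data[i];
        for (int k = 0; k < 1000; ++k) x = x * 1.0001f + 0.5f;
        data[i] = x;
    }
}

// Option 1: host wall clock. Kernel launches are asynchronous, so the host
// has to wait for the kernel before it reads the clock the second time.
float timeWithHostClockMs(float *d_data, int n) {
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    cudaDeviceSynchronize();  // drain earlier work so it is not counted
    auto t0 = std::chrono::steady_clock::now();
    busyKernel<<<blocks, threads>>>(d_data, n);
    cudaDeviceSynchronize();  // wait until the kernel has finished
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<float, std::milli>(t1 - t0).count();
}

// Option 2: CUDA events recorded in the same (default) stream as the kernel.
float timeWithEventsMs(float *d_data, int n) {
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start);
    busyKernel<<<blocks, threads>>>(d_data, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);  // block the host until 'stop' is reached
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);  // result is in milliseconds
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}
```

Both helpers expect a device buffer of n floats allocated with cudaMalloc. The first launch in a process usually also pays one-time initialization costs, so it is common to run the kernel once before taking measurements.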
There is an out-of-the-box GpuTimer struct you can use:
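The struct itself is not reproduced here. As a rough idea of what such an event-based timer tends to look like, here is a small sketch; the member names (Start, Stop, ElapsedMs) and the helper timeMemsetMs are assumptions made for illustration, not necessarily those of the struct the answer refers to.

```cuda
#include <cuda_runtime.h>

// Sketch of an event-based timer. Member names are assumed, not taken
// from any particular library.
struct GpuTimer {
    cudaEvent_t start_;
    cudaEvent_t stop_;

    GpuTimer() {
        cudaEventCreate(&start_);
        cudaEventCreate(&stop_);
    }
    ~GpuTimer() {
        cudaEventDestroy(start_);
        cudaEventDestroy(stop_);
    }
    // Copying would destroy the same events twice, so forbid it.
    GpuTimer(const GpuTimer &) = delete;
    GpuTimer &operator=(const GpuTimer &) = delete;

    void Start(cudaStream_t stream = 0) { cudaEventRecord(start_, stream); }
    void Stop(cudaStream_t stream = 0) { cudaEventRecord(stop_, stream); }

    // Waits for the stop event, then returns the elapsed time in milliseconds.
    float ElapsedMs() {
        cudaEventSynchronize(stop_);
        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start_, stop_);
        return ms;
    }
};

// Hypothetical usage: time an asynchronous memset on the default stream.
float timeMemsetMs(void *d_buf, size_t bytes) {
    GpuTimer timer;
    timer.Start();
    cudaMemsetAsync(d_buf, 0, bytes, 0);
    timer.Stop();
    return timer.ElapsedMs();
}
```

Anything to be measured goes between Start() and Stop(), and it should be issued on the same stream the events are recorded in.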
A satisfactory answer has already been given to your question.
I have constructed classes for timing C/C++ as well as CUDA operations and want to share them with others, hoping they could be helpful to future users. You will just need to add the 4 files reported below to your project and #include the two header files, TimingCPU.h and TimingGPU.cuh. The two classes can be used as follows.
Timing CPU section
Timing GPU section
In both cases, the timing is in milliseconds. Also, the two classes can be used under Linux or Windows.
Here are the 4 files:
TimingCPU.cpp
TimingCPU.h
TimingGPU.cu
TimingGPU.cuh
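The contents of the four files are not reproduced here. For the CPU side, one portable way to get millisecond timings on both Linux and Windows is std::chrono; the sketch below assumes that approach and uses made-up names (CpuTimerSketch, timeHostLoopMs), so it should not be read as the answer's TimingCPU class.

```cuda
#include <chrono>

// Hypothetical CPU timer, host code only. std::chrono::steady_clock is
// monotonic and available on both Linux and Windows.
class CpuTimerSketch {
public:
    // Remember the current time as the start of the measured section.
    void start() { t0_ = std::chrono::steady_clock::now(); }

    // Milliseconds elapsed since the last call to start().
    double elapsedMs() const {
        auto now = std::chrono::steady_clock::now();
        return std::chrono::duration<double, std::milli>(now - t0_).count();
    }

private:
    std::chrono::steady_clock::time_point t0_;
};

// Hypothetical usage: time a plain host loop.
double timeHostLoopMs(int n) {
    CpuTimerSketch timer;
    timer.start();
    volatile double acc = 0.0;  // volatile keeps the loop from being optimized away
    for (int i = 0; i < n; ++i) acc = acc + i * 0.5;
    return timer.elapsedMs();
}
```

If a timer like this is wrapped around a kernel launch, cudaDeviceSynchronize() has to be called before reading elapsedMs(), because the launch returns to the host before the kernel has finished.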
If you want to measure GPU time you pretty much have to use events. There's a great discussion thread on the do's and don'ts of timing your application over on the NVIDIA forums.
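One reason a host clock can look far too optimistic is that a kernel launch returns to the host right away, so without a synchronize the clock mostly sees the launch overhead. The sketch below measures the same launch both ways; spinKernel, Timings and compareTimings are hypothetical names, and error checking is omitted.

```cuda
#include <chrono>
#include <cuda_runtime.h>

// Hypothetical kernel that does 'iters' rounds of arithmetic per element.
__global__ void spinKernel(float *data, int n, int iters) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = data[i];
        for (int k = 0; k < iters; ++k) x = x * 1.0001f + 0.5f;
        data[i] = x;
    }
}

struct Timings {
    float hostNoSyncMs;  // host clock read right after the launch, no sync
    float eventMs;       // elapsed time between the two CUDA events
};

Timings compareTimings(float *d_data, int n, int iters) {
    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    // Warm-up launch so one-time initialization is not part of the result.
    spinKernel<<<blocks, threads>>>(d_data, n, iters);
    cudaDeviceSynchronize();

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    auto t0 = std::chrono::steady_clock::now();
    cudaEventRecord(start);
    spinKernel<<<blocks, threads>>>(d_data, n, iters);
    cudaEventRecord(stop);
    // The host usually gets here while the kernel is still running.
    auto t1 = std::chrono::steady_clock::now();

    cudaEventSynchronize(stop);  // now wait for the kernel to finish
    Timings t;
    t.hostNoSyncMs = std::chrono::duration<float, std::milli>(t1 - t0).count();
    cudaEventElapsedTime(&t.eventMs, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return t;
}
```

For a kernel that runs long enough, hostNoSyncMs typically comes out much smaller than eventMs, which would match the "too optimistic" clock() numbers described in the question.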
You can use the Compute Visual Profiler, which will be great for your purpose. It measures the time of every CUDA function and tells you how many times you called it.