Double precision floating point in CUDA
Does CUDA support double precision floating point numbers?
Also, what are the reasons for this?
Answers (4)
If your GPU has compute capability 1.3, then you can do double precision. You should be aware, though, that 1.3 hardware has only one double precision FP unit per multiprocessor (MP), which has to be shared by all the threads on that MP, whereas there are 8 single precision FPUs, so each active thread has its own single precision FPU. In other words, you may well see 8x worse performance with double precision than with single precision.
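As a rough illustration of the compute capability check described above, the following sketch queries each device through the CUDA runtime and reports whether it meets the 1.3 threshold. The helper name supports_double is hypothetical and exists only for this example; the runtime calls (cudaGetDeviceCount, cudaGetDeviceProperties) are the standard ones.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CUDA): native double precision
// is available from compute capability 1.3 onward.
bool supports_double(int major, int minor) {
    return major > 1 || (major == 1 && minor >= 3);
}

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        std::printf("No CUDA device found\n");
        return 1;
    }
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) {
            continue;  // skip devices we cannot query
        }
        std::printf("Device %d (%s): compute capability %d.%d, double precision %s\n",
                    dev, prop.name, prop.major, prop.minor,
                    supports_double(prop.major, prop.minor) ? "supported"
                                                            : "not supported");
    }
    return 0;
}
```

This only tells you whether double precision is available at all; it says nothing about how fast it is relative to single precision on a given device.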
As a tip: if you want to use double precision, you have to set the GPU architecture to sm_13 (if your GPU supports it). Otherwise the compiler will still convert all doubles to floats and give only a warning (as seen in faya's post).
(Very annoying if you get an error because of this :-) )
The flag is:
-arch=sm_13
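To make the tip concrete, here is a minimal sketch of a double precision kernel together with the build line that uses the flag. The file name daxpy.cu, the kernel, and the daxpy_reference helper are illustrative assumptions, and the build line only applies to toolkits that still accept sm_13 as a target (recent toolkits no longer target sm_1x devices).

```cuda
// Build (assumed file name; flag taken from the answer):
//   nvcc -arch=sm_13 daxpy.cu -o daxpy
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Host reference for a single element, used to check the GPU result.
double daxpy_reference(double a, double x, double y) { return a * x + y; }

// y[i] = a * x[i] + y[i], computed in double precision on the device.
__global__ void daxpy(int n, double a, const double *x, double *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1024;
    const size_t bytes = n * sizeof(double);
    double *hx = (double *)std::malloc(bytes);
    double *hy = (double *)std::malloc(bytes);
    if (!hx || !hy) return 1;
    for (int i = 0; i < n; ++i) { hx[i] = 1.0 / 3.0; hy[i] = 1.0; }

    double *dx = NULL, *dy = NULL;
    if (cudaMalloc((void **)&dx, bytes) != cudaSuccess ||
        cudaMalloc((void **)&dy, bytes) != cudaSuccess) {
        std::printf("cudaMalloc failed\n");
        return 1;
    }
    cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy, bytes, cudaMemcpyHostToDevice);

    daxpy<<<(n + 255) / 256, 256>>>(n, 2.0, dx, dy);
    cudaError_t err = cudaMemcpy(hy, dy, bytes, cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // If doubles were demoted to floats, the GPU value differs from the host one.
    double expected = daxpy_reference(2.0, 1.0 / 3.0, 1.0);
    std::printf("gpu = %.17g, host = %.17g, %s\n", hy[0], expected,
                hy[0] == expected ? "match" : "MISMATCH (demoted to float?)");

    cudaFree(dx); cudaFree(dy);
    std::free(hx); std::free(hy);
    return 0;
}
```

The scale factor 2.0 is chosen so that the multiplication is exact, which keeps the host and device results comparable bit for bit regardless of whether the device fuses the multiply and add.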
Following on from Paul R's comments, Compute Capability 2.0 devices (aka Fermi) have much improved double-precision support, with performance only half that of single-precision.
This Fermi whitepaper has more details about the double-precision performance of the new devices.
As mentioned by others, older CUDA cards don't support the double type. But if you want more precision than your old GPU provides, you can use the float-float solution, which is similar to the double-double technique. For more information about that technique read …
Of course, on modern GPUs you can also use double-double to achieve a precision greater than double.
double-double is also used for long double on PowerPC.
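A minimal sketch of the float-float idea, assuming the usual error-free "two-sum" building block: a value is stored as an unevaluated sum of two floats (hi + lo), and addition carries the rounding error of the high parts into the low part. The names ff, two_sum, ff_add and add_kernel are hypothetical, and this shows only addition, not a complete float-float library.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical float-float type: the value represented is hi + lo.
struct ff { float hi; float lo; };

// Error-free addition: returns s = fl(a + b) in hi and the exact
// rounding error of that addition in lo.
__host__ __device__ inline ff two_sum(float a, float b) {
    float s = a + b;
    float bb = s - a;
    float err = (a - (s - bb)) + (b - bb);
    ff r; r.hi = s; r.lo = err;
    return r;
}

// Float-float addition: add the high parts exactly, fold in the low
// parts, then renormalize so that lo stays small relative to hi.
__host__ __device__ inline ff ff_add(ff x, ff y) {
    ff s = two_sum(x.hi, y.hi);
    float lo = s.lo + (x.lo + y.lo);
    float hi = s.hi + lo;
    lo = lo - (hi - s.hi);
    ff r; r.hi = hi; r.lo = lo;
    return r;
}

__global__ void add_kernel(ff a, ff b, ff *out) {
    *out = ff_add(a, b);
}

int main() {
    ff a; a.hi = 1.0f;  a.lo = 0.0f;
    ff b; b.hi = 1e-8f; b.lo = 0.0f;  // too small to survive a plain float add to 1.0f

    ff *d_out = NULL;
    if (cudaMalloc((void **)&d_out, sizeof(ff)) != cudaSuccess) {
        std::printf("cudaMalloc failed\n");
        return 1;
    }
    add_kernel<<<1, 1>>>(a, b, d_out);
    ff r;
    if (cudaMemcpy(&r, d_out, sizeof(ff), cudaMemcpyDeviceToHost) != cudaSuccess) {
        std::printf("kernel or copy failed\n");
        return 1;
    }
    cudaFree(d_out);

    std::printf("plain float: %.9g\n", a.hi + b.hi);
    std::printf("float-float: hi = %.9g, lo = %.9g\n", r.hi, r.lo);
    return 0;
}
```

The sketch relies on the compiler not reassociating floating point additions, so it should not be built with options that relax IEEE semantics. The same structure with double in place of float gives the double-double variant mentioned above.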