What is the difference between cl_khr_fp64 and cl_amd_fp64?
I just found that my (pretty expensive) Radeon 6970 supports only the cl_amd_fp64 extension. When running with cl_amd_fp64, I get odd results in some parts of the code (reading a value that should be 0.005 actually yields 1.99916e+37?). Using cl_khr_fp64 with the Intel SDK on the CPU works just fine. (The input buffers are exactly the same.)

The extension page gives very little information.

What exactly are the differences between the two?
Comments (1)
cl_khr_fp64 is the official Khronos double-precision floating-point extension. It requires that arithmetic be IEEE 754-2008 compliant and that the full range of OpenCL vector types and standard library functions be supported.

Initially, AMD implemented only a subset of what cl_khr_fp64 requires, so they issued their own vendor extension, cl_amd_fp64, to support double precision on their GPU hardware. When it first appeared, the range of support was very limited (perhaps only +, -, and * with non-standard rounding behaviour, IIRC), but it has slowly expanded with successive SDK releases and new hardware revisions. They list what is supported in their release notes, if my memory serves correctly.

I haven't followed their progress closely for a while, so I am not sure why you are seeing this. If you have the latest driver and Stream SDK version installed, I would suggest putting together a repro case and filing a bug report with them. It might be that you are using something they don't support or don't guarantee the results of, but it could also be that you have found a bug.
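Since which of the two extensions is available depends on the device, one way to keep a single kernel source working on both the Radeon and the CPU is to pick the pragma at run time. Below is a rough sketch in plain C, not an official recipe. It assumes you have already fetched the device's space-separated extension string (for example with `clGetDeviceInfo` and `CL_DEVICE_EXTENSIONS`). The helper names `has_extension` and `fp64_pragma` are made up for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if `name` appears as a whole space-separated token in
 * `extensions`, 0 otherwise. Matching whole tokens avoids treating
 * a longer name that merely contains `name` as a hit. */
static int has_extension(const char *extensions, const char *name)
{
    size_t len = strlen(name);
    const char *p = extensions;

    if (len == 0)
        return 0;

    while ((p = strstr(p, name)) != NULL) {
        int starts_token = (p == extensions) || (p[-1] == ' ');
        int ends_token = (p[len] == '\0') || (p[len] == ' ');
        if (starts_token && ends_token)
            return 1;
        p += len; /* keep searching after this partial match */
    }
    return 0;
}

/* Hypothetical helper: chooses the pragma line to prepend to the kernel
 * source. Prefers the Khronos extension, falls back to the AMD vendor
 * extension, and returns NULL when neither is reported. */
static const char *fp64_pragma(const char *extensions)
{
    if (has_extension(extensions, "cl_khr_fp64"))
        return "#pragma OPENCL EXTENSION cl_khr_fp64 : enable\n";
    if (has_extension(extensions, "cl_amd_fp64"))
        return "#pragma OPENCL EXTENSION cl_amd_fp64 : enable\n";
    return NULL;
}
```

You would prepend the returned line to the kernel source before building the program, and refuse to run the double-precision path if it returns NULL. This only selects which extension gets enabled. It does not make cl_amd_fp64 behave like cl_khr_fp64, so anything the vendor extension does not cover can still give different results.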