CUDA floating point precision
Can someone comment on this?
I want to compute a vector dot product. My float vectors are [2080:2131] and [2112:2163]; each contains 52 elements.
float a[52] = {2080, 2081, 2082, ..., 2129, 2130, 2131};
float b[52] = {2112, 2113, 2114, ..., 2161, 2162, 2163};
float sum = 0.0f;
for (int i = 0; i < 52; i++)
{
    sum += a[i] * b[i];
}
The sum over the whole length (52 elements) was 234038032 from my kernel, while Matlab gave 234038038. For sums over 1 to 9 elements my kernel agrees with Matlab; at 10 elements it is off by 1, and the discrepancy gradually grows from there. The results are reproducible. I checked all the elements and found no problem with them.
Since the vectors are float you are experiencing rounding errors. Matlab will store everything with much higher precision (double) and hence won't see the rounding errors so early.
You may want to check out What Every Computer Scientist Should Know About Floating-Point Arithmetic by David Goldberg - invaluable reading.
Simple demo in C++ (i.e. nothing to do with CUDA):
Run this and you get a float sum of 234038032 alongside the double sum of 234038038, exactly the pair of numbers in the question.
So what can you do about this? There are several directions you could go in...