Float values changing. Don't know why
I have two files.
- TreeSearch.cpp
- TreeSearchCUDA.cu
In TreeSearch.cpp I have:
int* searchTree(vector<TreeNode> &tree, vector<ImageFeature> featureList)
{
    float** features = makeMatrix(featureList, CHILDREN);
    float* featuresArray = makeArray(features, featureList.size());
    float* centroidNodes = convertTree(tree);
    int numFeatures = featureList.size();
    for(int j = 0; j < 10; j++)
    {
        cout << "C++ " << centroidNodes[j] << endl;
    }
    cout << "" << endl;
    int* votes = startSearch(centroidNodes, tree.size(), featuresArray, numFeatures);
    return votes;
}
startSearch exists in TreeSearchCUDA.cu which looks like this:
int* startSearch(float* centroids, int nodesCount, float* features, int featuresCount)
{
    for(int j = 0; j < 10; j++)
    {
        printf("CUDA %f \n", centroids[j]);
    }
    ...
}
Now if we look at the output it looks like this:
C++ 0
C++ 2.52435e-29
C++ 0
C++ 2.52435e-29
C++ 6.72623e-44
C++ 1.26117e-44
C++ 2.03982e+12
C++ 4.58477e-41
C++ 0
C++ 1.26117e-44
CUDA 0.000000
CUDA 0.000000
CUDA 0.000000
CUDA 0.000000
CUDA 0.000000
CUDA 0.000000
CUDA 2039820058624.000000
CUDA 0.000000
CUDA 0.000000
CUDA 0.000000
The results are not the same. Does anyone have any ideas? :)
My guess is that it happens because some parts of the code are compiled with -m64 and some parts are not. However, it is not possible to change this. When linking the objects I use -m64.
I hope someone has a solution or explanation :)
Comments (3)
From your output, it looks like CUDA is approximating really small floats to 0. All your inputs are really small floats or 0, except 2.03982e+12, which remains the same in the output. Are your centroids supposed to be really small?
floats are not exact. You can use the header <limits> to find out how many digits of a (decimal) float you can safely use and that are guaranteed to remain unchanged. On my system this reports 6, which means I can be sure that a float will be exact when using up to 6 decimal digits. A number with more digits is not considered exact; numeric_limits<float> doesn't guarantee that those digits will stick and remain 100% unchanged.
Another thing I might add is that printf doesn't print floating-point values the way std::ostream does; there are often internal differences in how floating-point values are handled and printed. I don't think printf with %f as default "trusts" as many decimals as std::ostream in scientific representation.
According to http://developer.download.nvidia.com/assets/cuda/files/NVIDIA-CUDA-Floating-Point.pdf, in "Devices with compute capability 1.2" [or 1.3 for floats], "Denormal numbers (small numbers close to zero) are flushed to zero." Denormal numbers are the ones between ~1.4e−45 and ~1.18e−38 (http://en.wikipedia.org/wiki/Single-precision_floating-point_format), so that explains how your e-41 and e-44 numbers wound up at 0. I'm not sure about the e-29 ones, but something similar could have happened.