_mm_mul_ps fails to multiply 10001 by 10001 correctly, but works for 10000 times 10000

Posted on 2024-11-29 06:03:08

I have a very simple program that multiplies four numbers. It works fine when each of them is 10000, but not when I change them to 10001: the result is off by one.

I compiled the program with gcc -msse2 main_sse.c -o sse on both an AMD Opteron and an Intel Xeon and got the same result on both machines.

I would appreciate any help. I couldn't find anything online on this topic.

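#include <stdio.h>
#include <xmmintrin.h>

int main(void) {
    /* _mm_load_ps and _mm_store_ps require 16-byte-aligned addresses. */
    _Alignas(16) float x[4], y[4], temp[4];
    int i;
    __m128 X, Y, result;

    /* Case 1: 10000 * 10000 in all four lanes. */
    for (i = 0; i < 4; i++) { x[i] = 10000; y[i] = 10000; }

    X = _mm_load_ps(&x[0]);
    Y = _mm_load_ps(&y[0]);
    result = _mm_mul_ps(X, Y);
    _mm_store_ps(&temp[0], result);
    /* Print the four products so they can be inspected. */
    for (i = 0; i < 4; i++) printf("%.1f\n", temp[i]);

    /* Case 2: 10001 * 10001 in all four lanes. */
    for (i = 0; i < 4; i++) { x[i] = 10001; y[i] = 10001; }

    X = _mm_load_ps(&x[0]);
    Y = _mm_load_ps(&y[0]);
    result = _mm_mul_ps(X, Y);
    _mm_store_ps(&temp[0], result);
    for (i = 0; i < 4; i++) printf("%.1f\n", temp[i]);

    return 0;
}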
