aarch64-gcc SIMD inline asm: result is always 0

Posted on 2025-01-30 08:07:42

I am trying to do SIMD multiplication with inline assembler. However, the result is always zero or (in other cases) takes values that are incomprehensible to me.

#include <stdio.h>

int main(void)
{
        double x[2] = {2.0, 3.0};
        double y[2] = {0.0, 0.0};

        asm volatile (
              "fmul %[y].2d, %[x].2d, %[x].2d\n"
        : /* outputs */
          [y] "=&w" (y)
        : /* inputs */
          [x] "w" (x)
        : /* clobbers */
          "cc"
        );

        printf("result = (%f, %f)\n",
               y[0], y[1]);

        return 0;
}

Compiled with

aarch64-linux-gnu-gcc -mcpu=cortex-a73 -march='armv8-a'

I always get the output

result = (0.000000, 0.000000)

but I would expect (4.0, 9.0). Please help!

尛丟丟 2025-02-06 08:07:42

As Jester said, you have to pass a value to the asm statement, not a pointer to the datum in question. The correct type for this value is float64x2_t from arm_neon.h. So proceed as follows:

#include <stdio.h>
#include <arm_neon.h>

int main(void)
{
        double x[2] = {2.0, 3.0};
        double y[2] = {0.0, 0.0};

        asm volatile (
              "fmul %[y].2d, %[x].2d, %[x].2d\n"
        : /* outputs */
          [y] "=&w" (*(float64x2_t *)y)
        : /* inputs */
          [x] "w" (*(float64x2_t *)x)
        : /* clobbers */
          "cc"
        );

        printf("result = (%f, %f)\n",
               y[0], y[1]);

        return 0;
}
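
A variant of the same idea, as a minimal sketch rather than a definitive fix: instead of casting the `double` arrays to `float64x2_t *`, the values can be loaded into `float64x2_t` locals with `vld1q_f64` and stored back with `vst1q_f64`, so the asm statement only ever sees vector values. The helper name `square2_asm` and the scalar fallback for non-AArch64 targets are assumptions added for illustration; they are not part of the original answer. The sketch also drops the `"cc"` clobber and the early-clobber modifier, on the assumption that a plain `fmul` does not touch the condition flags and reads its inputs before writing its result.

```c
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not from the original post):
 * squares two doubles lane by lane. */
static void square2_asm(const double in[2], double out[2])
{
#if defined(__aarch64__)
        /* Load both doubles into a vector value; no pointer cast needed. */
        float64x2_t vx = vld1q_f64(in);
        float64x2_t vy;

        /* __asm__ is the GNU spelling that also works under strict -std=c99. */
        __asm__ ("fmul %[y].2d, %[x].2d, %[x].2d"
                 : /* outputs */ [y] "=w" (vy)
                 : /* inputs  */ [x] "w" (vx));

        /* Store the two result lanes back to memory. */
        vst1q_f64(out, vy);
#else
        /* Scalar fallback so the sketch also builds on other targets. */
        out[0] = in[0] * in[0];
        out[1] = in[1] * in[1];
#endif
}
```

Called with `{2.0, 3.0}` as input, this fills the output array with the squares the question expects, (4.0, 9.0).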

Note that when you include the intrinsics header, you might as well just use intrinsics directly:

int bar(void)
{
        double x[2] = {2.0, 3.0};
        double y[2] = {0.0, 0.0};
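        /* explicit casts: double * does not convert implicitly to float64x2_t * */
        float64x2_t *xx = (float64x2_t *)x, *yy = (float64x2_t *)y;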

        *yy = vmulq_f64(*xx, *xx);

        printf("result = (%f, %f)\n",
               y[0], y[1]);

        return 0;
}
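
The intrinsics version can also be written without any pointer conversion. The following is a hedged sketch, not the answerer's code: it uses the `vld1q_f64` and `vst1q_f64` intrinsics from `arm_neon.h` to move data between the `double` arrays and a `float64x2_t` value, which sidesteps any question about whether a `double[2]` satisfies the alignment that `float64x2_t` may require. The function name `square2_neon` and the scalar fallback branch are assumptions made for illustration.

```c
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not from the original post):
 * squares two doubles using NEON intrinsics only. */
static void square2_neon(const double in[2], double out[2])
{
#if defined(__aarch64__)
        float64x2_t v = vld1q_f64(in);      /* load in[0], in[1] into one vector */
        vst1q_f64(out, vmulq_f64(v, v));    /* lane-wise multiply, then store    */
#else
        /* Scalar fallback so the sketch also builds on other targets. */
        out[0] = in[0] * in[0];
        out[1] = in[1] * in[1];
#endif
}
```

With intrinsics the compiler chooses the registers and schedules the instruction itself, so no constraint strings or clobber lists have to be kept correct by hand.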