SIMD constant floats

Posted 2024-11-18 10:33:45 · 724 characters · 3 views · 0 comments

I've been trying my hand at optimising some code I have using Microsoft's SSE intrinsics. One of the biggest problems when optimising my code is the LHS (load-hit-store) that happens whenever I want to use a constant. There seems to be some info on generating certain constants (here and here - section 13.4), but it's all assembly (which I would rather avoid).

The problem is that when I try to implement the same thing with intrinsics, MSVC complains about incompatible types etc. Does anyone know of any equivalent tricks using intrinsics?

Example - Generate {1.0,1.0,1.0,1.0}

//pcmpeqw xmm0,xmm0 
__m128 t = _mm_cmpeq_epi16( t, t );

//pslld xmm0,25 
_mm_slli_epi32(t, 25);

//psrld xmm0,2
return _mm_srli_epi32(t, 2);

This generates a bunch of errors about incompatible types (__m128 vs __m128i). I'm pretty new to this, so I'm pretty sure I'm missing something obvious. Can anyone help?

tl;dr - How do I generate an __m128 vector filled with a single-precision float constant using the MS intrinsics?

Thanks for reading :)

笙痞 2024-11-25 10:33:45

Try _mm_set_ps, _mm_set_ps1 (https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm_set_ps1&expand=4584,4587) or _mm_set1_ps (https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm_set1_ps&expand=4584,4587,4634,4587,4634).
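
To make the suggestion concrete, here is a minimal sketch of both routes. The function names ones_set1 and ones_generated are made up for illustration and do not come from the question. The first function simply broadcasts 1.0f with _mm_set1_ps. The second mirrors the pcmpeqw / pslld / psrld sequence from the question, but does the bit manipulation in an __m128i and only reinterprets the result as __m128 at the end with _mm_castsi128_ps, which is the step the question's code was missing. It also starts from a zeroed register rather than comparing an uninitialised variable with itself.

```cpp
#include <emmintrin.h>  // SSE2: __m128i, integer intrinsics, cast intrinsics

// Route 1: broadcast a single float to all four lanes.
inline __m128 ones_set1() {
    return _mm_set1_ps(1.0f);
}

// Route 2: build the bit pattern of 1.0f (0x3F800000) without a memory constant.
// The integer intrinsics take and return __m128i, so the work stays in that type.
inline __m128 ones_generated() {
    __m128i t = _mm_setzero_si128();  // defined starting value
    t = _mm_cmpeq_epi16(t, t);        // pcmpeqw: every bit set (0xFFFFFFFF per lane)
    t = _mm_slli_epi32(t, 25);        // pslld 25: 0xFE000000 per lane
    t = _mm_srli_epi32(t, 2);         // psrld 2:  0x3F800000 per lane
    return _mm_castsi128_ps(t);       // reinterpret the bits as four floats
}
```

_mm_castsi128_ps generates no instruction; it only changes the type the compiler sees. Note also that an optimising compiler may fold either version into whatever constant load it prefers, so it is worth checking the generated assembly before assuming the second form avoids a memory access.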