Fast double -> short conversion with clamping using SSE?

Posted 2024-09-05 14:11:39

Is there a fast way to cast double values to shorts (16-bit signed)? Currently I'm doing something like this:

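/* INT16_MAX and INT16_MIN come from <stdint.h> */
double  dval = <sum junk>;
int16_t sval;
if (dval > INT16_MAX) {
   sval = INT16_MAX;       /* too large: clamp to the maximum */
} else if (dval < INT16_MIN) {
   sval = INT16_MIN;       /* too small: clamp to the minimum */
} else
   sval = (int16_t)dval;   /* in range: plain truncating cast */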

I suspect there's a fast way to do this using SSE that will be significantly more efficient.

仲春光 2024-09-12 14:11:40

Look up minsd, maxsd and cvtsd2si, or if you want to do 2 in parallel then use minpd, maxpd and cvtpd2dq.

The only real bonus of using the first method is that you save the branches. The resulting code will be pretty much as fast as ordinary double-using code compiled to SSE2 anyway. The real win comes from doing 2 of them at a time.
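As a rough illustration of the two-at-a-time route, here is a minimal sketch using the SSE2 intrinsics that correspond to minpd, maxpd and cvtpd2dq. The helper name `clamp2_to_int16` and the array-based interface are assumptions made for the example; they are not from the answer.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Clamp two doubles to the int16_t range and convert both in one pass.
   clamp2_to_int16 is a hypothetical helper name used for illustration. */
static void clamp2_to_int16(const double in[2], int16_t out[2])
{
    int32_t tmp[4];
    __m128d v = _mm_loadu_pd(in);                 /* load both doubles      */
    v = _mm_min_pd(v, _mm_set1_pd(32767.0));      /* minpd: upper clamp     */
    v = _mm_max_pd(v, _mm_set1_pd(-32768.0));     /* maxpd: lower clamp     */
    __m128i i32 = _mm_cvtpd_epi32(v);             /* cvtpd2dq: 2 x int32    */
    _mm_storeu_si128((__m128i *)tmp, i32);        /* results in tmp[0..1]   */
    out[0] = (int16_t)tmp[0];                     /* already in range, so   */
    out[1] = (int16_t)tmp[1];                     /* the narrowing is exact */
}
```

Calling it with `{ 1e9, -1e9 }` should yield `32767` and `-32768`, while in-range whole numbers pass through unchanged. For long arrays you would process the input in pairs inside a loop.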

Edit: If you wanted to do it using Visual Studio intrinsics then I believe the code would look like the following:

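 __m128d sseDbl = _mm_set_sd( dbl );                             // put the double in the low lane
 sseDbl         = _mm_min_sd( sseDbl, _mm_set_sd( 32767.0 ) );   // clamp to the int16 maximum
 sseDbl         = _mm_max_sd( sseDbl, _mm_set_sd( -32768.0 ) );  // clamp to the int16 minimum
 short shrtVal  = (short)_mm_cvtsd_si32( sseDbl );               // convert to int, then narrow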

And job done. Doing it in assembler is pretty similar, but the intrinsics above should give you better performance with Visual Studio.
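One detail worth knowing: `_mm_cvtsd_si32` rounds according to the current MXCSR rounding mode (round-to-nearest by default), whereas the plain C cast in the question truncates toward zero. If the truncating behaviour matters, `_mm_cvttsd_si32` is the truncating variant. A minimal sketch wrapped as a function, where the name `clamp_to_int16` is an assumption made for the example:

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Branch-free clamp of a double to the int16_t range.
   clamp_to_int16 is a hypothetical helper name used for illustration. */
static int16_t clamp_to_int16(double d)
{
    __m128d v = _mm_set_sd(d);                   /* value in the low lane   */
    v = _mm_min_sd(v, _mm_set_sd(32767.0));      /* minsd: upper clamp      */
    v = _mm_max_sd(v, _mm_set_sd(-32768.0));     /* maxsd: lower clamp      */
    return (int16_t)_mm_cvttsd_si32(v);          /* truncate toward zero,   */
                                                 /* matching the C cast     */
}
```

With this version `clamp_to_int16(1.9)` gives `1` and `clamp_to_int16(-1.9)` gives `-1`, the same as `(int16_t)` applied to an in-range double.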
