SSE SIMD floor/ceil

Posted on 2024-10-21 12:36:55

Can anyone suggest a fast way to compute float floor/ceil using pre-SSE4.1 SIMD? I need to handle all the corner cases correctly, e.g. a float value that is not representable as a 32-bit int.

Currently I'm using code similar to the following (I use C intrinsics; converted to asm here for clarity):

;make many copies of the data
movaps       xmm0,   [float_value]
movaps       xmm1,   xmm0
movaps       xmm2,   xmm0

;build a mask of the lanes whose exponent is too large for a 32-bit int
andps        xmm1,   [exp_mask]
pcmpgtd      xmm1,   [max_exp]

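;calculate the floor(): truncate toward zero, then subtract 1 in negative lanes
;(as written, this also subtracts 1 from exact negative integers such as -2.0)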
cvttps2dq    xmm3,   xmm2
psrld        xmm2,   31
psubd        xmm3,   xmm2
cvtdq2ps     xmm2,   xmm3

;combine the results
andps        xmm0,   xmm1
andnps       xmm1,   xmm2
orps         xmm0,   xmm1

Is there a more efficient way to check whether the float value is too large for a 32-bit int?


猫七 2024-10-28 12:36:55

Here is some pseudocode for a single element that should be directly convertible into vector instructions:

float f;
int i = (int)f; /* 0x80000000 if out of range (as from cvtps2dq) */
if (i == 0x80000000)
    return f;
else
    return (float)i;

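For the conversion on the second line, set the MXCSR rounding mode to match the operation you want: round toward negative infinity for floor, toward positive infinity for ceil. cvtps2dq then returns 0x80000000 for any input it cannot represent, so no separate exponent check is needed. You can also test the IE (invalid operation) flag in MXCSR to detect out-of-range values.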
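
As a rough illustration, the pseudocode above could be vectorized with SSE2 intrinsics along these lines. This is only a sketch under some assumptions: the helper name floor_ps_sse2 is made up for this example, the invalid-operation exception is assumed to be masked (the default), and the rounding mode is switched and restored inside the function for simplicity.

```c
#include <emmintrin.h>  /* SSE2: _mm_cvtps_epi32, _mm_cvtepi32_ps, _mm_cmpeq_epi32 */
#include <limits.h>     /* INT_MIN, i.e. the bit pattern 0x80000000 */
#include <xmmintrin.h>  /* _MM_GET_ROUNDING_MODE, _MM_SET_ROUNDING_MODE */

/* Hypothetical helper (the name is illustrative): floor of four floats,
 * using nothing newer than SSE2. */
static __m128 floor_ps_sse2(__m128 x)
{
    /* Temporarily round toward negative infinity so that cvtps2dq
     * behaves like floor(), then restore the caller's rounding mode. */
    unsigned int saved_mode = _MM_GET_ROUNDING_MODE();
    _MM_SET_ROUNDING_MODE(_MM_ROUND_DOWN);
    __m128i i = _mm_cvtps_epi32(x);  /* 0x80000000 for NaN or out-of-range lanes */
    _MM_SET_ROUNDING_MODE(saved_mode);

    /* Converting back is exact here: every in-range result is an integer
     * that a float can represent. */
    __m128 rounded = _mm_cvtepi32_ps(i);

    /* Lanes that came back as 0x80000000 keep the original input;
     * all other lanes take the rounded value. */
    __m128 keep = _mm_castsi128_ps(_mm_cmpeq_epi32(i, _mm_set1_epi32(INT_MIN)));
    return _mm_or_ps(_mm_and_ps(keep, x), _mm_andnot_ps(keep, rounded));
}
```

For ceil, the same sketch should work with _MM_ROUND_UP in place of _MM_ROUND_DOWN. The input -2147483648.0 legitimately converts to 0x80000000 as well, but returning the input unchanged is still the right answer in that lane. Writing MXCSR is not free, and optimizing compilers are not always careful about keeping floating-point operations in order around it, so it is worth inspecting the generated code before relying on this.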


About the author

幻梦
