Most efficient way to convert a vector of uint32 to a vector of float?

Posted 2025-01-02 06:45:30

x86 does not have an SSE instruction to convert from unsigned int32 to floating point. What would be the most efficient instruction sequence for achieving this?

EDIT: To clarify, I want a vectorized version of the following scalar operation:

unsigned int x = ...
float res = (float)x;

EDIT2: Here is a naive algorithm for doing a scalar conversion.

unsigned int x = ...
float bias = 0.f;
if (x > 0x7fffffff) {
    bias = (float)0x80000000;
    x -= 0x80000000;
}
res = signed_convert(x) + bias;

同展鸳鸯锦 2025-01-09 06:45:30

Your naive scalar algorithm doesn't deliver a correctly-rounded conversion -- it will suffer from double rounding on certain inputs. As an example: if x is 0x88000081, then the correctly-rounded result of conversion to float is 2281701632.0f, but your scalar algorithm will return 2281701376.0f instead.

Off the top of my head, you can do a correct conversion as follows (it's likely possible to save an instruction somewhere):

movdqa   xmm1,  xmm0    // make a copy of x
psrld    xmm0,  16      // high 16 bits of x
pand     xmm1, [mask]   // low 16 bits of x
orps     xmm0, [onep39] // float(2^39 + high 16 bits of x)
cvtdq2ps xmm1, xmm1     // float(low 16 bits of x)
subps    xmm0, [onep39] // float(high 16 bits of x)
addps    xmm0,  xmm1    // float(x)

where the constants have the following values:

mask:   0000ffff 0000ffff 0000ffff 0000ffff
onep39: 53000000 53000000 53000000 53000000

What this does is separately convert the high- and low-halves of each lane to floating-point, then add these converted values together. Because each half is only 16 bits wide, the conversion to float does not incur any rounding. Rounding only occurs when the two halves are added; because addition is a correctly-rounded operation, the entire conversion is correctly rounded.
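
For readers who prefer intrinsics, the sequence above might be sketched in C roughly as follows. This is only an illustration, assuming an SSE2-capable x86 target; the names `u32x4_to_float_bias` and `check_bias_trick` are made up here and are not part of any library. The constant 0x53000000 is the bit pattern of the float 2^39, whose unit in the last place is 2^16, so OR-ing the high 16 bits of x into its mantissa yields 2^39 + hi * 2^16 exactly, and subtracting 2^39 leaves hi * 2^16.

```c
#include <assert.h>
#include <emmintrin.h>   /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: convert four unsigned 32-bit lanes to float,
   mirroring the assembly sequence instruction by instruction. */
static inline __m128 u32x4_to_float_bias(__m128i x)
{
    const __m128i mask   = _mm_set1_epi32(0x0000ffff);
    const __m128i onep39 = _mm_set1_epi32(0x53000000);  /* bits of 2^39 as float */

    __m128i hi = _mm_srli_epi32(x, 16);                 /* high 16 bits of x */
    __m128i lo = _mm_and_si128(x, mask);                /* low 16 bits of x */

    /* Reinterpret (2^39 bit pattern | hi) as float: 2^39 + hi * 2^16 */
    __m128 fhi = _mm_castsi128_ps(_mm_or_si128(hi, onep39));
    fhi = _mm_sub_ps(fhi, _mm_castsi128_ps(onep39));    /* hi * 2^16, exact */

    __m128 flo = _mm_cvtepi32_ps(lo);                   /* exact: value < 2^16 */
    return _mm_add_ps(fhi, flo);                        /* the only rounding step */
}

/* Small self-check on a few hand-verified inputs. */
static void check_bias_trick(void)
{
    const uint32_t in[4] = { 0u, 1u, 0x88000081u, 0xffffffffu };
    float out[4];

    __m128i v = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_ps(out, u32x4_to_float_bias(v));

    assert(out[0] == 0.0f);
    assert(out[1] == 1.0f);
    assert(out[2] == 2281701632.0f);   /* the example value from above */
    assert(out[3] == 4294967296.0f);   /* 0xffffffff rounds up to 2^32 */
}
```

The checks assume the default round-to-nearest-even mode; under another rounding mode the final addition rounds differently, though still only once.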

By contrast, your naive implementation first converts the low 31 bits to float, which incurs a rounding, then conditionally adds 2^31 to that result, which may cause a second rounding. Any time you have two separate rounding points in a conversion, unless you are exceedingly careful about how they occur, you should not expect the result to be correctly rounded.
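
To make the double rounding concrete, here is a small scalar sketch that runs both approaches on 0x88000081. It assumes float arithmetic is evaluated in single precision (as on x86-64 with SSE, i.e. `FLT_EVAL_METHOD == 0`) and the default round-to-nearest-even mode; the function names are invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* The question's scalar algorithm: convert the low 31 bits as a signed
   value (first rounding), then add 2^31 as a bias (second rounding). */
static float naive_u32_to_float(uint32_t x)
{
    float bias = 0.0f;
    if (x > 0x7fffffffu) {
        bias = 2147483648.0f;          /* 2^31, exactly representable */
        x -= 0x80000000u;
    }
    return (float)(int32_t)x + bias;
}

/* Split approach: each 16-bit half converts exactly, so only the
   final addition rounds. */
static float split_u32_to_float(uint32_t x)
{
    float hi = (float)(int32_t)(x >> 16) * 65536.0f;  /* exact: 16-bit value times 2^16 */
    float lo = (float)(int32_t)(x & 0xffffu);         /* exact: fits in 16 bits */
    return hi + lo;
}

static void check_double_rounding_example(void)
{
    const uint32_t x = 0x88000081u;   /* 2281701505 */

    /* One rounding: the low byte 0x81 is above the halfway point 0x80
       of the 256-wide spacing, so the value rounds up. */
    assert(split_u32_to_float(x) == 2281701632.0f);

    /* Two roundings: 0x08000081 first rounds down to 0x08000080
       (spacing 16), and adding 2^31 then lands exactly on a tie,
       which rounds to even, i.e. down. */
    assert(naive_u32_to_float(x) == 2281701376.0f);
}
```

The two results differ by 256, one unit in the last place for floats in the range [2^31, 2^32).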

原谅过去的我 2025-01-09 06:45:30

This is based on an example from the old but useful Apple AltiVec-SSE migration documentation, which unfortunately is no longer available at http://developer.apple.com:

#include <emmintrin.h>  // SSE2 intrinsics

static inline __m128 _mm_ctf_epu32(const __m128i v)
{
    const __m128 two16 = _mm_set1_ps(0x1.0p16f);  // 2^16 as a hex float literal

    // Avoid double rounding by doing two exact conversions
    // of high and low 16-bit segments
    const __m128i hi = _mm_srli_epi32(v, 16);
    const __m128i lo = _mm_srli_epi32(_mm_slli_epi32(v, 16), 16);
    const __m128 fHi = _mm_mul_ps(_mm_cvtepi32_ps(hi), two16);
    const __m128 fLo = _mm_cvtepi32_ps(lo);

    // do single rounding according to current rounding mode
    return _mm_add_ps(fHi, fLo);
}
