128-bit floating point binary representation error

Posted on 2025-02-01 11:04:38

Let's say we have some 128-bit floating point number, for example x = 2.6 (1.3 * 2^1 in IEEE-754).
I put it in a union like this:

union flt {
    long double flt;
    int64_t byte8[OCTALC];
} d;
d.flt = x;  /* assign through the member; a bare `d = x;` does not compile */

Then I run this to get its hexadecimal representation in memory:

void print_bytes(void *ptr, int size) 
{
    unsigned char *p = ptr;
    int i;
    for (i=0; i<size; i++) {
        printf("%02hhX ", p[i]);
    }
    printf("\n");
}

// somewhere in the code
print_bytes(&d.byte8[0], 16);

And I get something like:

66 66 66 66 66 66 66 A6 00 40 00 00 00 00 00 00

So I expected to see one of the leading (leftmost) bits set to 1 (because the exponent of 2.6 is 1), but in fact I see bits on the right set to 1 (as if the value were being treated as big-endian). If I flip the sign, the output changes to:

66 66 66 66 66 66 66 A6 00 C0 00 00 00 00 00 00

So it seems the sign bit is further to the right than I thought. And if you count the bytes, it seems only 10 bytes are used; the remaining 6 look truncated or something.
I'm trying to find out why this happens. Any help?


热鲨 2025-02-08 11:04:38

You have a number of misconceptions.

First of all, you don't have a 128-bit floating point number. long double is probably a float in the x86 extended precision format on an x86-64. This is an 80 bit (10 byte) value, which is padded to 16 bytes. (I suspect this is for alignment purposes.)
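One way to check which format long double actually uses on a given compiler is to look at LDBL_MANT_DIG from <float.h>. The sketch below is only an illustration: the function name long_double_format is made up for this example, and it assumes FLT_RADIX is 2, as it is on mainstream platforms.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical helper (not from the original post): name the long double
   format based on the width of its significand in bits.
   Assumes FLT_RADIX == 2. */
const char *long_double_format(void)
{
    /* long double must be able to represent every double value. */
    assert(LDBL_MANT_DIG >= DBL_MANT_DIG);

    switch (LDBL_MANT_DIG) {
    case 64:  return "x87 80-bit extended precision";
    case 113: return "IEEE-754 binary128";
    case 53:  return "same as double (IEEE-754 binary64)";
    default:  return "other";
    }
}
```

Printing long_double_format() next to sizeof(long double) shows the mismatch directly: with GCC on x86-64 Linux you would typically see the 80-bit format reported together with a size of 16.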

And of course, it's going to be in little-endian byte order (since this is an x86/x86-64). This doesn't refer to the order of the bits in each byte, it refers to the order of the bytes in the whole.
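To make the byte-order point concrete, here is a minimal sketch that prints a memory dump with the last byte first, so it reads the way numbers are usually written. The function name hex_most_significant_first is an assumption for this example; it works on any byte buffer and does not depend on the platform's long double.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write the n bytes at `bytes` as uppercase hex into
   `out`, starting with the LAST byte in memory (the most significant one
   on a little-endian machine). `out` needs room for 2*n + 1 characters. */
void hex_most_significant_first(const unsigned char *bytes, size_t n, char *out)
{
    size_t i;

    assert(bytes != NULL && out != NULL);
    for (i = 0; i < n; i++)
        sprintf(out + 2 * i, "%02X", (unsigned)bytes[n - 1 - i]);
    out[2 * n] = '\0';
}
```

Passing the first 10 bytes of the dump from the question (and ignoring the 6 padding bytes) yields the string 4000A666666666666666.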

And finally, the exponent is biased. An exponent of 1 isn't stored as 1. It's stored as 1+0x3FFF. This allows for negative exponents.

So we get the following:

66 66 66 66 66 66 66 A6 00 40 00 00 00 00 00 00

Demo on Compiler Explorer

If we remove the padding and reverse the bytes to better match the image in the Wikipedia page, we get

4000A666666666666666

This translates to

+0x1.4CCCCCCCCCCCCCCC × 2^(0x4000-0x3FFF)

(0xA66...6 = 0b1010 0110 0110...0110 ⇒ 0b1.0100 1100 1100...110[0] = 0x1.4CC...C)

or

+1.29999999999999999995663191310057982263970188796520233154296875 × 2^1

Decimal conversion obtained using

perl -Mv5.10 -e'
   use Math::BigFloat;
   Math::BigFloat->div_scale( 1000 );
   say
      Math::BigFloat->from_hex(  "4CCCCCCCCCCCCCCC" ) /
      Math::BigFloat->from_hex( "10000000000000000" )
'

or

perl -Mv5.10 -e'
   use Math::BigFloat;
   Math::BigFloat->div_scale( 1000 );
   say
      Math::BigFloat->from_hex( "A666666666666666" ) /
      Math::BigFloat->from_hex( "8000000000000000" )
'
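The field layout described above (64-bit significand with an explicit integer bit, 15-bit biased exponent, 1 sign bit) can also be sketched in C. This is only an illustration under stated assumptions: the names x87_fields, x87_decode and x87_unbiased_exponent are invented for this example, and the input is taken to be the 10 value bytes in little-endian order, as they appear in the dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: the three fields of an x87 80-bit extended value. */
struct x87_fields {
    unsigned sign;         /* 1 bit */
    unsigned exponent;     /* 15 bits, stored with a bias of 0x3FFF */
    uint64_t significand;  /* 64 bits, explicit integer bit on top */
};

/* Decode 10 bytes given in little-endian order (lowest address first). */
struct x87_fields x87_decode(const unsigned char *bytes)
{
    struct x87_fields f;
    uint64_t sig = 0;
    unsigned top;
    int i;

    assert(bytes != NULL);

    /* Bytes 0..7 hold the significand, least significant byte first. */
    for (i = 7; i >= 0; i--)
        sig = (sig << 8) | bytes[i];
    f.significand = sig;

    /* Bytes 8..9 hold the exponent (low 15 bits) and the sign (top bit). */
    top = (unsigned)bytes[8] | ((unsigned)bytes[9] << 8);
    f.exponent = top & 0x7FFFu;
    f.sign = top >> 15;
    return f;
}

/* Remove the bias to get the power of two. */
int x87_unbiased_exponent(struct x87_fields f)
{
    return (int)f.exponent - 0x3FFF;
}
```

For the bytes in the question this gives sign 0, stored exponent 0x4000 (so a power of 1 after removing the bias) and significand 0xA666666666666666. Changing the tenth byte from 40 to C0 only flips the sign field.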


唯憾梦倾城 2025-02-08 11:04:38

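You've been bitten by some very strange aspects of the way extended-precision floating point is typically implemented in C on Intel architectures. So don't feel too bad. :-)

What you're seeing is that although sizeof(long double) may be 16 (== 128 bits), deep down inside what you're really getting is the 80-bit Intel extended format. It's being padded out with 6 bytes, which in your case happen to be 0.

I've seen the same thing on my machine, and it's something I've always wondered about. It seems like a real waste, doesn't it? I used to think it was for some kind of compatibility with machines that actually do have 128-bit long doubles. But that can't be it, because this zero-padded 16-byte format is not binary-compatible with true IEEE 128-bit floating point, among other things because the padding is on the wrong end.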