Converting a decimal number in scientific notation to IEEE 754

Posted 2024-12-04 22:36:26

I've read a few texts and threads showing how to convert a decimal number to IEEE 754, but I am still confused about how to convert a number given in scientific notation without first expanding it into its full decimal form.

The number I am particularly working with is 9.07 * 10^23, but any number would do; I will figure out how to do it for my particular example.

Comments (2)

淡笑忘祈一世凡恋 2024-12-11 22:36:26

Converting a number from a decimal string to binary IEEE is fairly straightforward if you know how to do IEEE floating-point addition and multiplication (or if you're using any basic programming language like C/C++).

There's a lot of different approaches to this, but the easiest is to evaluate 9.07 * 10^23 directly.

First, start with 9.07:

9.07 = 9 + 0 * 10^-1 + 7 * 10^-2

Now evaluate 10^23. This can be done by starting with 10 and using any exponentiation algorithm (repeated multiplication is enough here).

Then multiply the results together.

Here's a simple implementation in C/C++:

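double mantissa = 9;      // integer digit
mantissa += 0 / 10.;      // tenths digit
mantissa += 7 / 100.;     // hundredths digit

// Compute 10^23 by repeated multiplication.
// (Named scale rather than exp to avoid shadowing exp() from <math.h>.)
double scale = 1;
for (int i = 0; i < 23; i++){
    scale *= 10;
}

// Each step rounds, so the result can differ slightly from the nearest double.
double result = mantissa * scale;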
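
Once the value is in a double, its IEEE 754 fields can be read straight from the bit pattern. The following is a minimal sketch, assuming that double is the 64-bit IEEE 754 binary64 format and that the input is a normal number (not zero, subnormal, infinity or NaN). The names double_fields and decompose are made up for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the three fields of a binary64 value. */
typedef struct {
    unsigned sign;      /* 1 bit: 0 = positive, 1 = negative        */
    int      exponent;  /* unbiased exponent (stored value - 1023)  */
    uint64_t fraction;  /* the 52 stored fraction bits              */
} double_fields;

/* Splits a double into sign, exponent and fraction.
   Assumes double is IEEE 754 binary64 and x is a normal number. */
static double_fields decompose(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);   /* well-defined way to read the bits */

    double_fields f;
    f.sign     = (unsigned)(bits >> 63);
    f.exponent = (int)((bits >> 52) & 0x7FF) - 1023;
    f.fraction = bits & 0xFFFFFFFFFFFFFULL;   /* low 52 bits */
    return f;
}
```

For 9.07 * 10^23 the unbiased exponent comes out as 79, because 2^79 <= 9.07 * 10^23 < 2^80.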

Now, going backwards (IEEE -> decimal) is a lot harder.

Again, there are a lot of different approaches. Here's the easiest one I can think of.

I'll use 1.0011101b * 2^40 as the example. (the mantissa is in binary)

First, convert the mantissa to decimal: (this should be easy, since there's no exponent)

1.0011101b * 2^40 = 1.22656 * 2^40

Now, "scale" the number such that the binary exponent vanishes. This is done by multiplying by an appropriate power of 10 to "get rid" of the binary exponent.

1.22656 * 2^40 = 1.22656 * (2^40 * 10^-12) * 10^12
               = 1.22656 * (1.09951) * 10^12
               = 1.34861 * 10^12

So the answer is:

1.0011101b * 2^40 = 1.34861 * 10^12

In this example, 10^12 was needed to "scale away" the 2^40. The power of 10 that is needed is given by the following, rounded down to an integer (here 40 * 0.30103 = 12.04, which gives 12):

power of 10 = (power of 2) * log(2)/log(10)

離殇 2024-12-11 22:36:26

I'm assuming you want the result to be the floating-point number closest to the decimal number, and that you are using double-precision floating-point numbers.

For most numbers, there is a way to do it relatively quickly. Here's how it works in a nutshell.

You need to split the number into either a product or a fraction of numbers that have an exact representation as a floating-point number. The largest power of 10 that is exactly representable is 10^22. So, to get 9.07e+23 in floating-point form, we can write:

9.07e+23 = 907 * 10^21
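
This split can be done on the decimal string itself, without ever expanding the number. Below is a minimal sketch, assuming an unsigned input of the form digits[.digits][e[+-]digits]. The function name split_decimal and its error handling are invented for this example.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: splits a string such as "9.07e23" into an integer
   significand (907) and a decimal exponent (21).
   Returns 1 on success, 0 on malformed input or overflow.
   Signs on the significand are not handled in this sketch. */
static int split_decimal(const char *s, uint64_t *digits, int *exp10)
{
    uint64_t d = 0;
    int frac_digits = 0, seen_point = 0, seen_digit = 0;

    for (; *s; s++) {
        if (isdigit((unsigned char)*s)) {
            if (d > (UINT64_MAX - 9) / 10) return 0;  /* would overflow */
            d = d * 10 + (uint64_t)(*s - '0');
            if (seen_point) frac_digits++;   /* each fraction digit lowers the exponent */
            seen_digit = 1;
        } else if (*s == '.' && !seen_point) {
            seen_point = 1;
        } else {
            break;
        }
    }
    if (!seen_digit) return 0;

    long e = 0;
    if (*s == 'e' || *s == 'E') {
        char *end;
        e = strtol(s + 1, &end, 10);
        if (end == s + 1 || *end != '\0') return 0;
        if (e > 100000 || e < -100000) return 0;  /* arbitrary sanity limit */
    } else if (*s != '\0') {
        return 0;
    }

    *digits = d;
    *exp10  = (int)e - frac_digits;
    return 1;
}
```

With the input "9.07e23" this yields digits = 907 and exp10 = 21, matching the split shown above.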

According to the IEEE-754 standard, a single floating-point operation is guaranteed to be correctly rounded, so the above product, computed as a product of two double-precision floating-point numbers, will give the correctly rounded result.

If you were to use this in a conversion function, you would probably store the powers of 10 in an array.

Note that you can't use this method for 9.07e-23. This number equals 907 / 10^25, so the denominator would be too large to be exactly representable. In this situation, and when dealing with other very large or very small numbers, you have to use some form of high-precision arithmetic.

See Fast Path Decimal to Floating-Point Conversion for further details and examples.
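
The fast path described in this answer might look like the sketch below. It assumes that double is IEEE 754 binary64 and that arithmetic is carried out in double precision (not in x87 extended precision), and it takes the number already split into an integer significand and a decimal exponent. The names POW10 and fast_path are hypothetical.

```c
#include <stdint.h>

/* Powers of 10 that are exactly representable as a double: 10^0 .. 10^22. */
static const double POW10[23] = {
    1e0,  1e1,  1e2,  1e3,  1e4,  1e5,  1e6,  1e7,
    1e8,  1e9,  1e10, 1e11, 1e12, 1e13, 1e14, 1e15,
    1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22
};

/* Tries to convert digits * 10^exp10 with a single correctly rounded
   multiplication or division. Returns 1 and stores the result in *out
   when the fast path applies, 0 when the caller must fall back to
   high-precision arithmetic. */
static int fast_path(uint64_t digits, int exp10, double *out)
{
    /* The significand must be exactly representable (at most 2^53). */
    if (digits > (1ULL << 53)) return 0;

    /* The power of 10 must be exactly representable as well. */
    if (exp10 < -22 || exp10 > 22) return 0;

    double d = (double)digits;
    *out = (exp10 >= 0) ? d * POW10[exp10] : d / POW10[-exp10];
    return 1;
}
```

Here fast_path(907, 21, &r) succeeds and gives the double nearest to 9.07e+23, while fast_path(907, -25, &r) returns 0 because 10^25 is outside the table.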
