Floating-point addition that rounds up
I have a floating-point addition that is fairly likely to lose precision because the values have very different magnitudes, so quite a few significant digits of the smaller summand are shifted out (possibly all of them). In the scope of the entire calculation, precision is not that relevant; what matters is that the result is greater than or equal to what the result would be with arbitrary precision (I'm keeping track of the end of a range here, and extending it by at least a certain amount).
So I need an addition that rounds up when bringing the summands to the same exponent, i.e. if any digit shifted out of a summand was set, the addition should take place with nextval(denormalized_summand, +infinity).
Is there an easy way to perform this addition? Manually denormalizing the smaller summand and using nextval on it springs to mind, but I doubt that would be efficient.
Comments (1)
You can set the FPU rounding mode to "upward" and then just add normally.
This is how it's done in GNU environments:
If you have a Microsoft compiler, the equivalent code is: