How deterministic is floating point inaccuracy?

Published 2024-07-10 15:52:59

I understand that floating point calculations have accuracy issues and there are plenty of questions explaining why. My question is: if I run the same calculation twice, can I always rely on it to produce the same result? What factors might affect this?

  • Time between calculations?
  • Current state of the CPU?
  • Different hardware?
  • Language / platform / OS?
  • Solar flares?

I have a simple physics simulation and would like to record sessions so that they can be replayed. If the calculations can be relied on, then I should only need to record the initial state plus any user input, and I should always be able to reproduce the final state exactly. If the calculations are not accurate, errors at the start may have huge implications by the end of the simulation.
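As a minimal sketch of the record/replay idea (these types are hypothetical, purely for illustration):

    using System.Collections.Generic;

    // Hypothetical types: if the math is deterministic, persisting just the
    // seed state and the user inputs is enough to reproduce the whole session.
    class UserInput
    {
        public int Frame;       // simulation step at which the input arrived
        public double Value;    // e.g. a force or a control-axis reading
    }

    class SessionRecording
    {
        public double[] InitialState;   // full simulation state at t = 0
        public List<UserInput> Inputs = new List<UserInput>();
    }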

I am currently working in Silverlight, though I would be interested to know whether this question can be answered in general.

Update: The initial answers indicate yes, but apparently this isn't entirely clear-cut, as discussed in the comments on the selected answer. It looks like I will have to do some tests and see what happens.

10 Answers

女皇必胜 2024-07-17 15:52:59

From what I understand, you're only guaranteed identical results provided that you're dealing with the same instruction set and compiler, and that any processors you run on adhere strictly to the relevant standards (i.e. IEEE 754). That said, unless you're dealing with a particularly chaotic system, any drift in calculation between runs isn't likely to result in buggy behavior.

Specific gotchas that I'm aware of:

  1. Some operating systems allow you to set the mode of the floating point processor in ways that break compatibility.

  2. Floating point intermediate results often use 80-bit precision in registers, but only 64-bit in memory. If a program is recompiled in a way that changes register spilling within a function, it may return different results compared to other versions. Most platforms will give you a way to force all results to be truncated to the in-memory precision (a short C# sketch follows this list).

  3. Standard library functions may differ between versions. I gather there are some not-uncommon examples of this in gcc 3 vs gcc 4.

  4. The IEEE standard itself allows some binary representations to differ... specifically NaN values, but I can't recall the details.
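For gotcha 2, a minimal C# sketch of the usual mitigation, assuming a runtime that may keep intermediates at extended precision (e.g. the legacy 32-bit x87 JIT); the C# spec permits extra intermediate precision, but an explicit cast forces narrowing to the declared 64-bit representation:

    // Without the casts, a * b and c * d may be carried at 80 bits on x87;
    // the explicit (double) casts force truncation to in-memory precision.
    static double DotProduct(double a, double b, double c, double d)
    {
        return (double)(a * b) + (double)(c * d);
    }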

汐鸠 2024-07-17 15:52:59

The short answer is that FP calculations are entirely deterministic, as per the IEEE Floating Point Standard, but that doesn't mean they're entirely reproducible across machines, compilers, OSes, etc.

The long answer to these questions and more can be found in what is probably the best reference on floating point, David Goldberg's What Every Computer Scientist Should Know About Floating Point Arithmetic. Skip to the section on the IEEE standard for the key details.

To answer your bullet points briefly:

  • Time between calculations and the state of the CPU have little to do with this.

  • Hardware can affect things (e.g. some GPUs are not IEEE floating point compliant).

  • Language, platform, and OS can also affect things. For a better description of this than I can offer, see Jason Watkins's answer. If you are using Java, take a look at Kahan's rant on Java's floating point inadequacies.

  • Solar flares might matter, hopefully infrequently. I wouldn't worry too much, because if they do matter, then everything else is screwed up too. I would put this in the same category as worrying about EMP.

Finally, if you are doing the same sequence of floating point calculations on the same initial inputs, then things should be exactly replayable. The exact sequence can change depending on your compiler/OS/standard library, so you might get some small errors this way.

Where you usually run into problems in floating point is if you have a numerically unstable method and you start with FP inputs that are approximately the same but not quite. If your method's stable, you should be able to guarantee reproducibility within some tolerance. If you want more detail than this, then take a look at Goldberg's FP article linked above or pick up an intro text on numerical analysis.
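To make that last point concrete, here is a small illustration (the constants are arbitrary): a chaotic iteration such as the logistic map turns a couple of ulps of input difference into completely different trajectories, while each individual run remains perfectly repeatable:

    using System;

    class InstabilityDemo
    {
        static void Main()
        {
            double x1 = 0.3;            // initial state
            double x2 = 0.3 + 1e-16;    // approximately, but not exactly, equal

            // The logistic map x -> r * x * (1 - x) with r = 3.9 is chaotic:
            // tiny input differences grow exponentially with each step.
            for (int i = 0; i < 100; i++)
            {
                x1 = 3.9 * x1 * (1 - x1);
                x2 = 3.9 * x2 * (1 - x2);
            }

            Console.WriteLine(x1);
            Console.WriteLine(x2);      // typically far from x1 by now
        }
    }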

毁梦 2024-07-17 15:52:59

I think your confusion lies in the type of inaccuracy around floating point. Most languages implement the IEEE floating point standard. This standard lays out how the individual bits within a float/double are used to produce a number. Typically a float consists of four bytes, and a double of eight bytes.

A mathematical operation between two floating point numbers will have the same value every single time (as specified within the standard).

The inaccuracy comes in the precision. Consider an int vs a float. Both typically take up the same number of bytes (4). Yet the maximum value each number can store is wildly different.

  • int: roughly 2 billion
  • float: 3.40282347E38 (quite a bit larger)

The difference is in the middle. An int can represent every integer between 0 and roughly 2 billion. A float, however, cannot. It can represent about 2 billion values between 0 and 3.40282347E38, but that leaves whole ranges of values that cannot be represented. If a math equation hits one of these values, it has to be rounded to a representable value and is hence considered "inaccurate". Your definition of inaccurate may vary :).
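A quick illustration of such a gap: 2^24 + 1 is the first integer that a 32-bit float cannot represent, so the literal gets rounded to the nearest representable value:

    float f = 16777217f;                   // 2^24 + 1: no exact float exists
    Console.WriteLine(f == 16777216f);     // True - the literal was rounded
    Console.WriteLine((int)f);             // 16777216, not 16777217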

秋意浓 2024-07-17 15:52:59

Also, while Goldberg is a great reference, the original text is also wrong: IEEE 754 is not guaranteed to be portable. I can't emphasize this enough, given how often this statement is made based on skimming the text. Later versions of the document include a section that discusses this specifically:

Many programmers may not realize that even a program that uses only the numeric formats and operations prescribed by the IEEE standard can compute different results on different systems. In fact, the authors of the standard intended to allow different implementations to obtain different results.

怂人 2024-07-17 15:52:59

This answer in the C++ FAQ probably describes it the best:

http://www.parashift.com/c++-faq-lite/newbie.html#faq-29.18

It is not only that different architectures or compilers might give you trouble; floating point numbers already behave in weird ways within the same program. As the FAQ points out, even if y == x is true, cos(y) == cos(x) can still be false. This is because the x86 CPU calculates the value with 80 bits while the value is stored as 64 bits in memory, so you end up comparing a truncated 64-bit value with a full 80-bit value.

The calculations are still deterministic, in the sense that running the same compiled binary will give you the same result each time; but the moment you adjust the source a bit, change the optimization flags, or compile with a different compiler, all bets are off and anything can happen.

Practically speaking, it is not quite that bad: I could reproduce simple floating point math bit for bit across different versions of GCC on 32-bit Linux, but the moment I switched to 64-bit Linux the results were no longer the same. Demo recordings created on 32-bit wouldn't work on 64-bit and vice versa, but would work fine when run on the same architecture.

深海夜未眠 2024-07-17 15:52:59

Since your question is tagged C#, it's worth emphasising the issues faced on .NET:

  1. Floating point maths is not associative - that is, (a + b) + c is not guaranteed to equal a + (b + c) (a short sketch follows this list);
  2. Different compilers will optimize your code in different ways, and that may involve re-ordering arithmetic operations.
  3. In .NET, the CLR's JIT compiler will compile your code on the fly, so compilation depends on the version of .NET on the machine at runtime.
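A short sketch of point 1, with values chosen so the rounding is visible (at magnitude 1e16 a double's spacing is 2, so adding 1.0 can vanish):

    double a = 1e16, b = -1e16, c = 1.0;
    Console.WriteLine((a + b) + c);   // 1: a and b cancel first, then c survives
    Console.WriteLine(a + (b + c));   // 0: c is absorbed into b before a cancels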

This means that you shouldn't rely upon your .NET application producing the same floating point calculation results when run on different versions of the .NET CLR.

For example, in your case, if you record the initial state and inputs to your simulation, then install a service pack that updates the CLR, your simulation may not replay identically the next time you run it.

See Shawn Hargreaves's blog post Is floating point math deterministic? for further discussion relevant to .NET.

微暖i 2024-07-17 15:52:59

Sorry, but I can't help thinking that everybody is missing the point.

If the inaccuracy is significant to what you are doing then you should look for a different algorithm.

You say that if the calculations are not accurate, errors at the start may have huge implications by the end of the simulation.

That, my friend, is not a simulation. If you are getting hugely different results due to tiny differences caused by rounding and precision, then the chances are that none of the results has any validity. Just because you can repeat the result does not make it any more valid.

On any non-trivial real-world problem that includes measurements or non-integer calculation, it is always a good idea to introduce minor errors to test how stable your algorithm is (a sketch of this follows).
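As a sketch of that idea (the simulation function and tolerances here are made up for illustration):

    using System;

    class StabilityTest
    {
        // Stand-in for your own simulation step.
        static double Simulate(double x) => Math.Sin(x) * Math.Exp(-x / 10.0);

        static void Main()
        {
            var rng = new Random(42);
            double input = 1.2345;
            double baseline = Simulate(input);

            for (int trial = 0; trial < 100; trial++)
            {
                // Perturb the input by roughly 1 part in 10^12.
                double jittered = input * (1.0 + (rng.NextDouble() - 0.5) * 1e-12);
                double result = Simulate(jittered);

                // A stable algorithm stays within a proportional tolerance.
                if (Math.Abs(result - baseline) > 1e-9 * Math.Abs(baseline))
                    Console.WriteLine("Unstable at trial " + trial);
            }
        }
    }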

山田美奈子 2024-07-17 15:52:59

Hm. Since the OP asked about C#:

Is the C# bytecode JIT deterministic, or does it generate different code between different runs? I don't know, but I wouldn't trust the JIT.

I can think of scenarios where the JIT has quality-of-service features and decides to spend less time on optimization because the CPU is doing heavy number crunching somewhere else (think background DVD encoding). This could lead to subtle differences that may result in huge differences later on.

Also, if the JIT itself gets improved (maybe as part of a service pack), the generated code will change for sure. The 80-bit internal precision issue has already been mentioned.

司马昭之心 2024-07-17 15:52:59

This is not a full answer to your question, but here is an example demonstrating that double calculations in C# are non-deterministic. I don't know why, but seemingly unrelated code can apparently affect the outcome of a downstream double calculation.

  1. Create a new WPF application in Visual Studio Version 12.0.40629.00 Update 5, and accept all the default options.
  2. Replace the contents of MainWindow.xaml.cs with this:

    using System;
    using System.Windows;
    
    namespace WpfApplication1
    {
        /// <summary>
        /// Interaction logic for MainWindow.xaml
        /// </summary>
        public partial class MainWindow : Window
        {
            public MainWindow()
            {
                InitializeComponent();
                Content = FooConverter.Convert(new Point(950, 500), new Point(850, 500));
            }
        }
    
        public static class FooConverter
        {
            public static string Convert(Point curIPJos, Point oppIJPos)
            {
                var ij = " Insulated Joint";
                var deltaX = oppIJPos.X - curIPJos.X;
                var deltaY = oppIJPos.Y - curIPJos.Y;
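                // One hypothesis: Atan2(0, -100) is exactly Math.PI, so the
                // "teta <= Math.PI" test below sits on a knife edge, and extra
                // x87 register precision in the 32-bit JIT could flip the branch.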
                var teta = Math.Atan2(deltaY, deltaX);
                string result;
                if (-Math.PI / 4 <= teta && teta <= Math.PI / 4)
                    result = "Left" + ij;
                else if (Math.PI / 4 < teta && teta <= Math.PI * 3 / 4)
                    result = "Top" + ij;
                else if (Math.PI * 3 / 4 < teta && teta <= Math.PI || -Math.PI <= teta && teta <= -Math.PI * 3 / 4)
                    result = "Right" + ij;
                else
                    result = "Bottom" + ij;
                return result;
            }
        }
    }
    
  3. Set build configuration to "Release" and build, but do not run in Visual Studio.

  4. Double-click the built exe to run it.
  5. Note that the window shows "Bottom Insulated Joint".
  6. Now add this line just before "string result":

    string debug = teta.ToString();
    
  7. Repeat steps 3 and 4.

  8. Note that the window shows "Right Insulated Joint".

This behavior was confirmed on a colleague's machine. Note that the window consistently shows "Right Insulated Joint" if any of the following are true: the exe is run from within Visual Studio, the exe was built using the Debug configuration, or "Prefer 32-bit" is unchecked in project properties.

It's quite difficult to figure out what's going on, since any attempt to observe the process appears to change the result.

帅冕 2024-07-17 15:52:59

Very few FPUs meet the IEEE standard (despite their claims), so running the same program on different hardware may indeed give you different results. The differences are likely to be in corner cases that you should already be avoiding as part of using an FPU in your software.

IEEE bugs are often patched in software. Are you sure the operating system you are running today includes the proper traps and patches from the manufacturer? What about before or after an OS update? Are all bugs removed and all bug fixes added? Is the C compiler in sync with all of this, and is it producing the proper code?

Testing this may prove futile: you won't see the problem until you deliver the product.

Observe FP rule number 1: never use an if (something == something) comparison (a tolerance-based sketch follows at the end of this answer). Rule number two, IMO, has to do with ASCII-to-FP and FP-to-ASCII conversion (printf, scanf, etc.). There are more accuracy and bug problems there than in the hardware.

With each new generation of hardware (density), the effects of the sun are more apparent. We already have problems with SEUs on the planet's surface, so independent of floating point calculations you will have problems (few vendors have bothered to care, so expect crashes more often with new hardware).

By consuming enormous amounts of logic, the FPU is likely to be very fast (a single clock cycle), no slower than an integer ALU. Do not confuse this with modern FPUs being as simple as ALUs; FPUs are expensive. (ALUs likewise consume extra logic to get multiply and divide down to one clock cycle, but nowhere near as much as the FPU.)

Keep to the simple rules above, study floating point a bit more, and understand the warts and traps that go with it. You may want to check for infinity or NaNs periodically. Your problems are more likely to be found in the compiler and operating system than in the hardware (in general, not just for FP math). Modern hardware (and software) is these days by definition full of bugs, so just try to be less buggy than what your software runs on.
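As a sketch of rule 1 (the tolerance is application-specific; this assumes using System;):

    // Compare floating point values with a relative tolerance instead of ==.
    static bool NearlyEqual(double a, double b, double relTol = 1e-9)
    {
        double scale = Math.Max(Math.Abs(a), Math.Abs(b));
        return Math.Abs(a - b) <= relTol * scale;
    }

    // Usage: if (NearlyEqual(x, y)) ... rather than if (x == y) ...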
