dtoa vs sprintf vs Grisu3 algorithms
What is the best way to render double precision numbers as strings in C++?
I ran across the article "Here be dragons: advances in problems you didn’t even know you had", which discusses printing floating-point numbers.
I have been using sprintf. I don't understand why I would need to modify the code.
If you are happy with sprintf_s, you shouldn't change. However, if you need to format your output in a way that your library does not support, you might need to reimplement a specialized version of sprintf (with any of the known algorithms).
For example, JavaScript has very precise requirements on how its numbers must be printed (see section 9.8.1 of the specification). The correct output can't be produced by simply calling sprintf. Indeed, Grisu was developed to implement correct number printing for a JavaScript compiler.
Grisu is also faster than sprintf, but unless floating-point printing is a bottleneck in your application, this should not be a reason to switch to a different library.
Aha!
The problem outlined in the article you cite is that for some numbers, the computer displays something that is theoretically correct but not what we humans would have written.
For example, as the article says, 1.2999999... = 1.3, so if your result is 1.3, it's (quite) correct for the computer to display it as 1.299999999... But that's not what you would have expected to see.
Now the question is: why does the computer do that? The reason is that the computer computes in base 2 (binary), while we usually compute in base 10 (decimal). The results are the same (thank God!), but the internal storage and the representation are not.
Some numbers look nice when displayed in base 10, like 1.3, but others don't, for example 1/3 = 0.333333333.... It's the same in base 2: some numbers "look" nice in base 2 (usually those composed of fractions with power-of-two denominators) and others do not. When the computer stores a number internally, it may not be able to store it "exactly" and instead stores the closest possible representation, even if the number looked "finite" in decimal. So yes, in this case, it "drifts" a little bit. If you do that again and again, you may lose precision. But there is no other way (unless you use special math libraries that can store fractions).
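To see the stored value rather than the rounded one, you can ask for more significant digits than the default. This is only a minimal sketch: `with_digits` is a hypothetical helper name, and the results mentioned below assume IEEE 754 doubles and a correctly rounding `snprintf`.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: format x with the requested number of
// significant digits, using the printf "%g" conversion.
std::string with_digits(double x, int significant_digits) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.*g", significant_digits, x);
    return buf;
}
```

With 6 digits, `with_digits(0.3, 6)` gives 0.3. With 17 digits the same value shows up as 0.29999999999999999, because 0.3 has no exact binary representation. A value such as 0.5, which is a power of two, prints as 0.5 at any precision.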
The problem arises when the computer tries to give you back, in base 10, the number you gave it. Then the computer may give you 1.299999 instead of the 1.3 you were expecting.
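That's also the reason why you should be careful when comparing floats with ==, <, or >: two values that should be equal on paper can differ by a rounding error, so an equality test usually needs a tolerance. Note that the special functions islessgreater(a, b), isgreater(a, b), etc. do not add a tolerance; they behave like the operators, except that they do not raise a floating-point exception when an argument is NaN.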
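When rounding error is expected, a common approach is to compare against a tolerance instead of testing for exact equality. The sketch below is one possible way to do it, not a standard function: `nearly_equal` and its default tolerance of 1e-9 are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper: true when a and b differ by at most rel_tol
// times the larger of their magnitudes (a relative tolerance).
bool nearly_equal(double a, double b, double rel_tol = 1e-9) {
    double scale = std::max(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <= rel_tol * scale;
}
```

For example, `0.1 + 0.2 == 0.3` is false with IEEE 754 doubles, while `nearly_equal(0.1 + 0.2, 0.3)` is true. A purely relative tolerance behaves poorly when both values are very close to zero, so real code often adds an absolute tolerance as well.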
So the function you currently use (sprintf) is fine and as exact as it can be. It gives you correct values; you just have to know that when dealing with floats, 1.2999999 at maximum precision is OK if you were expecting 1.3.
Now, if you want to "pretty print" those numbers to get the best "human" representation (base 10), you may want to use a special library, like the Grisu3 you mention, which will try to undo the drift that may have happened and align the number to the closest base-10 representation.
Now, the library cannot use a crystal ball to find out which numbers have drifted, so it may happen that you really meant 1.2999999 at maximum precision as stored in the computer, and the library will "convert" it to 1.3... But that's no worse and no less precise than displaying 1.29999 instead of 1.3.
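The idea of printing the shortest decimal text that still reads back as the same double can be sketched with a brute-force search. This is not Grisu, and it is far slower; it only illustrates the result such algorithms aim for. The name `shortest_round_trip` is hypothetical, and the sketch assumes a correctly rounding `snprintf` and `strtod`.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Naive search: try 1 to 17 significant digits and return the first
// text that parses back to exactly the same double.
std::string shortest_round_trip(double x) {
    char buf[64];
    for (int digits = 1; digits <= 17; ++digits) {
        std::snprintf(buf, sizeof buf, "%.*g", digits, x);
        if (std::strtod(buf, nullptr) == x) {
            return buf;
        }
    }
    return buf;  // 17 digits always suffice for a finite double
}
```

Here `shortest_round_trip(0.3)` gives 0.3, because no other double is closer to 0.3, while `shortest_round_trip(0.1 + 0.2)` gives 0.30000000000000004, because that sum is a different double and needs all 17 digits to be told apart.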
If you need good readability, such a library will be useful. If not, it's just a waste of time.
Hope this helps!
You might want to use something like Grisu (or a faster method) because it gives you the shortest decimal representation with a round-trip guarantee, unlike sprintf, which only takes a fixed precision. The good news is that C++20 includes std::format, which gives you this by default. For example, the fixed-precision approach prints 0.29999999999999999 while std::format prints 0.3 (godbolt).
In the meantime you can use the {fmt} library, which std::format is based on. {fmt} also provides the print function, which makes this even easier and more efficient (godbolt).
Disclaimer: I'm the author of {fmt} and C++20 std::format.
The best way to do this in any reasonable language is:
I don't mean to discourage you, or anyone. These are actually fascinating functions to work on, but they are also shockingly complex, and trying to design good test coverage for any non-naive implementation is even more involved. Don't get started unless you're prepared to spend months thinking about the problem.
In C++, why aren't you using iostreams? You should probably be using cout for the console and ostringstream for string-oriented output (unless you have a very specific need to use a printf family method).
You shouldn't worry about formatting performance unless actual profiling shows that CPU is the bottleneck (compared to, say, I/O).
http://www.cplusplus.com/reference/iostream/ostringstream/
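As a rough illustration of the ostringstream route, here is a minimal sketch. `to_text` is a hypothetical helper, and the precision argument is an assumption added for the example; by default a stream uses 6 significant digits.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: format x through an ostringstream with the
// given number of significant digits (default float formatting).
std::string to_text(double x, int significant_digits = 6) {
    std::ostringstream out;
    out << std::setprecision(significant_digits) << x;
    return out.str();
}
```

Note that this is still fixed-precision formatting, like sprintf: `to_text(0.3)` gives 0.3, but `to_text(0.3, 17)` gives 0.29999999999999999.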