C++/Haskell 中的精确实数算术和惰性列表性能
我最近在阅读本文<之后遇到了精确实数算术的主题< /a> 和本文。
我发现了许多讨论使用带符号数字流实现精确算术的论文。使用无限流实现任意精度可以在函数式语言(如 Haskell)中使用惰性列表实现良好的实用实现。然而,讨论函数式语言的此类实现的论文似乎得出的结论是性能非常差。
现在,我意识到,与标准浮点表示相比,精确的非硬件实现通常具有相对较差的性能,但我有兴趣以命令式语言(特别是 C++)和操作集合提供更有效的实现/函数(算术运算、三角函数、exp、log 等)。
我的问题:有符号数字/惰性流表示是否存在本质上较慢的因素导致性能不佳,或者是Haskell?是什么让它变得缓慢?是否有可能使用 C++ 中的惰性流实现带符号的数字流表示,从而实现比 Haskell 对应物(显着)更好的性能,或者这是徒劳的练习?也许重建为迭代?
我知道有两个 C++ 库 RealLib 和 iRRAM 可以实现高效的实数计算。然而,这些似乎使用区间算术,将实数表示为缩小的嵌套区间,这似乎不像无限流那样“纯粹”的方法(如果您不同意,请纠正我!)。但也许这些是实现良好效率的唯一方法?
任何意见表示赞赏!
I recently came across the subject of exact real arithmetic after reading this paper and this paper.
I have found a number of papers that discuss realizations of exact arithmetic using signed digit streams. The use of infinite streams for arbitrary precision leads to nice practical implementations in functional languages, like Haskell, using lazy lists. However, the papers that discuss such implementations in functional languages seem to come to the conclusion that performance is very poor.
Now, I appreciate that exact, non-hardware implementations will generally have relatively poor performance compared to the standard floating-point representation, but I am interested in providing a more efficient implementation in an imperative language (specifically, C++) and a collection of operations/functions (arithmetic operations, trigonometric functions, exp, log, etc.).
My question(s): is there something inherently slow about a signed digit/lazy stream representation that causes the bad performance, or is it Haskell? What is it that makes it slow? Would it be possible to implement a signed digit stream representation using lazy streams in C++ that achieves (significantly) better performance than its Haskell counterpart, or is this an exercise in futility? Perhaps reconstructing as iterations?
I know there exists two C++ libraries, RealLib and iRRAM, that achieve efficient real number computation. However, these seem to use interval arithmetic, representing real numbers as shrinking nested intervals, which doesn't seem as 'pure' an approach as infinite streams (please correct me if you disagree!). But perhaps these are the only approaches that achieve good efficiency?
Any input is appreciated!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
惰性流最有效地表示为具有延迟函数组件的数据。这是 GHC 中相当高效的 Haskell 实现所使用的表示形式,无论如何您都需要在 C++ 中使用它。没有特殊的“快速”版本的惰性可以用 C++ 编写,在 20 年的 Haskell 编译器研究中还没有尝试过,更多的可以追溯到 Algol。
有关如何最有效地实现惰性数据结构的研究的更多详细信息,请参阅 关于 GHC 实现的良好介绍性文本?
现在,鉴于缺乏有关基准的信息,有几个可能的答案:
我的猜测是后两点。 C++ 版本的懒惰只会很难达到 GHC 已经达到的相同点,所以为什么不使用本文中的代码,看看是否可以使其更快。
Lazy streams are represented most efficiently as data with delayed function components. That's the representation the rather efficient Haskell implementation in GHC uses, and one you'll need to use in C++ anyway. There's no special "fast" version of laziness you could write in C++, that hasn't already been tried in 20 years of Haskell compiler research, and more going back to Algol.
For more details on the research on how to most efficiently implement lazy data structures, see Good introductory text about GHC implementation?
Now, given the lack of information about the benchmark, there's a couple of possible answers:
My guess is the latter two points. The C++ version of laziness will only be hard work to get to the same point GHC already is at, so why not play with the code from the article, and see if you can make it faster.
我担心“数字是数字的惰性流”方法注定会比更直接的方法效率低,即使数字的基数很大(例如 2^64 或更多):
惰性求值意味着最后,你所认为的数字实际上是代表其计算结果的 DAG。请求多一位数字可能会重新触发该 DAG 的每个节点中的计算。归根结底,您花在整理事务上的时间比花在计算上的时间多得多。
您可能无法使用更复杂的算法进行基本计算。例如,FFT 乘法方法显然是不可能的。
这并不意味着算法会很简单。考虑如何在加法中处理当前结果 99999。您需要准备好对所有 9 进行进位。现在尝试考虑如何进行乘法。一种具有惰性表达的语言可能有助于表达它,但你会因为第一个问题而变得更加困难;我很高兴得知编译器能够对其进行优化,但我担心它们不能。
I fear the "numbers are a lazy stream of digits" approach is doomed to be less efficient than a more direct approach, even if the digits are in a large base (says 2^64 or more):
lazy evaluation means that at the end, what you are thinking of as number are in fact the DAG representing the computation leading to it. Asking for one more digit may re-trigger computation in every node of this DAG. At the end of the day, you spend far more time in house keeping than in computation.
you probably can't use the more sophisticated algorithms for basic computation. For instance, an FFT approach to multiplication is obviously out of question.
that doesn't mean the algorithms will be simple. Think about how to handle a current result of 99999 in an addition. You need to be prepared to ripple a carry for all those 9. Now try to think about how to do a multiplication. A language with lazy expression may help expressing it, but then you will hit get harder by the first problem; I would be happy to learn that compilers are able to optimize for it, but I fear they aren't.