The overhead of using bignums
I have hit upon the problem of whether to use bignums in my language as the default datatype when numbers are involved. I've evaluated this myself and reduced it to a question of convenience and comfort versus performance. The answer to that question depends on how large the performance hit is in programs that aren't being optimized.
How small is the overhead of using bignums in places where a fixnum or integer would have sufficed? How small can it be in the best implementations? What kinds of implementations achieve the smallest overhead, and what additional tradeoffs do they entail?
What kind of hit to overall language performance can I expect if I make my language default to bignums?
You can perhaps look at how Lisp does it. It will almost always do exactly the right thing and implicitly convert types as necessary. It has fixnums ("normal" integers), bignums, ratios (reduced proper fractions represented as a pair of integers), and floats (in different sizes). Only floats have precision error, and they are contagious, i.e. once a calculation involves a float, the result is a float too. "Practical Common Lisp" has a good description of this behaviour.
To be honest, the best answer is "try it and see".
Clearly bignums can't be as efficient as native types, which typically fit in a single CPU register, but every application is different; if yours doesn't do a whole load of integer arithmetic, then the overhead could be negligible.
Come to think of it... I don't think it will have much of a performance hit at all.
Because bignums, by nature, have a very large base, say a base of 65536 or larger, which is usually the maximum possible value of traditional fixnums and integers.
I don't know how large you would set the bignum's base to be, but if you set it sufficiently large that, when it is used in place of fixnums and/or integers, it never exceeds its first bignum-digit, then the operations will be nearly identical to those on normal fixnums/ints.
This opens an opportunity for optimizations: for a bignum that never grows over its first bignum-digit, you could replace the general routine with an uber-fast one-bignum-digit operation.
And then switch over to n-digit algorithms when a second bignum-digit is needed.
This could be implemented with a bit flag and a validating check on every arithmetic operation. Roughly, you could use the highest-order bit to signify a bignum: if a data block has its highest-order bit set to 0, process it as if it were a normal fixnum/int, but if it is set to 1, parse the block as a bignum structure and use bignum algorithms from there.
That should avoid performance hits from simple loop iterator variables, which I think are the first likely source of performance hits.
It's just my rough thinking though, a suggestion, since you should know better than me :-)
p.s. sorry, I forgot what the technical terms for bignum-digit and bignum-base are
Your reduction is correct, but the choice depends on the performance characteristics of your language, which we cannot possibly know!
Once you have your language implemented, you can measure the performance difference, and perhaps offer the programmer a directive to choose the default.
You will never know the actual performance hit until you create your own benchmark, as the results will vary per language, per language revision, and per CPU. There's no language-independent way to measure this, except for the obvious fact that a 32-bit integer uses twice the memory of a 16-bit integer.
The bad news is that even in the best possible software implementation, bignums are going to be slower than the built-in arithmetic by orders of magnitude (i.e. anything from a factor of 10 up to a factor of 1000).
I don't have exact numbers, but I don't think exact numbers would help very much in such a situation: if you need big numbers, use them. If not, don't. If your language uses them by default (which languages do? some dynamic languages do...), consider whether the disadvantage of switching to another language is compensated for by the gain in performance (which it should rarely be).
(This could roughly be translated to: there's a huge difference, but it shouldn't matter. If (and only if) it matters, use another language, because even with the best possible implementation, this language evidently isn't well suited to the task.)
I totally doubt that it would be worth it, unless it is very domain-specific.
The first thing that comes to mind is all the little for loops throughout programs: are the little iterator variables all gonna be bignums? That's scary!
But if your language is rather functional... then maybe not.