What is an efficient way to convert a bignum type structure to a human-readable string?
I've got a bit of a problem. In order to grow in my knowledge of C, I've decided to try to implement a basic bigint library.
The core of the bigint structure will be an array of 32-bit integers, chosen because they fit in a register. This lets me perform operations between digits whose results overflow 32 bits but still fit in a 64-bit integer (which also fits in a register, as I'm on x86-64), and I can bit-shift out each part of the result. I've implemented basic addition, and to test that it is working, I have to print the array. For my own testing purposes, it's fine if I use printf() and output each digit in hex. I can read that just fine.
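For reference, the hex dump described above could look roughly like the sketch below. The layout (a plain array with `limbs[0]` as the least significant digit) and the name `bigint_format_hex` are assumptions for illustration, not something taken from the question.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed layout: limbs[0] is the least significant 32-bit digit.
 * Writes the number as hex into out (capacity cap, including the NUL).
 * Returns the string length, or -1 if the buffer is too small or n == 0. */
int bigint_format_hex(const uint32_t *limbs, size_t n, char *out, size_t cap)
{
    /* Skip leading zero limbs, but keep at least one. */
    while (n > 1 && limbs[n - 1] == 0) n--;
    if (n == 0 || cap == 0) return -1;

    /* The most significant limb is printed without zero padding. */
    int w = snprintf(out, cap, "%" PRIx32, limbs[n - 1]);
    if (w < 0 || (size_t)w >= cap) return -1;
    size_t used = (size_t)w;

    /* Every lower limb must be padded to exactly 8 hex digits. */
    for (size_t i = n - 1; i-- > 0; ) {
        w = snprintf(out + used, cap - used, "%08" PRIx32, limbs[i]);
        if (w < 0 || (size_t)w >= cap - used) return -1;
        used += (size_t)w;
    }
    return (int)used;
}
```

This only works directly because 2^32 is a power of 16, so each limb maps to a fixed group of 8 hex digits. No such fixed grouping exists for base 10, which is why decimal output needs real arithmetic.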
However, most people can't read hex. As the number is stored in (essentially) base 2^32, printing is a bit of a problem. What would be a good way to convert to base 10?
EDIT:
This deals not with knowing how to convert from base to base, but about a good way to implement this. I was thinking along the lines of making another bigint with another base with conversion for printing.
Comments (4)
First of all, you can't do I/O in a sensible way without the basic operations (e.g. division and modulus). To implement the conversion of a bigint to a base-10 string efficiently, I am researching two possible optimizations:
First, you can divide by some power of ten instead of by ten itself. For example, every time you divide the number by 10000 you get four base-10 digits at once.
Second, how do you choose which power of ten to divide by? 10, 100, 1000, 10000, etc.
A good choice seems to be the largest power of ten that fits in your word (32-bit), which is 10^9. Fortunately, dividing a bigint by a single word can be implemented much more efficiently than dividing one bigint by another.
I haven't given an implementation because I am still researching the problem in my spare time. I have implemented the basic operations in my library, and I/O is hopefully the next step ;)
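To illustrate why division by a single word is cheap, here is a minimal sketch of schoolbook short division. It assumes the bigint is a plain array with `limbs[0]` least significant. The function name `bigint_divmod_small` is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: limbs[0] is the least significant 32-bit digit.
 * Divides the n-limb number in place by d (d must be nonzero)
 * and returns the remainder. One pass, O(n). */
uint32_t bigint_divmod_small(uint32_t *limbs, size_t n, uint32_t d)
{
    uint64_t rem = 0;
    /* Walk from the most significant limb down, carrying the remainder. */
    for (size_t i = n; i-- > 0; ) {
        /* rem < d < 2^32, so (rem << 32) | limb fits in 64 bits. */
        uint64_t cur = (rem << 32) | limbs[i];
        limbs[i] = (uint32_t)(cur / d);
        rem = cur % d;
    }
    return (uint32_t)rem;
}
```

Calling it with `d = 1000000000` leaves the quotient in the array and returns the lowest nine decimal digits of the original number as one integer.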
Dividing by the largest power of 10 that will fit in your base type is the best way to start. In your case, this would be dividing by 10^9. This code should be general purpose since you will be able to reuse it for part of your general division/modulo code.
The running time will be O(n^2) (i.e. if your number is twice as big, the conversion will take four times longer), but it should be fast enough for moderately sized numbers.
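A rough sketch of that quadratic conversion follows, assuming a little-endian array of 32-bit limbs. The name `bigint_to_decimal` and the choice to return a `malloc`'d string are assumptions made for this example only.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout: limbs[0] is the least significant 32-bit digit.
 * Returns a malloc'd decimal string (caller frees), or NULL on
 * allocation failure. The input array is not modified. */
char *bigint_to_decimal(const uint32_t *limbs, size_t n)
{
    while (n > 0 && limbs[n - 1] == 0) n--;      /* strip leading zeros */
    if (n == 0) {
        char *zero = malloc(2);
        if (zero) strcpy(zero, "0");
        return zero;
    }

    size_t cap = n * 10 + 1;   /* a 32-bit limb adds at most 10 digits */
    uint32_t *work = malloc(n * sizeof *work);
    char *buf = malloc(cap);
    if (!work || !buf) { free(work); free(buf); return NULL; }
    memcpy(work, limbs, n * sizeof *work);

    size_t pos = cap - 1;      /* fill the buffer from the right */
    buf[pos] = '\0';
    while (n > 0) {
        /* Short division of the whole number by 10^9: O(n) per pass. */
        uint64_t rem = 0;
        for (size_t i = n; i-- > 0; ) {
            uint64_t cur = (rem << 32) | work[i];
            work[i] = (uint32_t)(cur / 1000000000u);
            rem = cur % 1000000000u;
        }
        while (n > 0 && work[n - 1] == 0) n--;
        /* Emit 9 digits per chunk; the last (most significant) chunk
         * is emitted without leading zeros. */
        for (int k = 0; k < 9 && (n > 0 || rem > 0); k++) {
            buf[--pos] = (char)('0' + rem % 10);
            rem /= 10;
        }
    }
    memmove(buf, buf + pos, cap - pos);          /* includes the NUL */
    free(work);
    return buf;
}
```

Each pass over the array is linear and the number of passes is proportional to the length, which is where the O(n^2) comes from.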
For very large values, you will want to cache large powers of 10, say 10^1000, 10^2000, 10^4000, 10^8000, ..., and then divide by the cached power of 10 that has at least half as many digits as the number you are trying to convert. Repeat this process on the quotient and the remainder until the numbers are small enough to convert quickly using division by 10^9. Depending on how efficient your division algorithm is, this approach may not be faster until you encounter numbers with a million digits or more.
If you are writing an interactive calculator where every number will be displayed, then storing numbers in base 10^9 instead of base 2^32 will make display faster (it will be O(n), i.e. if your number is twice as big, the conversion will only take twice as long).
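A sketch of why display becomes linear with base 10^9 storage: each stored digit is already a group of nine decimal digits, so printing is just formatting. The layout (`chunks[0]` least significant, every chunk below 10^9) and the name `decimal_bigint_format` are assumptions for illustration.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed layout: chunks[0] is the least significant base-10^9 digit,
 * and every chunk is < 1000000000.
 * Writes the decimal string into out (capacity cap, including the NUL).
 * Returns the string length, or -1 if the buffer is too small or n == 0. */
int decimal_bigint_format(const uint32_t *chunks, size_t n,
                          char *out, size_t cap)
{
    while (n > 1 && chunks[n - 1] == 0) n--;     /* keep at least one chunk */
    if (n == 0 || cap == 0) return -1;

    /* Most significant chunk: no zero padding. */
    int w = snprintf(out, cap, "%" PRIu32, chunks[n - 1]);
    if (w < 0 || (size_t)w >= cap) return -1;
    size_t used = (size_t)w;

    /* Remaining chunks: always exactly nine digits. */
    for (size_t i = n - 1; i-- > 0; ) {
        w = snprintf(out + used, cap - used, "%09" PRIu32, chunks[i]);
        if (w < 0 || (size_t)w >= cap - used) return -1;
        used += (size_t)w;
    }
    return (int)used;
}
```

The trade-off is that each 32-bit word holds slightly less than 30 bits of information, and arithmetic needs carries at 10^9 rather than at the natural word boundary.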
The normal way of repeatedly dividing by 10 is obviously going to be painfully slow.
An obvious quick way is to have precomputed arrays of bigints corresponding to the value of each digit in each position. You can then do a binary search with relatively cheap comparisons/subtractions to find the most significant digit, and then each following digit in turn.
You could revert to division by 10 when you get down to the last 32 (or 64) bits.
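To show the comparison/subtraction idea in isolation, here is a toy sketch on a plain `uint64_t` rather than a bigint. It is a simplification: it keeps only a table of powers of ten and counts subtractions linearly, where the answer suggests a table entry per digit value and a binary search. The function name is made up.

```c
#include <stdint.h>

/* Toy version on a 64-bit value: finds each decimal digit, most
 * significant first, using only comparisons and subtractions.
 * out must have room for at least 21 bytes. Returns the length. */
int u64_to_decimal_by_subtraction(uint64_t v, char *out)
{
    /* Precomputed table: pow10[k] = 10^k for k = 0..19 (10^19 < 2^64). */
    uint64_t pow10[20];
    pow10[0] = 1;
    for (int k = 1; k < 20; k++) pow10[k] = pow10[k - 1] * 10;

    int len = 0;
    for (int k = 19; k >= 0; k--) {
        int digit = 0;
        /* Linear count here; a table of d * 10^k would allow binary search. */
        while (v >= pow10[k]) {
            v -= pow10[k];
            digit++;
        }
        /* Suppress leading zeros, but always emit the units digit. */
        if (digit > 0 || len > 0 || k == 0)
            out[len++] = (char)('0' + digit);
    }
    out[len] = '\0';
    return len;
}
```

For a real bigint, each comparison and subtraction is itself O(n), and there are O(n) digit positions, so this approach stays quadratic overall even though it avoids division.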
The most efficient algorithm I can think of is the following. It should have a runtime complexity in O(n·(log n)²·log log n), as opposed to the naive algorithm which has quadratic runtime complexity.
quadratic runtime algorithm if they are short enough, or by recursively calling this algorithm.
In steps 2 and 5 you need a decimal multiplication algorithm. For numbers with tens of thousands of digits, you should use a version of the Schönhage-Strassen algorithm that works in base 10. This leads to the runtime complexity stated above. For shorter numbers, depending on their length, the Toom-Cook algorithm, the Karatsuba algorithm or long multiplication should be used. However, I cannot currently say how to implement the Schönhage-Strassen algorithm in base 10, as all the complete descriptions of it I could find are for base 2 and I do not know enough number theory to derive it myself.