查找整数的位数
查找正整数的位数的最佳方法是什么?
我发现了这 3 种基本方法:
转换为字符串
String s = new Integer(t).toString(); int len = s.length();
for 循环
for(long long int temp = number; temp >= 1;) { 温度/=10; 小数位数++; }
对数计算
数字 = 下限( log10( 数字 ) ) + 1;
,您可以在大多数语言中计算 log10(x) = ln(x) / ln(10)。
首先,我认为字符串方法是最脏的方法,但我想得越多,我就越认为它是最快的方法。或者是吗?
What is the best method to find the number of digits of a positive integer?
I have found this 3 basic methods:
conversion to string
String s = new Integer(t).toString(); int len = s.length();
for loop
for(long long int temp = number; temp >= 1;) { temp/=10; decimalPlaces++; }
logaritmic calculation
digits = floor( log10( number ) ) + 1;
where you can calculate log10(x) = ln(x) / ln(10) in most languages.
First I thought the string method is the dirtiest one but the more I think about it the more I think it's the fastest way. Or is it?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(19)
正确的答案是测量它 - 但您应该能够猜测转换字符串并遍历它们寻找结束标记所涉及的 CPU 步骤数
然后想想您的处理器可以执行多少个 FPU 操作以及计算单个对数是多么容易。
编辑:在周一早上浪费更多时间:-)
高级语言的问题之一是猜测系统在一个看似简单的语句的幕后做了多少工作。强制 Joel 链接
此语句涉及为字符串分配内存,可能还涉及几个临时副本的一个字符串。它必须解析整数并将其数字复制到字符串中,如果数字很大,可能需要重新分配和移动现有内存。它可能必须检查一堆区域设置来决定您的国家/地区是否使用“,”或“.”,它可能必须执行一堆 unicode 转换。
然后找到长度必须扫描整个字符串,再次考虑 unicode 和任何本地特定设置,例如 - 您使用的是右->左语言吗?
或者:
仅仅因为这对你来说在纸上做起来比较困难,并不意味着这对计算机来说很难!事实上,高性能计算中的一个好规则似乎是——如果某件事对人类来说很难(流体动力学、3D 渲染),那么对计算机来说就很容易,而如果对人类来说很容易(面部识别、检测声音)房间很吵)电脑很难用!
您通常可以假设内置数学函数 log/sin/cos 等 - 50 年来一直是计算机设计的重要组成部分。因此,即使它们没有直接映射到 FPU 中的硬件功能,您也可以打赌替代实现非常高效。
Well the correct answer would be to measure it - but you should be able to make a guess about the number of CPU steps involved in converting strings and going through them looking for an end marker
Then think how many FPU operations/s your processor can do and how easy it is to calculate a single log.
edit: wasting some more time on a monday morning :-)
One of the problems with high level languages is guessing how much work the system is doing behind the scenes of an apparently simple statement. Mandatory Joel link
This statement involves allocating memory for a string, and possibly a couple of temporary copies of a string. It must parse the integer and copy the digits of it into a string, possibly having to reallocate and move the existing memory if the number is large. It might have to check a bunch of locale settings to decide if your country uses "," or ".", it might have to do a bunch of unicode conversions.
Then finding the length has to scan the entire string, again considering unicode and any local specific settings such as - are you in a right->left language?.
Alternatively:
Just because this would be harder for you to do on paper doesn't mean it's hard for a computer! In fact a good rule in high performance computing seems to have been - if something is hard for a human (fluid dynamics, 3d rendering) it's easy for a computer, and if it's easy for a human (face recognition, detecting a voice in a noisy room) it's hard for a computer!
You can generally assume that the builtin maths functions log/sin/cos etc - have been an important part of computer design for 50years. So even if they don't map directly into a hardware function in the FPU you can bet that the alternative implementation is pretty efficient.
我不知道,而且答案可能会有所不同,具体取决于您的个人语言的实现方式。
所以,压力测试吧!实施所有三个解决方案。在 1 到 1,000,000(或代表解决方案将运行的数字的其他一些巨大数字集)上运行它们,并计算每个数字所需的时间。
让您的解决方案相互竞争,然后让他们一决胜负。就像智力角斗士一样。三种算法登场!一种算法离开!
I don't know, and the answer may well be different depending on how your individual language is implemented.
So, stress test it! Implement all three solutions. Run them on 1 through 1,000,000 (or some other huge set of numbers that's representative of the numbers the solution will be running against) and time how long each of them takes.
Pit your solutions against one another and let them fight it out. Like intellectual gladiators. Three algorithms enter! One algorithm leaves!
这个算法也可能很好,假设:
我们不知道数字边界
该算法的速度应该与提供的 for-loop (2) 相当,但由于(2 位移位、加法和减法,而不是除法),速度要快一些)。
至于Log10算法,它只会给你近似的答案(接近真实,但仍然),因为计算Log函数的解析公式有无限循环,无法精确计算 维基。
This algorithm might be good also, assuming that:
We don't known number boundaries
This algorithm, should have speed comparable to for-loop (2) provided, but a bit faster due to (2 bit-shifts, add and subtract, instead of division).
As for Log10 algorithm, it will give you only approximate answer (that is close to real, but still), since analytic formula for computing Log function have infinite loop and can't be calculated precisely Wiki.
测试条件
结果
注意:作者避免做出任何结论对于超过 10 位的数字。
脚本
Test conditions
Results
Note: Author refrains from making any conclusions for numbers with more than 10 digits.
Script
关于您提出的“确定在给定基数中表示给定数字所需的位数”的三种方法,实际上我不喜欢其中任何一个;我更喜欢下面给出的方法。
关于你的方法#1(字符串):任何涉及字符串和数字之间来回转换的操作通常都非常慢。
Re your method #2 (temp/=10): 这是致命的缺陷,因为它假设 x/10 总是意味着“x 除以 10”。但在许多编程语言(例如:C、C++)中,如果“x”是整数类型,那么“x/10”表示“整数除法”,这与浮点除法不是一回事,它引入了每次迭代时都会舍入误差,并且它们会累积在递归公式中,例如您的解决方案 #2 使用的那样。
关于你的方法 #3(日志):对于大数(至少在 C 中,可能还有其他语言),它是有问题的,因为浮点数据类型往往不如 64 位整数那么精确。
因此,我不喜欢所有这 3 种方法:#1 可以工作,但速度很慢,#2 已损坏,#3 对于大量数据来说有问题。相反,我更喜欢这个,它适用于从 0 到大约 18.44 quintillion 的数字:
Regarding the three methods you propose for "determining the number of digits necessary to represent a given number in a given base", I don't like any of them, actually; I prefer the method I give below instead.
Re your method #1 (strings): Anything involving converting back-and-forth between strings and numbers is usually very slow.
Re your method #2 (temp/=10): This is fatally flawed because it assumes that x/10 always means "x divided by 10". But in many programming languages (eg: C, C++), if "x" is an integer type, then "x/10" means "integer division", which isn't the same thing as floating-point division, and it introduces round-off errors at every iteration, and they accumulate in a recursive formula such as your solution #2 uses.
Re your method #3 (logs): it's buggy for large numbers (at least in C, and probably other languages as well), because floating-point data types tend not to be as precise as 64-bit integers.
Hence I dislike all 3 of those methods: #1 works but is slow, #2 is broken, and #3 is buggy for large numbers. Instead, I prefer this, which works for numbers from 0 up to about 18.44 quintillion:
转换为字符串:这将必须迭代每个数字,找到映射到当前数字的字符,将一个字符添加到字符集合中。然后获取生成的 String 对象的长度。将在 O(n) 中运行 n=#digits。
for循环:将执行2个数学运算:将数字除以10并递增计数器。将在 O(n) for n=#digits 中运行。
对数:将调用log10和floor,并加1。看起来像O(1),但我不太确定log10或floor函数有多快。由于缺乏使用,我对这类事情的了解已经萎缩,因此这些函数中可能存在隐藏的复杂性。
所以我想这可以归结为:查找数字映射是否比多个数学运算或
log10
中发生的任何事情更快?答案可能会有所不同。可能有一些平台的字符映射速度更快,而其他平台的计算速度也更快。另外要记住的是,第一个方法将创建一个新的 String 对象,该对象仅出于获取长度的目的而存在。这可能会比其他两种方法使用更多的内存,但这可能重要也可能无关紧要。conversion to string: This will have to iterate through each digit, find the character that maps to the current digit, add a character to a collection of characters. Then get the length of the resulting String object. Will run in O(n) for n=#digits.
for-loop: will perform 2 mathematical operation: dividing the number by 10 and incrementing a counter. Will run in O(n) for n=#digits.
logarithmic: Will call log10 and floor, and add 1. Looks like O(1) but I'm not really sure how fast the log10 or floor functions are. My knowledge of this sort of things has atrophied with lack of use so there could be hidden complexity in these functions.
So I guess it comes down to: is looking up digit mappings faster than multiple mathematical operations or whatever is happening in
log10
? The answer will probably vary. There could be platforms where the character mapping is faster, and others where doing the calculations is faster. Also to keep in mind is that the first method will creats a new String object that only exists for the purpose of getting the length. This will probably use more memory than the other two methods, but it may or may not matter.显然,您可以将方法 1 从竞争中排除,因为它使用的 atoi/toString 算法与方法 2 类似。
方法 3 的速度取决于是否为指令集包含 log base 10 的系统编译代码。
You can obviously eliminate the method 1 from the competition, because the atoi/toString algorithm it uses would be similar to method 2.
Method 3's speed depends on whether the code is being compiled for a system whose instruction set includes log base 10.
无论您使用什么编程语言,都使用最简单的解决方案。我想不出计算整数中的数字会成为任何(有用)程序的瓶颈的情况。
C、C++:
Haskell:
JavaScript:
PHP:
Visual Basic(未经测试):
Use the simplest solution in whatever programming language you're using. I can't think of a case where counting digits in an integer would be the bottleneck in any (useful) program.
C, C++:
Haskell:
JavaScript:
PHP:
Visual Basic (untested):
对于非常大的整数,对数方法要快得多。例如,对于 2491327 位数字(第 11920928 个斐波那契数,如果你关心的话),Python 需要几分钟来执行除以 10 算法,并需要几毫秒来执行
1+floor(log(n,10) )
。For very large integers, the log method is much faster. For instance, with a 2491327 digit number (the 11920928th Fibonacci number, if you care), Python takes several minutes to execute the divide-by-10 algorithm, and milliseconds to execute
1+floor(log(n,10))
.保持简单:
Keep it simple:
log(x,n)-mod(log(x,n),1)+1
其中 x 是底数,n 是数字。
log(x,n)-mod(log(x,n),1)+1
Where x is a the base and n is the number.
这是 Swift 4 中的测量。
算法代码:
测量代码:
在此测量基础上字符串转换是 Swift 语言的最佳选择。
Here is the measurement in Swift 4.
Algorithms code:
Measurement code:
On this measurement basis String conversion is the best option for the Swift language.
您可以使用递归解决方案而不是循环,但在某种程度上相似:
对于长数,情况可能会改变 - 针对不同的算法独立地测量小数和长数,并根据您的典型输入选择合适的算法。 :)
当然,没有什么比 switch 更好的了:
除了普通的数组:
有些人会告诉你优化代码大小,但是你知道,过早的优化......
You can use a recursive solution instead of a loop, but somehow similar:
With longs, the picture might change - measure small and long numbers independently against different algorithms, and pick the appropriate one, depending on your typical input. :)
Of course nothing beats a switch:
except a plain-o-array:
Some people will tell you to optimize the code-size, but yaknow, premature optimization ...
看到 @daniel.sedlacek 结果后我很好奇,因此我使用 Swift 对 10 位以上的数字进行了一些测试。我在操场上运行了以下脚本。
I was curious after seeing @daniel.sedlacek results so I did some testing using Swift for numbers having more than 10 digits. I ran the following script in the playground.
F# 版本,无需转换为字符串。
F# Version, without casting to a string.
在 Swift 5.x 中,您可以按如下方式获取整数中的位数:
In Swift 5.x, you get the number of digit in integer as below :
在许多已经提到的方法中添加一种方法。
这个想法是在包含基于
int
数据类型的digits
的整数范围的数组上使用 binarySearch。Java Arrays 类
binarySearch
的签名是:binarySearch(dataType[] array, dataType key) 返回搜索键的索引(如果它包含在数组中);否则,
(-(插入点) – 1)
。插入点定义为将键插入数组的点。
下面是实现:
请注意,上述方法仅适用于:Integer.MIN_VALUE <=
N
<= Integer.MAX_VALUE,但可以轻松扩展为 <通过向digits
数组添加更多值来实现 code>Long 数据类型。例如,
I) 对于 N = 555,digitCount = Arrays.binarySearch(digits , 555) 返回
-3
(-(2)-1),因为它不存在于数组中,但应该在点处插入2
位于 9 和 之间99 类似于 [9, 55, 99]。由于我们得到的索引是负数,因此我们需要对结果进行按位补。
最后,我们需要将结果加 1,以获得数字
N
中的实际位数。Adding one more approach to many of the already mentioned approaches.
The idea is to use binarySearch on an array containing the range of integers based on the
digits
of theint
data type.The signature of Java Arrays class
binarySearch
is :binarySearch(dataType[] array, dataType key) which returns the index of the search key, if it is contained in the array; otherwise,
(-(insertion point) – 1)
.The insertion point is defined as the point at which the key would be inserted into the array.
Below is the implementation:
Please note that the above approach only works for : Integer.MIN_VALUE <=
N
<= Integer.MAX_VALUE, but can be easily extended forLong
data type by adding more values to thedigits
array.For example,
I) for N = 555, digitCount = Arrays.binarySearch(digits , 555) returns
-3
(-(2)-1) as it's not present in the array but is supposed to be inserted at point2
between 9 & 99 like [9, 55, 99].As the index we got is negative we need to take the bitwise compliment of the result.
At last, we need to add 1 to the result to get the actual number of digits in the number
N
.总是有这样的方法:
There's always this method: