How to count each digit in a range of integers?

Posted on 2024-08-18 11:17:18

Imagine you sell those metallic digits used to number houses, locker doors, hotel rooms, etc. You need to find how many of each digit to ship when your customer needs to number doors/houses:

  • 1 to 100
  • 51 to 300
  • 1 to 2,000 with zeros to the left

The obvious solution is to do a loop from the first to the last number, convert the counter to a string with or without zeros to the left, extract each digit and use it as an index to increment an array of 10 integers.
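
As a point of reference, the brute-force approach described above could be sketched in Python as follows. The function name `count_digits_brute` and its `leading_zeros` flag are illustrative choices, not part of the original question; padding to the width of the last number is an assumption about what "zeros to the left" means.

```python
def count_digits_brute(first, last, leading_zeros=False):
    """Tally digits 0-9 over first..last (inclusive), one number at a time."""
    # Assumption: with leading zeros, every number is padded to the width of `last`.
    width = len(str(last)) if leading_zeros else 0
    counts = [0] * 10
    for number in range(first, last + 1):
        for ch in str(number).zfill(width):
            counts[int(ch)] += 1
    return counts


print(count_digits_brute(1, 100))
print(count_digits_brute(51, 300))
print(count_digits_brute(1, 2000, leading_zeros=True))
```

The result is a list indexed by digit, so `counts[7]` is the number of 7s to ship.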

I wonder if there is a better way to solve this, without having to loop through the entire integers range.

Solutions in any language or pseudocode are welcome.


Edit:

Answers review
John at CashCommons and Wayne Conrad comment that my current approach is good and fast enough. Let me use a silly analogy: If you were given the task of counting the squares in a chess board in less than 1 minute, you could finish the task by counting the squares one by one, but a better solution is to count the sides and do a multiplication, because you later may be asked to count the tiles in a building.
Alex Reisner points to a very interesting mathematical law that, unfortunately, doesn’t seem to be relevant to this problem.
Andres suggests the same algorithm I’m using, but extracting digits with %10 operations instead of substrings.
John at CashCommons and phord propose pre-calculating the digits required and storing them in a lookup table or, for raw speed, an array. This could be a good solution if we had an absolute, unmovable, set in stone, maximum integer value. I’ve never seen one of those.
High-Performance Mark and strainer computed the needed digits for various ranges. The result for one million seems to indicate there is a proportion, but the results for other numbers show different proportions.
strainer found some formulas that may be used to count digits for numbers which are a power of ten.
Robert Harvey had a very interesting experience posting the question at MathOverflow. One of the math guys wrote a solution using mathematical notation.
Aaronaught developed and tested a solution using mathematics. After posting it he reviewed the formulas originating from MathOverflow and found a flaw in them (point to Stack Overflow :).
noahlavine developed an algorithm and presented it in pseudocode.

A new solution
After reading all the answers, and doing some experiments, I found that for a range of integers from 1 to 10^n - 1:

  • For digits 1 to 9, n*10^(n-1) pieces are needed
  • For digit 0, if not using leading zeros, n*10^(n-1) - (10^n - 1)/9 are needed
  • For digit 0, if using leading zeros, n*10^(n-1) - n are needed

The first formula was found by strainer (and probably by others), and I found the other two by trial and error (but they may be included in other answers).

For example, if n = 6, range is 1 to 999,999:

  • For digits 1 to 9 we need 6*10^5 = 600,000 of each one
  • For digit 0, without leading zeros, we need 6*10^5 - (10^6 - 1)/9 = 600,000 - 111,111 = 488,889
  • For digit 0, with leading zeros, we need 6*10^5 - 6 = 599,994

These numbers can be checked against High-Performance Mark's results, which cover 1 to 1,000,000 and therefore include one extra 1 and six extra 0s.
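
The three closed forms for the range 1 to 10^n - 1 can be checked against a direct count for small n. This is only a sketch: the helper names `counts_up_to_nines` and `brute` are made up for illustration, and the leading-zero case assumes padding to n digits.

```python
def counts_up_to_nines(n, leading_zeros=False):
    """Digit counts for 1 .. 10**n - 1 from the closed-form expressions."""
    per_digit = n * 10 ** (n - 1)              # digits 1 to 9
    if leading_zeros:
        zeros = per_digit - n                  # only the n zeros of 00...0 are missing
    else:
        zeros = per_digit - (10 ** n - 1) // 9
    return [zeros] + [per_digit] * 9


def brute(n, leading_zeros=False):
    """Same counts obtained by visiting every number (assumed padding: n digits)."""
    counts = [0] * 10
    for number in range(1, 10 ** n):
        text = str(number).zfill(n) if leading_zeros else str(number)
        for ch in text:
            counts[int(ch)] += 1
    return counts


for n in range(1, 5):
    assert counts_up_to_nines(n) == brute(n)
    assert counts_up_to_nines(n, True) == brute(n, True)

print(counts_up_to_nines(6)[:2])        # [488889, 600000]
print(counts_up_to_nines(6, True)[:2])  # [599994, 600000]
```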

Using these formulas, I improved the original algorithm. It still loops from the first to the last number in the range of integers, but when it finds a number ending in one or more zeros (a multiple of a power of ten), it uses the formulas to add to the digit counts the quantity for a full range of 1 to 9, 1 to 99, 1 to 999, etc., and then skips past that range. Here's the algorithm in pseudocode:

integer First,Last //First and last number in the range
integer Number     //Current number in the loop
integer Power      //Power is the n in 10^n in the formulas
integer Nines      //Nines is the result of 10^n - 1, 10^5 - 1 = 99999
integer Prefix     //First digits in a number. For 14,200, prefix is 142
array 0..9  Digits //Will hold the count for all the digits

FOR Number = First TO Last
  CALL TallyDigitsForOneNumber WITH Number,1  //Tally the count of each digit 
                                              //in the number, increment by 1
  //Start of optimization. Comments are for Number = 1,000 and Last = 8,000.
  Power = Zeros at the end of number //For 1,000, Power = 3
  IF Power > 0                       //The number ends in 0 00 000 etc 
    Nines = 10^Power-1                 //Nines = 10^3 - 1 = 1000 - 1 = 999
    IF Number+Nines <= Last            //If 1,000+999 <= 8,000, add a full set
      Digits[0-9] += Power*10^(Power-1)  //Add 3*10^(3-1) = 300 to digits 0 to 9
      Digits[0]   -= Power               //Adjust digit 0 (leading zeros formula)
      Prefix = First digits of Number    //For 1000, prefix is 1
      CALL TallyDigitsForOneNumber WITH Prefix,Nines //Tally the count of each 
                                                     //digit in prefix,
                                                     //increment by 999
      Number += Nines                    //Increment the loop counter 999 cycles
    ENDIF
  ENDIF 
  //End of optimization
ENDFOR  

SUBROUTINE TallyDigitsForOneNumber PARAMS Number,Count
  REPEAT
    Digits [ Number % 10 ] += Count
    Number = Number / 10
  UNTIL Number = 0

For example, for range 786 to 3,021, the counter will be incremented:

  • By 1 from 786 to 790 (5 cycles)
  • By 9 from 790 to 799 (1 cycle)
  • By 1 from 799 to 800
  • By 99 from 800 to 899
  • By 1 from 899 to 900
  • By 99 from 900 to 999
  • By 1 from 999 to 1000
  • By 999 from 1000 to 1999
  • By 1 from 1999 to 2000
  • By 999 from 2000 to 2999
  • By 1 from 2999 to 3000
  • By 1 from 3000 to 3010 (10 cycles)
  • By 9 from 3010 to 3019 (1 cycle)
  • By 1 from 3019 to 3021 (2 cycles)

Total: 28 cycles
Without optimization: 2,235 cycles
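
For readers who prefer runnable code, the pseudocode above might translate to Python roughly as follows. This is a sketch rather than the author's benchmark source: the names `tally` and `count_digits_fast` are illustrative, and digit 0 is adjusted by subtracting `power`, as the leading-zeros formula requires.

```python
def tally(digits, number, count):
    """Add `count` to the tally of every digit of `number`."""
    while True:
        digits[number % 10] += count
        number //= 10
        if number == 0:
            break


def count_digits_fast(first, last):
    """Digit counts for first..last (inclusive), without leading zeros."""
    digits = [0] * 10
    number = first
    while number <= last:
        tally(digits, number, 1)
        # power = how many zeros the number ends in (0 for the number 0 itself)
        power = 0
        if number > 0:
            while number % 10 ** (power + 1) == 0:
                power += 1
        if power > 0:
            nines = 10 ** power - 1
            if number + nines <= last:
                # The next `nines` numbers share a prefix; their low digits run
                # through 0..01 to 9..9, so apply the full-set formulas at once.
                full_set = power * 10 ** (power - 1)
                for d in range(10):
                    digits[d] += full_set
                digits[0] -= power
                tally(digits, number // 10 ** power, nines)
                number += nines
        number += 1
    return digits


print(count_digits_fast(786, 3021))
```

Comparing the output with a plain per-number loop over the same range is an easy way to confirm the shortcut.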

Note that this algorithm solves the problem without leading zeros. To use it with leading zeros, I used a hack:

If range 700 to 1,000 with leading zeros is needed, use the algorithm for 10,700 to 11,000 and then subtract 1,000 - 700 + 1 = 301 (one for each number in the range) from the count of digit 1.
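
The offset trick can be illustrated with a short Python sketch. Here `count_digits` is a plain stand-in for any counter that ignores leading zeros, and `count_with_leading_zeros` is a hypothetical wrapper; the count removed from digit 1 is one per number in the range, since every shifted number carries exactly one artificial leading 1.

```python
def count_digits(first, last):
    """Stand-in counter without leading zeros (any faster version would do)."""
    counts = [0] * 10
    for number in range(first, last + 1):
        for ch in str(number):
            counts[int(ch)] += 1
    return counts


def count_with_leading_zeros(first, last):
    """Pad every number to the width of `last` by shifting the whole range."""
    offset = 10 ** len(str(last))          # 10,000 when `last` has four digits
    counts = count_digits(first + offset, last + offset)
    counts[1] -= last - first + 1          # remove the artificial leading 1s
    return counts


print(count_with_leading_zeros(700, 1000))
```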

Benchmark and Source code

I tested the original approach, the same approach using %10 and the new solution for some large ranges, with these results:

Original             104.78 seconds
With %10              83.66
With Powers of Ten     0.07

[Screenshot of the benchmark application. Source: clarion.sca.mx]

If you would like to see the full source code or run the benchmark, use these links:

Accepted answer

noahlavine's solution may be correct, but I just couldn't follow the pseudocode; I think there are some details missing or not completely explained.

Aaronaught's solution seems to be correct, but the code is just too complex for my taste.

I accepted strainer’s answer, because his line of thought guided me to develop this new solution.

倾城°AllureLove 2024-08-25 11:17:18

There's a clear mathematical solution to a problem like this. Let's assume the value is zero-padded to the maximum number of digits (it's not, but we'll compensate for that later), and reason through it:

  • From 0-9, each digit occurs once
  • From 0-99, each digit occurs 20 times (10x in position 1 and 10x in position 2)
  • From 0-999, each digit occurs 300 times (100x in P1, 100x in P2, 100x in P3)

The obvious pattern for any given digit, if the range is from 0 to one less than a power of 10 (zero-padded), is N * 10^(N-1), where N is the power of 10.
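
The pattern is easy to confirm by direct counting for small N. The sketch below assumes every number in 0 to 10^N - 1 is zero-padded to N digits, as in the reasoning above; `padded_counts` is an illustrative helper name.

```python
from collections import Counter


def padded_counts(n):
    """Digit counts over 0 .. 10**n - 1, each number zero-padded to n digits."""
    counts = Counter()
    for number in range(10 ** n):
        counts.update(str(number).zfill(n))
    return counts


for n in range(1, 5):
    expected = n * 10 ** (n - 1)
    counts = padded_counts(n)
    assert all(counts[str(d)] == expected for d in range(10))
    print(n, expected)
```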

What if the range is not a power of 10? Start with the lowest power of 10, then work up. The easiest case to deal with is a maximum like 399. We know that for each multiple of 100, each digit occurs at least 20 times, but we have to compensate for the number of times it appears in the most-significant-digit position, which is going to be exactly 100 for digits 0-3, and exactly zero for all other digits. Specifically, the extra amount to add is 10^N for the relevant digits.

Putting this into a formula, for upper bounds that are 1 less than some multiple of a power of 10 (i.e. 399, 6999, etc.) it becomes: M * N * 10^(N-1) + iif(d < M, 10^N, 0), where M is the multiple (4 for 399, since 399 = 4 * 100 - 1) and d is the digit being counted.

Now you just have to deal with the remainder (which we'll call R). Take 445 as an example. This is whatever the result is for 399, plus the range 400-445. In this range, the MSD occurs R + 1 more times (46 times for R = 45), and all digits (including the MSD) also occur at the same frequencies they would from range [0 - R].

Now we just have to compensate for the leading zeros. This pattern is easy - it's just:

10^N + 10^(N-1) + 10^(N-2) + ... + 10^0

Update: This version correctly takes into account "padding zeros", i.e. the zeros in middle positions when dealing with the remainder ([400, 401, 402, ...]). Figuring out the padding zeros is a bit ugly, but the revised code (C-style pseudocode) handles it:

function countdigits(int d, int low, int high) {
    return countdigits(d, low, high, false);
}

function countdigits(int d, int low, int high, bool inner) {
    if (high == 0)
        return (d == 0) ? 1 : 0;

    if (low > 0)
        return countdigits(d, 0, high) - countdigits(d, 0, low - 1);

    int n = floor(log10(high));
    int m = floor((high + 1) / pow(10, n));
    int r = high - m * pow(10, n);
    return
        (max(m, 1) * n * pow(10, n-1)) +                             // (1)
        ((d < m) ? pow(10, n) : 0) +                                 // (2)
        (((r >= 0) && (n > 0)) ? countdigits(d, 0, r, true) : 0) +   // (3)
        (((r >= 0) && (d == m)) ? (r + 1) : 0) +                     // (4)
        (((r >= 0) && (d == 0)) ? countpaddingzeros(n, r) : 0) -     // (5)
        (((d == 0) && !inner) ? countleadingzeros(n) : 0);           // (6)
}

function countleadingzeros(int n) {
    int tmp = 0;
    do {
        tmp = pow(10, n) + tmp;
        --n;
    } while (n > 0);
    return tmp;
}

function countpaddingzeros(int n, int r) {
    return (r + 1) * max(0, n - max(0, floor(log10(r))) - 1);
}

As you can see, it's gotten a bit uglier but it still runs in O(log n) time, so if you need to handle numbers in the billions, this will still give you instant results. :-) And if you run it on the range [0 - 1000000], you get the exact same distribution as the one posted by High-Performance Mark, so I'm almost positive that it's correct.

FYI, the reason for the inner variable is that the leading-zero function is already recursive, so it can only be counted in the first execution of countdigits.

Update 2: In case the code is hard to read, here's a reference for what each line of the countdigits return statement means (I tried inline comments but they made the code even harder to read):

  1. Frequency of any digit up to highest power of 10 (0-99, etc.)
  2. Frequency of MSD above any multiple of highest power of 10 (100-399)
  3. Frequency of any digits in remainder (400-445, R = 45)
  4. Additional frequency of MSD in remainder
  5. Count zeros in middle position for remainder range (404, 405...)
  6. Subtract leading zeros only once (on outermost loop)
╰ゝ天使的微笑 2024-08-25 11:17:18

I'm assuming you want a solution where the numbers are in a range, and you have the starting and ending number. Imagine starting with the start number and counting up until you reach the end number - it would work, but it would be slow. I think the trick to a fast algorithm is to realize that in order to go up one digit in the 10^x place and keep everything else the same, you need to use all of the digits before it 10^x times plus all digits 0-9 10^(x-1) times. (Except that your counting may have involved a carry past the x-th digit - I correct for this below.)

Here's an example. Say you're counting from 523 to 1004.

  • First, you count from 523 to 524. This uses the digits 5, 2, and 4 once each.
  • Second, count from 524 to 604. The rightmost digit does 6 cycles through all of the digits, so you need 6 copies of each digit. The second digit goes through digits 2 through 0, 10 times each. The third digit is 6 five times and 5 (100 - 24) times.
  • Third, count from 604 to 1004. The rightmost digit does 40 cycles, so add 40 copies of each digit. The second-from-right digit does 4 cycles, so add 4 copies of each digit. The leftmost digit does 100 each of 7, 8, and 9, plus 5 of 0 and (100 - 5) of 6. The last digit is 1, five times.

To speed up the last bit, look at the part about the rightmost two places. It uses each digit 10 + 1 times. In general, 1 + 10 + ... + 10^n = (10^(n+1) - 1)/9, which we can use to speed up counting even more.

My algorithm is to count up from the start number to the end number (using base-10 counting), but use the fact above to do it quickly. You iterate through the digits of the starting number from least to most significant, and at each place you count up so that that digit is the same as the one in the ending number. At each point, n is the number of up-counts you need to do before you get to a carry, and m the number you need to do afterwards.

Now let's assume pseudocode counts as a language. Here, then, is what I would do:

convert start and end numbers to digit arrays start[] and end[]
create an array counts[] with 10 elements which stores the number of copies of
     each digit that you need

iterate through start number from right to left. at the i-th digit,
    let d be the number of digits you must count up to get from this digit
        to the i-th digit in the ending number. (i.e. subtract the equivalent
        digits mod 10)
    add d * (10^i - 1)/9 to each entry in count.
    let m be the numerical value of all the digits to the right of this digit,
        n be 10^i - m.
    for each digit e from the left of the starting number up to and including the
        i-th digit, add n to the count for that digit.
    for j in 1 to d
        increment the i-th digit by one, including doing any carries
        for each digit e from the left of the starting number up to and including
            the i-th digit, add 10^i to the count for that digit
    for each digit e from the left of the starting number up to and including the
        i-th digit, add m to the count for that digit.
    set the i-th digit of the starting number to be the i-th digit of the ending
        number.

Oh, and since the value of i increases by one each time, keep track of your old 10^i and just multiply it by 10 to get the new one, instead of exponentiating each time.

谜兔 2024-08-25 11:17:18

To reel off the digits from a number, we'd only ever need to do a costly string conversion if we couldn't do a mod; digits can most quickly be pulled off a number like this:

feed=number;
do
{ digit=feed%10;
  feed/=10; 
  //use digit... eg. digitTally[digit]++;
  }
while(feed>0)

That loop should be very fast and can just be placed inside a loop over the start to end numbers for the simplest way to tally the digits.
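
In Python, the same modulo-based extraction wrapped in the outer loop over the range might look like this. It is a minimal sketch; `tally_range` is an illustrative name, and `divmod` simply performs the `%10` and `/10` steps together.

```python
def tally_range(first, last):
    """Tally digits of first..last (inclusive) using only integer arithmetic."""
    digit_tally = [0] * 10
    for number in range(first, last + 1):
        feed = number
        while True:
            feed, digit = divmod(feed, 10)   # peel off the lowest digit
            digit_tally[digit] += 1
            if feed == 0:
                break
    return digit_tally


print(tally_range(51, 300))
```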

To go faster, for larger ranges of numbers, I'm looking for an optimised method of tallying all digits from 0 to number*10^significance
(from a start to an end bazzogles me)

Here is a table showing digit tallies for some top values with a single significant digit.
These are inclusive of 0, but not the top value itself (that was an oversight,
but it's maybe a bit easier to see patterns with the top value's digits absent here).
These tallies don't include trailing zeros.

digit     1    10   100  1000 10000     2    20    30    40    60    90   200   600  2000  6000

    0     1     1    10   190  2890     1     2     3     4     6     9    30   110   490  1690
    1     0     1    20   300  4000     1    12    13    14    16    19   140   220  1600  2800
    2     0     1    20   300  4000     0     2    13    14    16    19    40   220   600  2800
    3     0     1    20   300  4000     0     2     3    14    16    19    40   220   600  2800
    4     0     1    20   300  4000     0     2     3     4    16    19    40   220   600  2800
    5     0     1    20   300  4000     0     2     3     4    16    19    40   220   600  2800
    6     0     1    20   300  4000     0     2     3     4     6    19    40   120   600  1800
    7     0     1    20   300  4000     0     2     3     4     6    19    40   120   600  1800
    8     0     1    20   300  4000     0     2     3     4     6    19    40   120   600  1800
    9     0     1    20   300  4000     0     2     3     4     6     9    40   120   600  1800

Edit: clearing up my original thoughts:

From the brute-force table showing tallies from 0 (included) to a power of ten (not included), it is visible that a major digit md at ten-power tp:

increments tally[0 to 9] by md*tp*10^(tp-1)
increments tally[1 to md-1] by 10^tp
decrements tally[0] by (10^tp - 10) 
(to remove leading 0s if tp>leadingzeros)
can increment tally[moresignificantdigits] by self(md*10^tp) 
(to complete an effect)

If these tally adjustments were applied for each significant digit,
the tally should be modified as though counted from 0 to end-1.

The adjustments can be inverted to remove the preceding range (start number).

Thanks Aaronaught for your complete and tested answer.

野心澎湃 2024-08-25 11:17:18

Here's a very bad answer, I'm ashamed to post it. I asked Mathematica to tally the digits used in all numbers from 1 to 1,000,000, no leading 0s. Here's what I got:

0   488895
1   600001
2   600000
3   600000
4   600000
5   600000
6   600000
7   600000
8   600000
9   600000

Next time you're ordering sticky digits for selling in your hardware store, order in these proportions, you won't be far wrong.

檐上三寸雪 2024-08-25 11:17:18

I asked this question on Math Overflow, and got spanked for asking such a simple question. One of the users took pity on me and said if I posted it to The Art of Problem Solving, he would answer it; so I did.

Here is the answer he posted:
http://www.artofproblemsolving.com/Forum/viewtopic.php?p=1741600#1741600

Embarrassingly, my math-fu is inadequate to understand what he posted (the guy is 19 years old...that is so depressing). I really need to take some math classes.

On the bright side, the equation is recursive, so it should be a simple matter to turn it into a recursive function with a few lines of code, by someone who understands the math.

只等公子 2024-08-25 11:17:18

I know this question has an accepted answer but I was tasked with writing this code for a job interview and I think I came up with an alternative solution that is fast, requires no loops and can use or discard leading zeroes as required.

It is in fact quite simple but not easy to explain.

If you list out the first n numbers

     1
     2
     3

     .
     .
     .


     9
    10
    11

It is usual to start counting the digits required from the start room number to the end room number in a left to right fashion, so for the above we have one 1, one 2, one 3 ... one 9, two 1's one zero, four 1's etc. Most solutions I have seen used this approach with some optimisation to speed it up.

What I did was to count vertically in columns, as in hundreds, tens, and units. You know the highest room number so we can calculate how many of each digit there are in the hundreds column via a single division, then recurse and calculate how many in the tens column etc. Then we can subtract the leading zeros if we like.

Easier to visualize if you use Excel to write out the numbers but use a separate column for each digit of the number

     A    B    C
     -    -    -
     0    0    1  (assuming room numbers do not start at zero)
     0    0    2
     0    0    3
     .
     .
     .
     3    6    4
     3    6    5
     .
     .
     .

     6    6    9
     6    7    0
     6    7    1

     ^
     sum in columns not rows

So if the highest room number is 671, the hundreds column will have 100 zeroes vertically, followed by 100 ones and so on up to 72 sixes (600 to 671). Ignore 100 of the zeroes if required, as we know these are all leading.

Then recurse down to the tens and perform the same operation: we know there will be 10 zeroes followed by 10 ones etc., repeated six times, then the final time down to 2 sevens. Again, we can ignore the first 10 zeroes as we know they are leading. Finally, of course, do the units, ignoring the first zero as required.

So there are no loops; everything is calculated with division. I use recursion for travelling "up" the columns until the max one is reached (in this case hundreds) and then back down, totalling as it goes.
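
The column-wise idea can be sketched in Python as below. This is not the C# code mentioned in this answer: it walks the columns with a short loop instead of recursion, and the name `count_digits_by_column` and its `leading_zeros` flag are illustrative. It assumes rooms are numbered from 1 and that padding, when requested, is to the width of the highest room number.

```python
def count_digits_by_column(last, leading_zeros=False):
    """Digit counts for rooms 1..last, computed one decimal column at a time."""
    counts = [0] * 10
    place = 1                                   # 1 = units, 10 = tens, ...
    while place <= last:
        high, rest = divmod(last, place * 10)   # complete 0-9 cycles in this column
        current, low = divmod(rest, place)      # this column's digit in `last`, and the part to its right
        for d in range(10):
            counts[d] += high * place           # each complete cycle holds `place` copies of d
            if d < current:
                counts[d] += place              # full run inside the final partial cycle
            elif d == current:
                counts[d] += low + 1            # partial run that ends at `last`
        # So far the column covers 0..last zero-padded. Drop the zero belonging to
        # room 0 and, unless padding is wanted, the leading zeros of 1 .. place - 1.
        counts[0] -= 1 if leading_zeros else place
        place *= 10
    return counts


print(count_digits_by_column(671))
print(count_digits_by_column(671, leading_zeros=True))
```

A range that does not start at 1 can be handled without leading zeros by subtracting the result for `first - 1` from the result for `last`, digit by digit.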

I wrote this in C# and can post the code if anyone is interested. I haven't done any benchmark timings, but it is essentially instant for values up to 10^18 rooms.

Could not find this approach mentioned here or elsewhere so thought it might be useful for someone.

家住魔仙堡 2024-08-25 11:17:18

Your approach is fine. I'm not sure why you would ever need anything faster than what you've described.

Or, this would give you an instantaneous solution: Before you actually need it, calculate what you would need from 1 to some maximum number. You can store the numbers needed at each step. If you have a range like your second example, it would be what's needed for 1 to 300, minus what's needed for 1 to 50.

Now you have a lookup table that can be called at will. A table up to 10,000 holds 10,000 rows of 10 counts each (roughly 400 KB with 4-byte integers) and only has to be computed once.
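
A minimal Python sketch of that lookup table, assuming no leading zeroes. The names `build_table` and `digits_needed` are made up for illustration.

```python
def build_table(max_number):
    """table[n][d] = how many times digit d is needed to write 1..n."""
    table = [[0] * 10]                 # row 0: nothing needed yet
    for n in range(1, max_number + 1):
        row = table[-1][:]             # start from the counts for 1..n-1
        for ch in str(n):
            row[int(ch)] += 1          # add the digits of n itself
        table.append(row)
    return table


def digits_needed(table, first, last):
    """Counts for first..last = counts for 1..last minus counts for 1..first-1."""
    return [hi - lo for hi, lo in zip(table[last], table[first - 1])]
```

With `table = build_table(300)`, the second example in the question is `digits_needed(table, 51, 300)`, which is just the row for 300 minus the row for 50.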

墟烟 2024-08-25 11:17:18

This doesn't answer your exact question, but it's interesting to note the distribution of first digits according to Benford's Law. For example, in many real-world data sets that span several orders of magnitude, about 30% of the values start with "1", which is somewhat counter-intuitive.

I don't know of any distributions describing subsequent digits, but you might be able to determine this empirically and come up with a simple formula for computing an approximate number of digits required for any range of numbers.
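
For reference, Benford's Law gives the probability of a leading digit d as log10(1 + 1/d). A small Python sketch (the function name `benford_probability` is made up for illustration):

```python
import math


def benford_probability(d):
    """Probability that the leading digit is d (1..9) under Benford's Law."""
    return math.log10(1 + 1 / d)


# The nine probabilities, from about 0.301 for d = 1 down to about 0.046 for d = 9.
benford_table = {d: benford_probability(d) for d in range(1, 10)}
```

Note that a run of consecutive room numbers need not follow this distribution at all: in 1 to 999, for instance, every leading digit occurs equally often. So this can only ever be a rough estimate for the numbering problem.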

〃温暖了心ぐ 2024-08-25 11:17:18

If "better" means "clearer," then I doubt it. If it means "faster," then yes, but I wouldn't use a faster algorithm in place of a clearer one without a compelling need.

#!/usr/bin/ruby1.8

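def digits_for_range(min, max, leading_zeros)
  bins = [0] * 10
  # Build a printf-style format such as "%03d" (zero-padded) or "%3d"
  # (space-padded), as wide as the largest number in the range.
  format = [
    '%',
    ('0' if leading_zeros),
    max.to_s.size,
    'd',
  ].compact.join
  (min..max).each do |i|
    s = format % i
    s.scan(/./).each do |digit|
      # Padding spaces are not digits, so skip them.
      bins[digit.to_i] += 1 unless digit == ' '
    end
  end
  bins
end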

p digits_for_range(1, 49, false) 
# => [4, 15, 15, 15, 15, 5, 5, 5, 5, 5]

p digits_for_range(1, 49, true)
# => [13, 15, 15, 15, 15, 5, 5, 5, 5, 5]

p digits_for_range(1, 10000, false)
# => [2893, 4001, 4000, 4000, 4000, 4000, 4000, 4000, 4000, 4000]

Ruby 1.8, a language known to be "dog slow," runs the above code in 0.135 seconds. That includes loading the interpreter. Don't give up an obvious algorithm unless you need more speed.

病女 2024-08-25 11:17:18

If you need raw speed over many iterations, try a lookup table:

  1. Build an array with 2 dimensions: 10 x max-house-number

    int nDigits[10000][10] ;   // Don't try this on the stack, kids!

  2. Fill each row with the count of digits required to get to that number from zero.
    Hint: Use the previous row as a start:

    n=0..9999:
       if (n>0) nDigits[n] = nDigits[n-1]
       d=0..9:
           nDigits[n][d] += countOccurrencesOf(n,d)

  3. Number of digits "between" two numbers becomes simple subtraction.

       For range=51 to 300, take the counts for 300 and subtract the counts for 50.
       0's = nDigits[300][0] - nDigits[50][0]
       1's = nDigits[300][1] - nDigits[50][1]
       2's = nDigits[300][2] - nDigits[50][2]
       3's = nDigits[300][3] - nDigits[50][3]
       etc.

丑疤怪 2024-08-25 11:17:18

You can separate each digit (look here for an example), create a histogram with entries for 0..9 (which counts how many times each digit appears in a number), and multiply by the number of 'numbers' asked for.

But if this isn't what you are looking for, can you give a better example?

Edited:

Now I think I understand the problem. I think you can compute it like this (pseudo C):

int histogram[10];
memset(histogram, 0, sizeof(histogram));

for (i = startNumber; i <= endNumber; ++i)
{
    array = separateDigits(i);   /* digits of i, one per element */
    for (j = 0; j < array.length; ++j)
    {
        histogram[array[j]]++;   /* tally the digit itself, not its position */
    }
}

separateDigits implements the function in the link.

Each position of the histogram will hold the total count of the corresponding digit. For example:

histogram[0] == total of zeros
histogram[1] == total of ones

...
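
Since the linked function is not shown here, this is a runnable Python sketch of the same loop. It assumes `separate_digits` simply peels digits off with division by 10; both function names are made up for illustration, and leading zeroes are not counted.

```python
def separate_digits(number):
    """Return the decimal digits of a non-negative integer, least significant first."""
    if number == 0:
        return [0]
    digits = []
    while number > 0:
        number, digit = divmod(number, 10)   # split off the last digit
        digits.append(digit)
    return digits


def digit_histogram(start_number, end_number):
    """histogram[d] = how many times digit d appears in start_number..end_number."""
    histogram = [0] * 10
    for i in range(start_number, end_number + 1):
        for digit in separate_digits(i):
            histogram[digit] += 1            # index by the digit's value
    return histogram
```

The order of the digits does not matter for the tally, which is why the least-significant-first order is left as it comes out of the division.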

Regards
