Fastest way to pad a number to a given number of digits in Java

Posted 2024-09-04 18:17:30

I'm trying to write a well-optimised bit of code that creates a number X digits in length (where X is read from a runtime properties file), based on a DB-generated sequence number (Y), which is then used as a folder name when saving a file.

I've come up with three ideas so far, the fastest of which is the last one, but I'd appreciate any advice people may have on this...

1) Instantiate a StringBuilder with initial capacity X. Append Y. While length < X, insert a zero at pos zero.

2) Instantiate a StringBuilder with initial capacity X. While length < X, append a zero. Create a DecimalFormat based on StringBuilder value, and then format the number when it's needed.

3) Create a new int of Math.pow( 10, X ) and add Y. Use String.valueOf() on the new number and then substring(1) it.

The second one can obviously be split into outside-loop and inside-loop sections.

So, any tips? Using a for-loop of 10,000 iterations, I'm getting similar timings from the first two, and the third method is approximately ten times faster. Does this seem correct?

Full test-method code below...

    // Setup test variables
    int numDigits = 9;
    int testNumber = 724;
    int numIterations = 10000;
    String folderHolder = null;
    DecimalFormat outputFormat = new DecimalFormat( "#,##0" );

    // StringBuilder test
    long before = System.nanoTime();
    for ( int i = 0; i < numIterations; i++ )
    {
        StringBuilder sb = new StringBuilder( numDigits );
        sb.append( testNumber );
        while ( sb.length() < numDigits )
        {
            sb.insert( 0, 0 );
        }

        folderHolder = sb.toString();
    }
    long after = System.nanoTime();
    System.out.println( "01: " + outputFormat.format( after - before ) + " nanoseconds" );
    System.out.println( "Sanity check: Folder = \"" + folderHolder + "\"" );

    // DecimalFormat test
    before = System.nanoTime();
    StringBuilder sb = new StringBuilder( numDigits );
    while ( sb.length() < numDigits )
    {
        sb.append( 0 );
    }
    DecimalFormat formatter = new DecimalFormat( sb.toString() );
    for ( int i = 0; i < numIterations; i++ )
    {
        folderHolder = formatter.format( testNumber );
    }
    after = System.nanoTime();
    System.out.println( "02: " + outputFormat.format( after - before ) + " nanoseconds" );
    System.out.println( "Sanity check: Folder = \"" + folderHolder + "\"" );

    // Substring test
    before = System.nanoTime();
    int baseNum = (int)Math.pow( 10, numDigits );
    for ( int i = 0; i < numIterations; i++ )
    {
        int newNum = baseNum + testNumber;
        folderHolder = String.valueOf( newNum ).substring( 1 );
    }
    after = System.nanoTime();
    System.out.println( "03: " + outputFormat.format( after - before ) + " nanoseconds" );
    System.out.println( "Sanity check: Folder = \"" + folderHolder + "\"" );
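
One caveat with the third approach: 10^X only fits in an int while X is at most 9, so a larger value read from the properties file would break the `(int)Math.pow( 10, numDigits )` trick. Below is a minimal sketch of the same idea using long arithmetic. It assumes Y is non-negative and X is between 1 and 18; the class and method names are hypothetical and not part of the test code above.

```java
class SubstringPad {
    // Hypothetical helper: the "add 10^X, then drop the leading 1" trick with long arithmetic.
    static String pad(long value, int numDigits) {
        if (value < 0 || numDigits < 1 || numDigits > 18) {
            throw new IllegalArgumentException("unsupported input");
        }
        // Build 10^numDigits with integer multiplication to avoid going through double.
        long base = 1;
        for (int i = 0; i < numDigits; i++) {
            base *= 10;
        }
        if (value >= base) {
            throw new IllegalArgumentException("value has more than " + numDigits + " digits");
        }
        // base + value is a 1 followed by the zero-padded digits; drop the leading 1.
        return String.valueOf(base + value).substring(1);
    }

    public static void main(String[] args) {
        System.out.println(pad(724, 9));
        System.out.println(pad(724, 12));
    }
}
```

The explicit range checks are a design choice for the sketch: the trick silently produces wrong output when Y has more than X digits, so failing loudly seems safer for folder names.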

So尛奶瓶 2024-09-11 18:17:31

I would stop doing optimizations based on micro-benchmarks and go for something that looks elegant codewise, such as String.format("%0"+numDigits+"d", testNumber)
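
As a rough sketch of that suggestion, the width can be spliced into the format string at runtime. The class and method names here are hypothetical, and passing `Locale.ROOT` is an added assumption so that the digits do not depend on the default locale of the machine.

```java
import java.util.Locale;

class FormatPad {
    // Hypothetical helper: zero-pad value to numDigits characters using String.format.
    static String pad(int value, int numDigits) {
        // "%0" + 9 + "d" becomes "%09d": zero flag, minimum width 9, decimal integer.
        return String.format(Locale.ROOT, "%0" + numDigits + "d", value);
    }

    public static void main(String[] args) {
        System.out.println(pad(724, 9));
        System.out.println(pad(-2745, 9));
    }
}
```

Note that the width counts the minus sign, so a negative number gets one fewer padding zero than a positive number with the same digits.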

梦醒时光 2024-09-11 18:17:31

Use String.format("%0[length]d", i)

For length of 8 it would be

String out = String.format("%08d", i);

It's slower, but the time spent typing and debugging the more complex code will probably exceed the total extra time ever used during execution.

In fact, if you add up all the man-hours already spent discussing this, it most likely exceeds the execution time savings by a large factor.

晚风撩人 2024-09-11 18:17:31

Inserting padding characters one by one is obviously slow. If performance is really that big a concern, you could instead use predefined string constants of lengths 1..n-1 (where n is the biggest expected length), stored in an ArrayList at the corresponding indexes.

If n is very big, at least you could still insert in bigger chunks instead of single chars.
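
A minimal sketch of the lookup idea, under some assumptions: it uses a plain String array rather than an ArrayList, picks an arbitrary maximum of 20 padding characters, and only handles non-negative numbers. All names are hypothetical.

```java
class TablePad {
    private static final int MAX_PADDING = 20; // assumed upper bound on padding length
    private static final String[] ZEROS = new String[MAX_PADDING + 1];

    static {
        // ZEROS[k] holds a string of exactly k zeros, built once at class load.
        StringBuilder sb = new StringBuilder(MAX_PADDING);
        for (int k = 0; k <= MAX_PADDING; k++) {
            ZEROS[k] = sb.toString();
            sb.append('0');
        }
    }

    // Hypothetical helper: prepend a precomputed run of zeros in a single concatenation.
    static String pad(int value, int numDigits) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values are not handled in this sketch");
        }
        String digits = Integer.toString(value);
        int missing = numDigits - digits.length();
        if (missing <= 0) {
            return digits; // already wide enough
        }
        if (missing > MAX_PADDING) {
            throw new IllegalArgumentException("padding longer than the precomputed table");
        }
        return ZEROS[missing] + digits;
    }

    public static void main(String[] args) {
        System.out.println(pad(724, 9));
    }
}
```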

But overall, as others pointed out too, optimization is only feasible if you have profiled your application under real circumstances and found which specific piece of code is the bottleneck. Then you can focus on that (and of course profile again to verify that your changes actually improve performance).

青丝拂面 2024-09-11 18:17:31

Here is a solution that is basically the same thing as your StringBuilder approach, with two optimizations:

  1. It writes directly to an array, bypassing the StringBuilder overhead.
  2. It does the operations in reverse instead of using insert(0), which requires an arraycopy each time.

It also assumes that numDigits will be >= the number of characters actually required, but it will properly handle negative numbers:

before = System.nanoTime();
String arrString = null;
for ( int j = 0; j < numIterations; j++ ) {
  char[] arrNum = new char[numDigits];
  int i = numDigits - 1;
  boolean neg = testNumber < 0;
  // Write the digits from right to left, least significant first
  for ( int tmp = neg ? -testNumber : testNumber; tmp > 0; tmp /= 10 ) {
    arrNum[i--] = (char)( '0' + tmp % 10 );
  }
  // Fill the remaining leading positions with zeros
  while ( i >= 0 ) {
    arrNum[i--] = '0';
  }
  // The sign overwrites the first padding zero
  if ( neg ) arrNum[0] = '-';
  arrString = new String( arrNum );
}
after = System.nanoTime();
System.out.println( "04: " + outputFormat.format( after - before ) + " nanoseconds" );
System.out.println( "Sanity check: Folder = \"" + arrString + "\"" );

This method well outperformed your samples on my machine for negatives and was comparable for positives:

01: 18,090,933 nanoseconds
Sanity check: Folder = "000000742"
02: 22,659,205 nanoseconds
Sanity check: Folder = "000000742"
03: 2,309,949 nanoseconds
Sanity check: Folder = "000000742"
04: 6,380,892 nanoseconds
Sanity check: Folder = "000000742"

01: 14,933,369 nanoseconds
Sanity check: Folder = "0000-2745"
02: 21,685,158 nanoseconds
Sanity check: Folder = "-000002745"
03: 3,213,270 nanoseconds
Sanity check: Folder = "99997255"
04: 1,255,660 nanoseconds
Sanity check: Folder = "-00002745"

Edit: I noticed your tests reused some of the objects within the iteration loop, which I had not done in mine (such as not recalculating baseNum in the substring version). When I altered the tests to be consistent (not reusing any objects / calculations), my version performed better than yours:

01: 18,377,935 nanoseconds
Sanity check: Folder = "000000742"
02: 69,443,911 nanoseconds
Sanity check: Folder = "000000742"
03: 6,410,263 nanoseconds
Sanity check: Folder = "000000742"
04: 996,622 nanoseconds
Sanity check: Folder = "000000742"

Of course, as others have mentioned, micro-benchmarking is incredibly difficult / "fudgy" given all of the optimizations performed by the VM and the inability to control them.
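
One common way to reduce (not eliminate) that noise is to run each candidate for a while before timing it, so that the JIT has had a chance to compile the code being measured. The following is only a rough sketch of that idea, not a rigorous benchmark; the class, method, and field names are hypothetical.

```java
import java.util.function.IntFunction;

class WarmupBench {
    // Storing results in a volatile field makes it harder for the JIT to discard the work.
    static volatile String sink;

    // Hypothetical helper: run an untimed warm-up pass, then time a second pass.
    static long timeNanos(IntFunction<String> padder, int iterations) {
        for (int i = 0; i < iterations; i++) {
            sink = padder.apply(i); // warm-up, not measured
        }
        long before = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink = padder.apply(i); // measured pass
        }
        return System.nanoTime() - before;
    }

    public static void main(String[] args) {
        long nanos = timeNanos(n -> String.format("%09d", n), 10_000);
        System.out.println(nanos + " nanoseconds, last result = " + sink);
    }
}
```

Even with a warm-up pass the numbers will vary from run to run, so they are best read as rough orders of magnitude.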

維他命╮ 2024-09-11 18:17:31

This probably related link discusses many of the ways to do it. I would recommend the Apache option, StringUtils. It may or may not be the absolute fastest, but it's usually one of the easiest to understand, and has had the )&##@ pounded out of it, so it probably won't break in some unforeseen edge case. ;)
