Unpacking 4-bit chunks from a Java int: optimization

Posted 2025-02-09 18:39:59

To unpack the first and second 4-bit chunks from an int, I use this approach:

int chunks = 0b1111_1110_1101_1100_1011_1010_1001_1000;
int mask = 0x0f << (Integer.SIZE - 4);
byte result = (byte) ((chunks & mask) >>> (Integer.SIZE - 4));
Integer.toBinaryString(result); //print "1111"

int chunks = 0b1111_1110_1101_1100_1011_1010_1001_1000;
int mask = 0x0f << (Integer.SIZE - 8);
byte result = (byte) ((chunks & mask) >>> (Integer.SIZE - 8));
Integer.toBinaryString(result); //print "1110"

It works well when the bit representation of chunks starts with a 1 bit.

When the number starts with 0000 instead:

int chunks = 0b0000_0110_1101_1100_1011_1010_1001_1000;
int mask = 0x0f << (Integer.SIZE - 4);
byte result = (byte) ((chunks & mask) >>> (Integer.SIZE - 4));
Integer.toBinaryString(result); //print "0"

int chunks = 0b0000_0110_1101_1100_1011_1010_1001_1000;
int mask = 0x0f << (Integer.SIZE - 8);
byte result = (byte) ((chunks & mask) >>> (Integer.SIZE - 8));
Integer.toBinaryString(result); //print "110"

It works great as well.

Is this the best approach from a performance perspective? I feel like I have overcomplicated it.

Comments (1)

方觉久 2025-02-16 18:39:59

These operations are so trivial that they are unlikely to have any impact on performance.

But if you want to dive into it:

  1. Arithmetic operations and local variables use int anyway, regardless of whether you declare byte, short, char, or int. So unless you are actually going to store the value in a heap variable of type byte, there is no advantage in declaring a local variable as byte. The associated type cast implies sign-extending the lowest eight bits to 32 bits, unless the JVM’s optimizer manages to eliminate it.

  2. There are no additional bits beyond the most significant bits. So when you shift the most significant bits into the lowest position, there is no need to mask them.

  3. For the second “4 bit chunk” you still need a mask, but you don’t need to shift the mask into the high position. Instead, you can first shift your bits down and then mask them using 0xf. Since the mask is a constant in either case, there is no impact on performance, at least once the JIT compiler has done its work. The bytecode will be smaller, however.
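The three points can be checked in isolation with a small sketch. The class and method names here (ChunkShiftSketch, castToByteAndBack, topChunk, secondChunk) are made up for illustration and are not part of the code discussed in this answer.

```java
final class ChunkShiftSketch {
    // Point 1: a cast to byte keeps the low eight bits; widening back to int sign-extends them.
    static int castToByteAndBack(int value) {
        byte b = (byte) value;
        return b;
    }

    // Point 2: an unsigned shift of the top four bits leaves nothing above them, so no mask is needed.
    static int topChunk(int value) {
        return value >>> 28;
    }

    // Point 3: shift down first, then mask with the small constant 0xf.
    static int secondChunk(int value) {
        return value >>> 24 & 0xf;
    }

    public static void main(String[] args) {
        int chunks = 0b1111_1110_1101_1100_1011_1010_1001_1000;
        System.out.println(castToByteAndBack(0xF0)); // -16: bit 7 is set, so the sign is extended
        System.out.println(castToByteAndBack(0x0F)); // 15: a four-bit value survives the cast unchanged
        System.out.println(topChunk(chunks));        // 15
        System.out.println(chunks >> 28);            // -1: a signed shift would still need a mask
        System.out.println(secondChunk(chunks));     // 14
    }
}
```

Since a 4-bit value never sets bit 7, the byte cast in the question does not change the result; it only adds a conversion step.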

So, consider the following code:

public static void main(String[] args) {
  int[] allChunks = {
    0b1111_1110_1101_1100_1011_1010_1001_1000,
    0b0000_0110_1101_1100_1011_1010_1001_1000
  };
  for(int chunks: allChunks) {
    if(firstFourBitChunkOld(chunks) != firstFourBitChunk(chunks))
      throw new AssertionError();
    if(secondFourBitChunkOld(chunks) != secondFourBitChunk(chunks))
      throw new AssertionError();

    System.out.println(Integer.toBinaryString(firstFourBitChunk(chunks)));
    System.out.println(Integer.toBinaryString(secondFourBitChunk(chunks)));
    System.out.println();
  }
}

static int firstFourBitChunk(int chunks) {
  return chunks >>> 28;
}

static int secondFourBitChunk(int chunks) {
  return chunks >>> 24 & 0xf;
}

private static final int MASK_FIRST_FOUR_BITS = 0x0f << (Integer.SIZE - 4);
static byte firstFourBitChunkOld(int chunks) {
  return (byte) ((chunks & MASK_FIRST_FOUR_BITS) >>> (Integer.SIZE - 4));
}

private static final int MASK_SECOND_FOUR_BITS = 0x0f << (Integer.SIZE - 8);
static byte secondFourBitChunkOld(int chunks) {
  return (byte) ((chunks & MASK_SECOND_FOUR_BITS) >>> (Integer.SIZE - 8));
}

It’s debatable whether, for example, Integer.SIZE - 4 is more readable than 28. I had to look up whether Integer.SIZE means “size in bits”, “size in bytes”, or something entirely different; the name doesn’t tell. I think that, generally, developers should know that Java’s int has 32 bits before performing bit manipulations. But since the expression Integer.SIZE - 4 is a compile-time constant, this choice has no impact on the compiled code at all.

The code above runs successfully, and we can also compare the resulting bytecode:
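For reference, a short sketch of what the two standard constants on java.lang.Integer hold. The class name IntegerSizeSketch and its two fields are hypothetical and exist only for this illustration.

```java
final class IntegerSizeSketch {
    // Both shift distances are compile-time constants with the same value.
    static final int SHIFT_FROM_SIZE = Integer.SIZE - 4;
    static final int SHIFT_LITERAL = 28;

    public static void main(String[] args) {
        System.out.println(Integer.SIZE);  // 32: size of an int in bits
        System.out.println(Integer.BYTES); // 4: size of an int in bytes (available since Java 8)
        System.out.println(SHIFT_FROM_SIZE == SHIFT_LITERAL); // true
    }
}
```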

  static int firstFourBitChunk(int);
       0: iload_0
       1: bipush        28
       3: iushr
       4: ireturn

  static int secondFourBitChunk(int);
       0: iload_0
       1: bipush        24
       3: iushr
       4: bipush        15
       6: iand
       7: ireturn

  static byte firstFourBitChunkOld(int);
       0: iload_0
       1: ldc           #8                  // int -268435456
       3: iand
       4: bipush        28
       6: iushr
       7: i2b
       8: ireturn

  static byte secondFourBitChunkOld(int);
       0: iload_0
       1: ldc           #10                 // int 251658240
       3: iand
       4: bipush        24
       6: iushr
       7: i2b
       8: ireturn

The i2b is the additional sign extension operation caused by the byte cast. Loading the shifted mask requires an ldc instruction, which loads a value from the constant pool. In this case, the constant itself takes another five bytes in the constant pool. Besides that, the two variants are equivalent.

As said, this will likely have no practical impact on actual performance in a normal, optimizing execution environment. However, I also consider the shorter variants more readable, at least for developers with the necessary understanding of bit arithmetic. There’s no way to make the code readable for an audience without that understanding anyway.
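If more than the first two chunks are needed, the same shift-then-mask idea should generalize. The following is only a sketch: fourBitChunk is a hypothetical helper, and numbering the chunks 0 to 7 from the most significant end is an assumption made here, not something the question specifies.

```java
final class FourBitChunks {
    private static final int CHUNK_BITS = 4;
    private static final int CHUNK_COUNT = Integer.SIZE / CHUNK_BITS; // 8 chunks per int

    // Returns the chunk at the given index (0 = most significant) as a value in 0..15.
    static int fourBitChunk(int chunks, int index) {
        if (index < 0 || index >= CHUNK_COUNT) {
            throw new IllegalArgumentException("index must be in 0..7: " + index);
        }
        int shift = Integer.SIZE - CHUNK_BITS * (index + 1); // 28 for index 0, 0 for index 7
        return chunks >>> shift & 0xf;
    }

    public static void main(String[] args) {
        int chunks = 0b0000_0110_1101_1100_1011_1010_1001_1000;
        for (int i = 0; i < CHUNK_COUNT; i++) {
            System.out.print(Integer.toBinaryString(fourBitChunk(chunks, i)) + " ");
        }
        System.out.println(); // the line reads: 0 110 1101 1100 1011 1010 1001 1000
    }
}
```

For index 0 the mask is redundant, as noted in point 2, but keeping it lets one expression cover all eight positions.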
