Bytecode features not available in the Java language

Posted on 2024-11-26 00:54:17

Are there currently (Java 6) things you can do in Java bytecode that you can't do from within the Java language?

I know both are Turing complete, so read "can do" as "can do significantly faster/better, or just in a different way".

I'm thinking of extra bytecodes like invokedynamic, which can't be generated using Java, except that specific one is for a future version.

Comments (9)

慕烟庭风 2024-12-03 00:54:17

After working with Java byte code for quite a while and doing some additional research on this matter, here is a summary of my findings:

Execute code in a constructor before calling a super constructor or auxiliary constructor

In the Java programming language (JPL), a constructor's first statement must be an invocation of a super constructor or another constructor of the same class. This is not true for Java byte code (JBC). Within byte code, it is absolutely legitimate to execute any code before the constructor call, as long as:

  • Another compatible constructor is called at some time after this code block.
  • This call is not within a conditional statement.
  • Before this constructor call, no field of the constructed instance is read and none of its methods is invoked. This implies the next item.

Set instance fields before calling a super constructor or auxiliary constructor

As mentioned before, it is perfectly legal to set a field value of an instance before calling another constructor. There even exists a legacy hack which makes it possible to exploit this "feature" in Java versions before 6:

class Foo {
  public String s;
  public Foo() {
    System.out.println(s);
  }
}

class Bar extends Foo {
  public Bar() {
    this(s = "Hello World!");
  }
  private Bar(String helper) {
    super();
  }
}

This way, a field could be set before the super constructor is invoked, which is however no longer possible. In JBC, this behavior can still be implemented.

Branch a super constructor call

In Java, it is not possible to define a constructor call like

class Foo {
  Foo() { }
  Foo(Void v) { }
}

class Bar extends Foo {
  Bar() {
    if (System.currentTimeMillis() % 2 == 0) {
      super();
    } else {
      super(null);
    }
  }
}

Until Java 7u23, the HotSpot VM's verifier did however miss this check, which is why it was possible. This was used by several code generation tools as a sort of hack, but it is no longer legal to implement a class like this.

The latter was merely a bug in this compiler version. In newer compiler versions, this is again possible.

Define a class without any constructor

The Java compiler will always implement at least one constructor for any class. In Java byte code, this is not required. This allows the creation of classes that cannot be constructed even when using reflection. However, using sun.misc.Unsafe still allows for the creation of such instances.
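The compiler side of this claim can be observed through reflection. The following is a minimal sketch; the class names `NoCtorDeclared` and `DefaultCtorDemo` are made up for illustration and do not come from the answer.

```java
import java.lang.reflect.Constructor;

// Hypothetical class that declares no constructor in its source code.
class NoCtorDeclared {
}

class DefaultCtorDemo {
    // Returns the number of constructors present in the compiled class.
    static int declaredConstructorCount() {
        Constructor<?>[] constructors = NoCtorDeclared.class.getDeclaredConstructors();
        return constructors.length;
    }

    public static void main(String[] args) {
        // javac adds an implicit no-argument constructor, so this prints 1.
        System.out.println(declaredConstructorCount());
    }
}
```

A hand-written class file could omit that `<init>` method entirely, in which case the count would be 0.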

Define methods with identical signature but with different return type

In the JPL, a method is identified as unique by its name and its raw parameter types. In JBC, the raw return type is additionally considered.
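javac itself relies on this when it compiles covariant return types: it emits a synthetic bridge method next to the overriding method, so the class file ends up with two methods that differ only in return type. A small sketch that makes this visible via reflection (all class names here are hypothetical):

```java
import java.lang.reflect.Method;

class BridgeBase {
    Object get() { return "base"; }
}

class BridgeDerived extends BridgeBase {
    @Override
    String get() { return "derived"; } // covariant return type
}

class BridgeMethodDemo {
    // Counts the no-argument methods named "get" declared directly in BridgeDerived.
    static int countGetMethods() {
        int count = 0;
        for (Method m : BridgeDerived.class.getDeclaredMethods()) {
            if (m.getName().equals("get") && m.getParameterCount() == 0) {
                count++;
            }
        }
        return count;
    }

    // Reports whether one of those methods is a compiler-generated bridge.
    static boolean hasBridge() {
        for (Method m : BridgeDerived.class.getDeclaredMethods()) {
            if (m.getName().equals("get") && m.isBridge()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        for (Method m : BridgeDerived.class.getDeclaredMethods()) {
            System.out.println(m.getReturnType().getSimpleName() + " " + m.getName() + "()");
        }
    }
}
```

In source code the two `get` methods could never be declared side by side; only the compiler (or a byte code tool) can produce that shape.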

Define fields that do not differ by name but only by type

A class file can contain several fields of the same name as long as they declare a different field type. The JVM always refers to a field as a tuple of name and type.

Throw undeclared checked exceptions without catching them

The Java runtime and the Java byte code are not aware of the concept of checked exceptions. It is only the Java compiler that verifies that checked exceptions are always either caught or declared if they are thrown.
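Because the check lives only in the compiler, it can also be sidestepped from Java source with the generics trick usually called a "sneaky throw". This is a sketch of that workaround rather than of hand-written byte code; the class and method names are illustrative.

```java
import java.io.IOException;

class SneakyThrowDemo {
    // The cast to E is erased at run time, so nothing is actually checked;
    // the compiler only sees a method that throws the (inferred) type E.
    @SuppressWarnings("unchecked")
    static <E extends Throwable> void sneakyThrow(Throwable t) throws E {
        throw (E) t;
    }

    // No throws clause, although a checked IOException escapes at run time.
    static void throwsUndeclared() {
        SneakyThrowDemo.<RuntimeException>sneakyThrow(new IOException("undeclared"));
    }

    // Returns the class of whatever throwsUndeclared() actually threw.
    static Class<?> thrownType() {
        try {
            throwsUndeclared();
            return null;
        } catch (Throwable t) {
            return t.getClass();
        }
    }

    public static void main(String[] args) {
        System.out.println(thrownType());
    }
}
```

The JVM executes the `athrow` instruction without consulting any `throws` declaration, which is why the `IOException` propagates unchanged.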

Use dynamic method invocation outside of lambda expressions

The so-called dynamic method invocation (invokedynamic) can be used for anything, not only for Java's lambda expressions. Using this feature makes it possible, for example, to switch out execution logic at runtime. Many dynamic programming languages that compile to JBC improved their performance by using this instruction. In Java byte code, you could also emulate lambda expressions in Java 7, where the compiler did not yet make any use of dynamic method invocation while the JVM already understood the instruction.
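Java source cannot emit an invokedynamic instruction directly, but the `java.lang.invoke` API that backs it is usable from plain Java. The sketch below uses a `MutableCallSite` to show the "switch out execution logic at runtime" idea; it approximates what an invokedynamic call site does and is not the instruction itself. The class and method names are made up.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

class CallSiteSwitchDemo {
    static String hello() { return "hello"; }
    static String goodbye() { return "goodbye"; }

    // Invokes the same call site twice, retargeting it in between.
    static String[] run() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(String.class);
            MethodHandle first = lookup.findStatic(CallSiteSwitchDemo.class, "hello", type);
            MethodHandle second = lookup.findStatic(CallSiteSwitchDemo.class, "goodbye", type);

            MutableCallSite site = new MutableCallSite(first);
            // The invoker always delegates to the call site's current target.
            MethodHandle invoker = site.dynamicInvoker();

            String before = (String) invoker.invokeExact();
            site.setTarget(second); // swap the logic behind the call site
            String after = (String) invoker.invokeExact();
            return new String[] { before, after };
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        String[] results = run();
        System.out.println(results[0] + " -> " + results[1]);
    }
}
```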

Use identifiers that are not normally considered legal

Ever fancied using spaces and a line break in your method's name? Create your own JBC and good luck with code review. The only illegal characters for identifiers are ., ;, [ and /. Additionally, methods that are not named <init> or <clinit> cannot contain < and >.

Reassign final parameters or the this reference

final parameters do not exist in JBC and can consequently be reassigned. Any parameter, including the this reference, is only stored in a simple array within the JVM, which allows reassigning the this reference at index 0 within a single method frame.

Reassign final fields

As long as a final field is assigned within a constructor, it is legal to reassign this value or even not assign a value at all. Therefore, the following two constructors are legal:

class Foo {
  final int bar;
  Foo() { } // bar == 0
  Foo(Void v) { // bar == 2
    bar = 1;
    bar = 2;
  }
}

For static final fields, it is even allowed to reassign the fields outside of
the class initializer.

Treat constructors and the class initializer as if they were methods

This is more of a conceptual feature, but constructors are not treated any differently within JBC than normal methods. It is only the JVM's verifier that assures that constructors call another legal constructor. Other than that, it is merely a Java naming convention that constructors must be called <init> and that the class initializer is called <clinit>. Besides this difference, the representation of methods and constructors is identical. As Holger pointed out in a comment, you can even define constructors with return types other than void or a class initializer with arguments, even though it is not possible to call these methods.
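The method names `<init>` and `<clinit>` are visible from ordinary Java through stack traces, which report the byte code level method name of each frame. A minimal sketch (the class name `InitNameDemo` is hypothetical):

```java
class InitNameDemo {
    static final String CLASS_INIT_NAME;
    final String constructorName;

    static {
        // Frame 0 of a fresh Throwable is the method that created it.
        CLASS_INIT_NAME = new Throwable().getStackTrace()[0].getMethodName();
    }

    InitNameDemo() {
        constructorName = new Throwable().getStackTrace()[0].getMethodName();
    }

    public static void main(String[] args) {
        System.out.println(CLASS_INIT_NAME);                    // <clinit>
        System.out.println(new InitNameDemo().constructorName); // <init>
    }
}
```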

Create asymmetric records

When creating a record

record Foo(Object bar) { }

javac will generate a class file with a single field named bar, an accessor method named bar() and a constructor taking a single Object. Additionally, a record attribute for bar is added. By manually generating a record, it is possible to create a different constructor shape, to skip the field, and to implement the accessor differently. At the same time, it is still possible to make the reflection API believe that the class represents an actual record.
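The symmetric shape that javac produces can be inspected with the record reflection API (Java 16 or later). The following sketch uses a hypothetical record `SampleRecord` that mirrors `Foo` above:

```java
import java.lang.reflect.RecordComponent;

record SampleRecord(Object bar) { }

class RecordShapeDemo {
    // Summarizes what the reflection API reports for the compiled record.
    static String describe() {
        RecordComponent component = SampleRecord.class.getRecordComponents()[0];
        return SampleRecord.class.isRecord()
                + " " + component.getName()
                + " " + component.getAccessor().getName()
                + " " + SampleRecord.class.getDeclaredConstructors().length;
    }

    public static void main(String[] args) {
        // isRecord, component name, accessor name, number of constructors
        System.out.println(describe());
    }
}
```

For a javac-compiled record the component, accessor and canonical constructor always line up; a manually generated class file is not bound to keep them consistent.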

Call any super method (until Java 1.1)

This is only possible for Java versions 1 and 1.1. In JBC, methods are always dispatched on an explicit target type. This means that for

class Foo {
  void baz() { System.out.println("Foo"); }
}

class Bar extends Foo {
  @Override
  void baz() { System.out.println("Bar"); }
}

class Qux extends Bar {
  @Override
  void baz() { System.out.println("Qux"); }
}

it was possible to implement Qux#baz to invoke Foo#baz while jumping over Bar#baz. While it is still possible to define an explicit invocation to call a super method implementation other than that of the direct super class, this no longer has any effect in Java versions after 1.1. In Java 1.1, this behavior was controlled by setting the ACC_SUPER flag, which would enable the same behavior that only calls the direct super class's implementation.

Define a non-virtual call of a method that is declared in the same class

In Java, consider the following classes:

class Foo {
  void foo() {
    bar();
  }
  void bar() { }
}

class Bar extends Foo {
  @Override void bar() {
    throw new RuntimeException();
  }
}

The above code will always result in a RuntimeException when foo is invoked on an instance of Bar. It is not possible to define the Foo::foo method to invoke its own bar method which is defined in Foo. As bar is a non-private instance method, the call is always virtual. With byte code, one can however define the invocation to use the INVOKESPECIAL opcode which directly links the bar method call in Foo::foo to Foo's version. This opcode is normally used to implement super method invocations but you can reuse the opcode to implement the described behavior.
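Java syntax offers no such call, but the method handle API exposes the same invokespecial linkage at run time through `MethodHandles.Lookup.findSpecial`. The sketch below is an approximation of the byte code trick under that assumption, with hypothetical class names standing in for `Foo` and `Bar`:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class NonVirtualBase {
    String foo() {
        try {
            // Link bar() as if by invokespecial from within NonVirtualBase itself.
            MethodHandle ownBar = MethodHandles.lookup().findSpecial(
                    NonVirtualBase.class, "bar",
                    MethodType.methodType(String.class), NonVirtualBase.class);
            return (String) ownBar.invoke(this);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    String bar() { return "NonVirtualBase.bar"; }
}

class NonVirtualSub extends NonVirtualBase {
    @Override
    String bar() { return "NonVirtualSub.bar"; }
}

class NonVirtualCallDemo {
    public static void main(String[] args) {
        // The override in NonVirtualSub is bypassed by the special invocation.
        System.out.println(new NonVirtualSub().foo());
    }
}
```

A plain `bar()` call inside `foo()` would compile to invokevirtual and dispatch to the subclass override instead.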

Fine-grain type annotations

In Java, annotations are applied according to the @Target that the annotation declares. Using byte code manipulation, it is possible to define annotations independently of this control. It is also possible, for example, to annotate a parameter type without annotating the parameter, even if the @Target annotation applies to both elements.

Define any attribute for a type or its members

Within the Java language, it is only possible to define annotations for fields, methods or classes. In JBC, you can basically embed any information into the Java classes. In order to make use of this information, you can however no longer rely on the Java class loading mechanism but you need to extract the meta information by yourself.

Overflow and implicitly assign byte, short, char and boolean values

The latter primitive types are not normally known in JBC but are only defined for array types or for field and method descriptors. Within byte code instructions, all of the named types occupy 32 bits, which allows representing them as int. Officially, only the int, float, long and double types exist within byte code, all of which need explicit conversion under the rules of the JVM's verifier.
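The int-based treatment shows through in Java source as well: arithmetic on byte operands yields an int, and javac only narrows the result when a cast asks for it. A small sketch with made-up names:

```java
class IntBackedPrimitivesDemo {
    // The operands are widened to int; the addition is an int addition.
    static int sumAsInt() {
        byte a = 100;
        byte b = 100;
        return a + b;
    }

    // The explicit cast compiles to a narrowing conversion that wraps the value.
    static byte sumNarrowed() {
        byte a = 100;
        byte b = 100;
        return (byte) (a + b);
    }

    public static void main(String[] args) {
        System.out.println(sumAsInt());    // 200
        System.out.println(sumNarrowed()); // -56
    }
}
```

Hand-written byte code can leave out the narrowing instruction, which is how an out-of-range int can end up being treated as a byte, short, char or boolean value.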

Not release a monitor

A synchronized block is actually made up of two statements, one to acquire and one to release a monitor. In JBC, you can acquire one without releasing it.

Note: In recent implementations of HotSpot, this instead leads to an IllegalMonitorStateException at the end of a method or to an implicit release if the method is terminated by an exception itself.

Add more than one return statement to a type initializer

In Java, even a trivial type initializer such as

class Foo {
  static {
    return;
  }
}

is illegal. In byte code, the type initializer is treated just as any other method, i.e. return statements can be defined anywhere.

Create irreducible loops

The Java compiler converts loops to goto statements in Java byte code. Such statements can be used to create irreducible loops, which the Java compiler never does.

Define a recursive catch block

In Java byte code, you can define a block:

try {
  throw new Exception();
} catch (Exception e) {
  <goto on exception>
  throw Exception();
}

A similar statement is created implicitly when using a synchronized block in Java, where any exception thrown while releasing a monitor returns to the instruction for releasing this monitor. Normally, no exception should occur on such an instruction, but if one did (e.g. the deprecated ThreadDeath), the monitor would still be released.

Call any default method

The Java compiler requires several conditions to be fulfilled in order to allow a default method's invocation:

  1. The method must be the most specific one (must not be overridden by a sub interface that is implemented by any type, including super types).
  2. The default method's interface type must be implemented directly by the class that is calling the default method. However, if interface B extends interface A but does not override a method in A, the method can still be invoked.

For Java byte code, only the second condition counts. The first one is however irrelevant.
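For comparison, this is the form that the Java compiler does accept, including the case where the directly implemented interface merely inherits the default method. The interface and class names are hypothetical.

```java
interface Greeter {
    default String greet() { return "Greeter.greet"; }
}

// Extends Greeter without overriding greet().
interface PoliteGreeter extends Greeter {
}

class DefaultCallDemo implements PoliteGreeter {
    @Override
    public String greet() { return "DefaultCallDemo.greet"; }

    // Legal: PoliteGreeter is a direct superinterface and inherits greet().
    String callInherited() {
        return PoliteGreeter.super.greet();
    }

    public static void main(String[] args) {
        DefaultCallDemo demo = new DefaultCallDemo();
        System.out.println(demo.greet());
        System.out.println(demo.callInherited());
    }
}
```

Writing `Greeter.super.greet()` in `DefaultCallDemo` is rejected by javac because `Greeter` is not a direct superinterface of the class.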

Invoke a super method on an instance that is not this

The Java compiler only allows invoking a super (or interface default) method on this. In byte code, it is however also possible to invoke the super method on another instance of the same type, similar to the following:

class Foo {
  void m(Foo f) {
    f.super.toString(); // calls Object::toString
  }
  public String toString() {
    return "foo";
  }
}

Access synthetic members

In Java byte code, it is possible to access synthetic members directly. For example, consider how in the following example the outer instance of another Bar instance is accessed:

class Foo {
  class Bar { 
    void bar(Bar bar) {
      Foo foo = bar.Foo.this;
    }
  }
}

This is generally true for any synthetic field, class or method.
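Synthetic members cannot be named in source code, but reflection shows that they exist. The sketch below looks for the compiler-generated field that holds the outer instance of an inner class; the class names are made up, and the inner class deliberately uses the outer instance so that the compiler has to keep the reference.

```java
import java.lang.reflect.Field;

class SyntheticOuter {
    int value = 42;

    class Inner {
        // Reading an outer field forces the inner class to keep its outer reference.
        int read() { return value; }
    }
}

class SyntheticFieldDemo {
    // Reports whether Inner has a synthetic field of the outer class's type.
    static boolean hasSyntheticOuterField() {
        for (Field f : SyntheticOuter.Inner.class.getDeclaredFields()) {
            if (f.isSynthetic() && f.getType() == SyntheticOuter.class) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        for (Field f : SyntheticOuter.Inner.class.getDeclaredFields()) {
            System.out.println(f.getName() + " synthetic=" + f.isSynthetic());
        }
    }
}
```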

Define out-of-sync generic type information

While the Java runtime does not process generic types (after the Java compiler applies type erasure), this information is still attached to a compiled class as meta information and made accessible via the reflection API.

The verifier does not check the consistency of these String-encoded meta data values. It is therefore possible to define information on generic types that does not match the erasure. As a consequence, the following assertions can be true:

Method method = ...
assertTrue(!Arrays.equals(method.getParameterTypes(), method.getGenericParameterTypes()));

Field field = ...
assertTrue(field.getType() == String.class);
assertTrue(field.getGenericType() == Integer.class);

Also, the signature can be defined as invalid such that a runtime exception is thrown. This exception is thrown when the information is accessed for the first time as it is evaluated lazily. (Similar to annotation values with an error.)
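For a class compiled by javac the two views are always consistent, which the following sketch illustrates with a hypothetical `GenericHolder` class: the erased type comes from the field descriptor, the generic type from the separately stored signature.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

class GenericHolder {
    List<String> names;
}

class GenericMetadataDemo {
    private static Field field() {
        try {
            return GenericHolder.class.getDeclaredField("names");
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    // The erased type, taken from the field descriptor.
    static Class<?> erasedType() {
        return field().getType();
    }

    // The type argument, parsed lazily from the generic signature.
    static Type typeArgument() {
        ParameterizedType generic = (ParameterizedType) field().getGenericType();
        return generic.getActualTypeArguments()[0];
    }

    public static void main(String[] args) {
        System.out.println(erasedType());
        System.out.println(typeArgument());
    }
}
```

Since the signature is just a string in the class file, a byte code tool can write one that disagrees with the descriptor, and reflection will report both without complaint.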

Append parameter meta information only for certain methods

The Java compiler allows for embedding parameter name and modifier information when compiling a class with the -parameters flag enabled. In the Java class file format, this information is however stored per method, which makes it possible to embed such information only for certain methods.
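Whether this information is present can be queried per method with `Parameter.isNamePresent()`. The result of the sketch below depends on how it was compiled, so no fixed output is claimed; the class and method names are illustrative.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;

class ParameterNameDemo {
    void greet(String recipient) { }

    // Returns "<name present>:<reported name>" for the single parameter of greet.
    static String describe() {
        try {
            Method m = ParameterNameDemo.class.getDeclaredMethod("greet", String.class);
            Parameter p = m.getParameters()[0];
            return p.isNamePresent() + ":" + p.getName();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // With javac -parameters the real name is reported;
        // without it, reflection falls back to a synthesized name such as arg0.
        System.out.println(describe());
    }
}
```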

Mess things up and hard-crash your JVM

As an example, in Java byte code, you can define an invocation of any method on any type. Usually, the verifier will complain if a type does not know of such a method. However, if you invoke an unknown method on an array, I found a bug in some JVM version where the verifier will miss this and your JVM will crash once the instruction is invoked. This is hardly a feature though, but it is technically something that is not possible with javac compiled Java. Java has some sort of double validation. The first validation is applied by the Java compiler, the second one by the JVM when a class is loaded. By skipping the compiler, you might find a weak spot in the verifier's validation. This is rather a general statement than a feature, though.

Annotate a constructor's receiver type when there is no outer class

Since Java 8, non-static methods and constructors of inner classes can declare a receiver type and annotate these types. Constructors of top-level classes cannot annotate their receiver type as they must not declare one.

class Foo {
  class Bar {
    Bar(@TypeAnnotation Foo Foo.this) { }
  }
  Foo() { } // Must not declare a receiver type
}

Since Foo.class.getDeclaredConstructor().getAnnotatedReceiverType() does however return an AnnotatedType representing Foo, it is possible to include type annotations for Foo's constructor directly in the class file where these annotations are later read by the reflection API.

Use unused / legacy byte code instructions

Since others named it, I will include it as well. Java formerly made use of subroutines via the JSR and RET statements. JBC even knew its own type of a return address for this purpose. However, the use of subroutines did overcomplicate static code analysis, which is why these instructions are no longer used. Instead, the Java compiler will duplicate code it compiles. However, this basically creates identical logic, which is why I do not really consider it to achieve something different. Similarly, you could for example add the NOP byte code instruction, which is not used by the Java compiler either, but this would not really allow you to achieve something new either. As pointed out in the context, these mentioned "feature instructions" are now removed from the set of legal opcodes, which renders them even less of a feature.

笙痞 2024-12-03 00:54:17

As far as I know there are no major features in the bytecodes supported by Java 6 that are not also accessible from Java source code. The main reason for this is obviously that the Java bytecode was designed with the Java language in mind.

There are some features that are not produced by modern Java compilers, however:

  • The ACC_SUPER flag:

    This is a flag that can be set on a class and specifies how a specific corner case of the invokespecial bytecode is handled for this class. It is set by all modern Java compilers (where "modern" is >= Java 1.1, if I remember correctly) and only ancient Java compilers produced class files where this was un-set. This flag exists only for backwards-compatibility reasons. Note that starting with Java 7u51, ACC_SUPER is ignored completely due to security reasons.

  • The jsr/ret bytecodes.

    These bytecodes were used to implement sub-routines (mostly for implementing finally blocks). They are no longer produced since Java 6. The reason for their deprecation is that they complicate static verification a lot for no great gain (i.e. code that uses them can almost always be re-implemented with normal jumps with very little overhead).

  • Having two methods in a class that only differ in return type.

    The Java language specification does not allow two methods in the same class when they differ only in their return type (i.e. same name, same argument list, ...). The JVM specification, however, has no such restriction, so a class file can contain two such methods; there's just no way to produce such a class file using the normal Java compiler. There's a nice example/explanation in this answer.

一刻暧昧 2024-12-03 00:54:17

Here are some features that can be done in Java bytecode but not in Java source code:

  • Throwing a checked exception from a method without declaring that the method throws it. The distinction between checked and unchecked exceptions is enforced only by the Java compiler, not by the JVM. Because of this, Scala for example can throw checked exceptions from methods without declaring them. With Java generics, though, there is a workaround called sneaky throw.

  • Having two methods in a class that only differ in return type, as already mentioned in Joachim's answer: The Java language specification does not allow two methods in the same class when they differ only in their return type (i.e. same name, same argument list, ...). The JVM specification, however, has no such restriction, so a class file can contain two such methods; there's just no way to produce such a class file using the normal Java compiler. There's a nice example/explanation in this answer.

若言繁花未落 2024-12-03 00:54:17
  • GOTO can be used with labels to create your own control structures (other than for, while, etc.)
  • You can override the this local variable inside a method
  • Combining both of these, you can create tail call optimised bytecode (I do this in JCompilo)
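What such a rewrite achieves can be sketched at the source level: the tail call is replaced by reassigning the parameters and jumping back to the start of the method. This is an illustration of the general technique under made-up names, not a description of how JCompilo implements it.

```java
class TailCallSketch {
    // Tail-recursive form: every call adds a new stack frame on the JVM.
    static long sumRecursive(long n, long acc) {
        return n == 0 ? acc : sumRecursive(n - 1, acc + n);
    }

    // Equivalent shape after the rewrite: overwrite the parameter slots
    // and jump back to the top instead of invoking the method again.
    static long sumLoop(long n, long acc) {
        while (true) {
            if (n == 0) {
                return acc;
            }
            acc = acc + n;
            n = n - 1;
        }
    }

    public static void main(String[] args) {
        System.out.println(sumRecursive(1_000, 0));  // small input only
        System.out.println(sumLoop(1_000_000, 0));   // constant stack depth
    }
}
```

For an instance method the same rewrite also needs to replace the receiver, which is where overwriting the this local variable comes in.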

As a related point, you can get parameter names for methods if compiled with debug information (Paranamer does this by reading the bytecode).

苦妄 2024-12-03 00:54:17

Maybe section 7A in this document is of interest, although it's about bytecode pitfalls rather than bytecode features.

撧情箌佬 2024-12-03 00:54:17

In the Java language, the first statement in a constructor must be a call to the super class constructor. Bytecode does not have this limitation; instead, the rule is that the super class constructor or another constructor in the same class must be called for the object before accessing the members. This should allow more freedom, such as:

  • Create an instance of another object, store it in a local variable (or stack) and pass it as a parameter to super class constructor while still keeping the reference in that variable for other use.
  • Call different other constructors based on a condition. This should be possible: How to call a different constructor conditionally in Java?

I have not tested these, so please correct me if I'm wrong.

呢古 2024-12-03 00:54:17

Something you can do with byte code, rather than plain Java code, is generate code which can be loaded and run without a compiler. Many systems have a JRE rather than a JDK, and if you want to generate code dynamically it may be better, if not easier, to generate byte code instead of Java code that has to be compiled before it can be used.

死开点丶别碍眼 2024-12-03 00:54:17

I wrote a bytecode optimizer when I was at I-Play (it was designed to reduce the code size for J2ME applications). One feature I added was the ability to use inline bytecode (similar to inline assembly language in C++). I managed to reduce the size of a function that was part of a library method by using the DUP instruction, since I needed the value twice. I also had zero byte instructions: if you are calling a method that takes a char and you want to pass an int that you know does not need to be cast, I added int2char(var) to replace (char) var, and it would remove the i2c instruction to reduce the size of the code. I also made it do float a = 2.3; float b = 3.4; float c = a + b; and that would be converted to fixed point (faster, and also some J2ME devices did not support floating point).

沫雨熙 2024-12-03 00:54:17

In Java, if you attempt to override a public method with a protected method (or any other reduction in access), you get an error: "attempting to assign weaker access privileges". If you do it with JVM bytecode, the verifier is fine with it, and you can call these methods via the parent class as if they were public.
