Why is Java faster when using a JIT than when compiled to machine code?
I have heard that Java must use a JIT to be fast. This makes perfect sense when comparing to interpretation, but why can't someone make an ahead-of-time compiler that generates fast Java code? I know about gcj, but I don't think its output is typically faster than HotSpot, for example.
Are there things about the language that make this difficult? I think it comes down to just these things:
- Reflection
- Classloading
What am I missing? If I avoid these features, would it be possible to compile Java code once to native machine code and be done?
A JIT compiler can be faster because the machine code is being generated on the exact machine that it will also execute on. This means that the JIT has the best possible information available to it to emit optimized code.
If you pre-compile bytecode into machine code, the compiler cannot optimize for the target machine(s), only the build machine.
I will paste an interesting answer given by James Gosling in the book Masterminds of Programming.
The real killer for any AOT compiler is:
This means that you cannot write an AOT compiler which covers ALL Java programs, as some information about the characteristics of the program is available only at runtime. You can, however, do it on a subset of Java, which is what I believe gcj does.
Another typical example is the ability of a JIT to inline methods like getX() directly into the calling methods if it finds that it is safe to do so, and to undo the inlining if appropriate, even when the programmer has not explicitly helped by declaring the method final. The JIT can see that in the running program a given method is not overridden and can therefore, in this instance, be treated as final. This might be different in the next invocation.
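To make the getX() case concrete, here is a minimal sketch. The names Point, GetterInlineDemo and sumX are made up for illustration, and whether a given JVM actually inlines the call depends on the runtime and its settings; the code only shows the shape of a call site where that optimization could apply.

```java
// Hypothetical example class; Point and getX() are illustrative names only.
class Point {
    private final int x;

    Point(int x) {
        this.x = x;
    }

    // Not declared final, so in principle a subclass could override it.
    int getX() {
        return x;
    }
}

class GetterInlineDemo {
    // A hot loop: once this method has run often enough, a JIT may compile it
    // and, if no loaded class overrides getX(), treat the call as a plain
    // field read. This is a possibility, not something the code can observe.
    static long sumX(Point[] points) {
        long sum = 0;
        for (Point p : points) {
            sum += p.getX();
        }
        return sum;
    }

    public static void main(String[] args) {
        Point[] points = new Point[1_000];
        for (int i = 0; i < points.length; i++) {
            points[i] = new Point(i);
        }
        long total = 0;
        // Repeated calls give the runtime a chance to treat sumX as hot.
        for (int round = 0; round < 10_000; round++) {
            total += sumX(points);
        }
        System.out.println(total);
    }
}
```

If a subclass of Point overriding getX() were loaded later, the runtime would have to drop the assumption and fall back to a real virtual call, which is the "undoing" mentioned above.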
Edit 2019: Oracle has introduced GraalVM which allows AOT compilation on a subset of Java (a quite large one, but still a subset) with the primary requirement that all code is available at compile time. This allows for millisecond startup time of web containers.
Java's JIT compiler is also lazy and adaptive.
Lazy
Being lazy it only compiles methods when it gets to them instead of compiling the whole program (very useful if you don't use part of a program). Class loading actually helps make the JIT faster by allowing it to ignore classes it hasn't come across yet.
Adaptive
Being adaptive, it emits a quick and dirty version of the machine code first and only goes back and does a thorough job if that method is used frequently.
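The lazy and adaptive behaviour can be pictured with a toy model. This is only a sketch: the class LazyAdaptiveSketch, the tier names and the threshold value are assumptions made for illustration and do not reflect how HotSpot is actually implemented.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: tier names and HOT_THRESHOLD are made up for illustration.
class LazyAdaptiveSketch {
    enum Tier { NOT_COMPILED, QUICK, OPTIMIZED }

    static final int HOT_THRESHOLD = 3; // assumed value, not a real JVM default

    private final Map<String, Integer> calls = new HashMap<>();

    // Record one call and report the tier the method is in afterwards.
    Tier invoke(String method) {
        calls.merge(method, 1, Integer::sum);
        return tierOf(method);
    }

    // Lazy: a method that was never called stays NOT_COMPILED.
    // Adaptive: a called method starts as QUICK and becomes OPTIMIZED
    // once its call count reaches HOT_THRESHOLD.
    Tier tierOf(String method) {
        int n = calls.getOrDefault(method, 0);
        if (n == 0) {
            return Tier.NOT_COMPILED;
        }
        return n >= HOT_THRESHOLD ? Tier.OPTIMIZED : Tier.QUICK;
    }

    public static void main(String[] args) {
        LazyAdaptiveSketch vm = new LazyAdaptiveSketch();
        for (int i = 1; i <= 4; i++) {
            System.out.println("call " + i + " -> " + vm.invoke("hotMethod"));
        }
        System.out.println("coldMethod -> " + vm.tierOf("coldMethod"));
    }
}
```

The per-call counter in invoke is also the bookkeeping cost that critics of this approach point to.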
In the end it boils down to the fact that having more information enables better optimizations. In this case, the JIT has more information about the actual machine the code is running on (as Andrew mentioned) and it also has a lot of runtime information that is not available during compilation.
In theory, a JIT compiler has an advantage over AOT if it has enough time and computational resources available. For instance, if you have an enterprise app running for days and months on a multiprocessor server with plenty of RAM, the JIT compiler can produce better code than any AOT compiler.
Now, if you have a desktop app, things like fast startup and initial response time (where AOT shines) become more important, plus the computer may not have sufficient resources for the most advanced optimizations.
And if you have an embedded system with scarce resources, JIT has no chance against AOT.
However, the above was all theory. In practice, creating such an advanced JIT compiler is way more complicated than a decent AOT one. How about some practical evidence?
Java's ability to inline across virtual method boundaries and perform efficient interface dispatch requires runtime analysis before compiling - in other words it requires a JIT. Since all methods are virtual and interfaces are used "everywhere", it makes a big difference.
JITs can identify and eliminate some conditions which can only be known at runtime. A prime example is the elimination of virtual calls that modern VMs perform: when the JVM finds an invokevirtual or invokeinterface instruction, and only one class overriding the invoked method has been loaded, the VM can actually make that virtual call static and is thus able to inline it. To a C program, on the other hand, a function pointer is always a function pointer, and a call to it can't be inlined (in the general case, anyway).
Here's a situation where the JVM is able to inline a virtual call:
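A minimal sketch of such a situation, assuming B is a subclass of A that overrides doIt(). The class bodies, the return values and the holder class VirtualCallDemo are hypothetical and only serve to make the example runnable.

```java
// Assumed shape: B extends A and overrides doIt(); the bodies are illustrative.
class A {
    String doIt() {
        return "A.doIt";
    }
}

class B extends A {
    @Override
    String doIt() {
        return "B.doIt";
    }
}

class VirtualCallDemo {
    static boolean someCondition = true;

    static String run() {
        // While someCondition stays true, only instances of A are created here.
        A obj = someCondition ? new A() : new B();
        // This is a virtual call (invokevirtual in the bytecode). A JIT that
        // only ever sees A as the receiver may bind it to A.doIt and inline it.
        return obj.doIt();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```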
Assuming we don't go around creating A or B instances elsewhere and that someCondition is set to true, the JVM knows that the call to doIt() always means A.doIt, and can therefore avoid the method table lookup and then inline the call. A similar construct in a non-JITted environment would not be inlinable.
I think the fact that the official Java compiler is a JIT compiler is a large part of this. How much time has been spent optimizing the JVM vs. a machine code compiler for Java?
Dimitry Leskov is absolutely right here.
All of the above is just theory about what could make a JIT faster; implementing every scenario is almost impossible. Besides, since there are only a handful of different instruction sets on x86_64 CPUs, there is very little to gain by targeting every instruction set on the current CPU. I always go by the rule of targeting x86_64 and SSE4.2 when building performance-critical applications in native code. Java's fundamental structure causes a ton of limitations. JNI can help you show just how inefficient it is, and the JIT only sugarcoats this by making it faster overall. Besides the fact that every function is virtual by default, Java also uses class types at runtime, as opposed to, for example, C++. C++ has a great advantage here when it comes to performance, because no class object needs to be loaded at runtime; it's all blocks of data that get allocated in memory and only initialized when requested. In other words, C++ doesn't have class types at runtime. Java classes are actual objects, not just templates. I'm not going to go into GC because that's irrelevant. Java strings are also slower because they use dynamic string pooling, which would require the runtime to search the pool table each time. Many of these things are due to the fact that Java wasn't originally built to be fast, so its foundation will always be slow. Most native languages (primarily C/C++) were specifically built to be lean and mean, with no waste of memory or resources. The first few versions of Java were in fact terribly slow and wasteful of memory, with lots of unnecessary metadata for variables and whatnot. As things stand today, the idea that a JIT can produce faster code than AOT-compiled languages will remain a theory.
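For anyone who wants to poke at the string pooling point, here is a small sketch of what the pool does at the language level. The class StringPoolDemo and the helper sameObject are made up for illustration; the code shows object identity only and says nothing about how fast any of this is.

```java
// Hypothetical helper class for looking at string pooling by object identity.
class StringPoolDemo {
    // True only when both references point to the very same object.
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal1 = "jit";
        String literal2 = "jit";
        String built = new String("jit"); // explicitly creates a new object

        // Identical literals refer to the same pooled String instance.
        System.out.println(sameObject(literal1, literal2));       // true
        // A String created with new is a distinct object with equal content.
        System.out.println(sameObject(literal1, built));          // false
        // intern() returns the pooled instance for that content.
        System.out.println(sameObject(literal1, built.intern())); // true
    }
}
```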
Think about all the work the JIT has to do to support lazy compilation: incrementing a counter each time a function is called, checking how many times it has been called, and so on and so forth. Running the JIT takes a lot of time. The trade-off, in my eyes, is not worth it. And this is just on a PC.
Ever tried to run Java on a Raspberry Pi or other embedded devices? Absolutely terrible performance. JavaFX on a Raspberry Pi? Not even functional...
Java and its JIT are very far from delivering everything that is advertised, and from the theory people blindly spew out about them.