How can I convince the JVM to inline an interface method?

Posted on 2024-12-06 20:09:04

I have a class hierarchy rooted in an interface and implemented with an abstract base class. It looks something like this:

interface Shape {
  boolean checkFlag();
}

abstract class AbstractShape implements Shape {
  private boolean flag = false;

  protected AbstractShape() { /* compute flag value */ }

  public final boolean checkFlag() { return flag; }
}

interface HasSides extends Shape { 
  int numberOfSides();
}

interface HasFiniteArea extends Shape { 
  double area();
}

class Square extends AbstractShape implements HasSides, HasFiniteArea {
  // numberOfSides() and area() implementations omitted
}

class Circle extends AbstractShape implements HasFiniteArea {
  // area() implementation omitted
}

/* etc. */

When I sample the running code with VisualVM, it appears that AbstractShape.checkFlag() is never inlined and consumes 14% of total program running time, which is obscene for a method this simple, even for a method called so frequently.

I have marked the method final on the base class, and (currently) all classes implementing the "Shape" interface extend AbstractShape.

Am I interpreting the VisualVM sample results correctly? Is there any way to convince the JVM to inline this method, or would I need to tear out the interfaces and just use an abstract base class? (I would prefer not to, because the hierarchy includes interfaces like HasFiniteArea and HasSides, which means the hierarchy does not have a perfect tree form.)

EDIT: to be clear, this is a method that in any universe should be inlined. It is called more than 420 million times during a 2 minute execution and, because it is not inlined and remains a virtual call, it accounts for 14% of runtime. The question I'm asking is: what is preventing the JVM from inlining this method, and how do I fix it?

Comments (4)

望她远 2024-12-13 20:09:04

Here is a quote from Wikipedia:

A common misconception is that declaring a class or method final
improves efficiency by allowing the compiler to directly insert the
method inline wherever it is called. This is not completely true; the
compiler is unable to do this because the classes are loaded at
runtime and might not be the same version as the ones that were just
compiled. Further, the runtime environment and JIT compiler have the
information about exactly which classes have been loaded, and are able
to make better decisions about when to inline, whether or not the
method is final.

See also this article.

情话已封尘 2024-12-13 20:09:04

The default compile threshold for the server JVM is 10000 (set with -XX:CompileThreshold=). This means a method or a loop has to be executed at least 10000 times before it is compiled to native code.

After it has been compiled it can be inlined; however, the call stack does not reveal this. The JVM is smart enough to know that the inlined code came from another method, so the inlined method still appears as its own frame and you never see a truncated call stack.
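Since the call stack looks the same either way, one way to check what the JIT actually did is to ask HotSpot to log its inlining decisions. The following is a minimal sketch, assuming a HotSpot JVM that supports the diagnostic flags shown in the comment; the class name InlineCheck and the constructors that take the flag value are illustrative and not from the question.

```java
// File: InlineCheck.java
// Run on HotSpot (diagnostic flags; the log format varies between JVM versions):
//   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining InlineCheck
public class InlineCheck {

    interface Shape {
        boolean checkFlag();
    }

    static abstract class AbstractShape implements Shape {
        private final boolean flag;

        // Illustrative: the question computes the flag inside the constructor.
        protected AbstractShape(boolean flag) { this.flag = flag; }

        @Override
        public final boolean checkFlag() { return flag; }
    }

    static final class Square extends AbstractShape {
        Square() { super(true); }
    }

    static final class Circle extends AbstractShape {
        Circle() { super(false); }
    }

    // Hot loop that calls checkFlag() through the interface type.
    static long countFlagged(Shape[] shapes, int rounds) {
        long count = 0;
        for (int r = 0; r < rounds; r++) {
            for (Shape s : shapes) {
                if (s.checkFlag()) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Shape[] shapes = { new Square(), new Circle(), new Square() };
        // Far more than 10000 calls, so the loop is hot enough to be compiled.
        System.out.println(countFlagged(shapes, 1_000_000));
    }
}
```

In the log, look for the lines that mention checkFlag under countFlagged; they indicate whether the call was inlined or why it was not. The exact wording of those lines depends on the JVM version.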

Profilers sample the running code and try to assign time to methods. They don't always do a great job, and you can end up with methods that are obviously not time consumers being assigned CPU time. VisualVM is a free profiler and it is implemented in Java. If you use a profiler like YourKit you can get more accurate results, as it uses native code and, for example, doesn't create garbage.
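Because a sampling profiler can misattribute time, it can help to cross-check with a plain wall-clock measurement of the whole computation before and after a change. Below is a rough sketch using System.nanoTime; the class RoughTiming and the stand-in workload method are hypothetical, and a harness such as JMH would give more trustworthy numbers than this.

```java
public class RoughTiming {

    // Runs the task once and returns the elapsed wall-clock time in milliseconds.
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Hypothetical stand-in for the real computation being measured.
    static long workload(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i % 7;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Store the result so the JIT cannot discard the loop as dead code.
        final long[] sink = new long[1];

        // Warm-up run so the measured run executes compiled code.
        timeMillis(() -> sink[0] = workload(50_000_000));

        long ms = timeMillis(() -> sink[0] = workload(50_000_000));
        System.out.println("workload took " + ms + " ms (result " + sink[0] + ")");
    }
}
```

Comparing total elapsed time for two builds of the program sidesteps the question of which method the profiler blames, at the cost of saying nothing about where the time goes.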

不顾 2024-12-13 20:09:04

After extensive experimentation, I could not get the Sun JDK 6 to inline this method when called on the interface.

Fortunately, there were a limited number of call sites involved, and changing

public void paint(Shape shape) {
  if(shape.checkFlag()) { /* do stuff */ }
} 

to

public void paint(Shape shape) {
  if(((AbstractShape)shape).checkFlag()) { /* .. */ }
}

is enough of a hint to get the JVM to inline the method. Running time of the calculation in question dropped 13% compared to the original runtime of 6 minutes.
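One caveat with the cast is that it throws ClassCastException as soon as some Shape does not extend AbstractShape. A small sketch of the plain cast next to a guarded variant follows; PaintDemo, flagViaCast and flagGuarded are illustrative names, and whether the guarded form is inlined as well as the plain cast is an assumption that would need to be measured.

```java
public class PaintDemo {

    interface Shape {
        boolean checkFlag();
    }

    static abstract class AbstractShape implements Shape {
        private final boolean flag;

        // Illustrative: the question computes the flag inside the constructor.
        protected AbstractShape(boolean flag) { this.flag = flag; }

        @Override
        public final boolean checkFlag() { return flag; }
    }

    static final class Square extends AbstractShape {
        Square() { super(true); }
    }

    // Plain cast, as in the answer: the call targets the final method on the
    // base class. Throws ClassCastException for any other Shape implementation.
    static boolean flagViaCast(Shape shape) {
        return ((AbstractShape) shape).checkFlag();
    }

    // Guarded variant: uses the base-class call when possible and falls back
    // to the interface call otherwise.
    static boolean flagGuarded(Shape shape) {
        if (shape instanceof AbstractShape) {
            return ((AbstractShape) shape).checkFlag();
        }
        return shape.checkFlag();
    }

    public static void main(String[] args) {
        Shape square = new Square();
        Shape other = () -> false; // a Shape that does not extend AbstractShape

        System.out.println(flagViaCast(square));
        System.out.println(flagGuarded(square));
        System.out.println(flagGuarded(other));
    }
}
```

The plain cast is the simpler choice while every implementation extends AbstractShape, as the question says is currently the case; the guarded form trades an extra type check for safety if that ever changes.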

各自安好 2024-12-13 20:09:04

From the literature I've read on older JVM versions, declaring a method final used to prompt the JVM to inline it.
Nowadays you don't have to declare a method final for the code to be optimized.

You should let the JVM optimize the code, and make a method final only if you explicitly don't want it to be overridden. The JVM probably doesn't inline your method because it judges its own optimization to be faster, considering the rest of the application's code.
