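Identifying loops in Java bytecode

Posted on 2024-11-25 20:38:54

I am trying to instrument Java bytecode.

I want to recognize the entry and exit of a Java loop, but I have found identifying loops to be quite challenging. I have spent a good few hours looking at ASM and at open-source decompilers (which I assume must solve this problem all the time), but I have come up short.

The tool I am augmenting/extending uses ASM, so ideally I would like to know how to instrument the entry and exit of the different loop constructs in Java via ASM. However, I would also welcome a recommendation for a good open-source decompiler, as clearly they would have solved the same problem.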

掩耳倾听 2024-12-02 20:38:54

EDIT 4: A bit of background/preamble.

"The only way to jump backward in the code is via a loop." in Peter's answer isn't strictly true. You could jump back and forth without it meaning it's a loop. A simplified case would be something like this:
```
0: goto 2
1: goto 3
2: goto 1
```
Of course, this particular example is very artificial and a bit silly. However, making assumptions as to how the source-to-bytecode compiler is going to behave could lead to surprises. As Peter and I have shown in our respective answers, two popular compilers can produce a rather different output (even without obfuscation). It rarely matters, because all of this tends to be optimised rather well by the JIT compiler when you execute the code.
This being said, in the vast majority of cases, jumping backwards will be a reasonable indication as to where a loop starts. Compared with the rest, finding out the entry point of a loop is the "easy" part.
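As a minimal sketch of why a backward jump is only a hint, the three-goto example above can be modelled in plain Java (no ASM or Soot involved). The class and method names here are hypothetical and chosen only for illustration; the code is modelled as a map from the offset of each goto to its target, which is an assumption that only suits code made entirely of unconditional jumps.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class BackwardJumpDemo {
    /** Offsets of the gotos whose target is at or before the goto itself. */
    static List<Integer> backwardJumps(Map<Integer, Integer> gotos) {
        List<Integer> result = new ArrayList<>();
        for (Map.Entry<Integer, Integer> e : gotos.entrySet()) {
            if (e.getValue() <= e.getKey()) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    /** Follows the gotos from start; true if some offset is executed twice. */
    static boolean revisits(Map<Integer, Integer> gotos, int start) {
        Set<Integer> seen = new HashSet<>();
        int pc = start;
        while (gotos.containsKey(pc)) {
            if (!seen.add(pc)) {
                return true;
            }
            pc = gotos.get(pc);
        }
        return false; // reached an offset that is not one of the gotos
    }

    /** The example above: 0: goto 2, 1: goto 3, 2: goto 1. */
    static Map<Integer, Integer> example() {
        Map<Integer, Integer> gotos = new TreeMap<>();
        gotos.put(0, 2);
        gotos.put(1, 3);
        gotos.put(2, 1);
        return gotos;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> gotos = example();
        System.out.println("backward jumps at: " + backwardJumps(gotos));       // [2]
        System.out.println("any offset executed twice: " + revisits(gotos, 0)); // false
    }
}
```

The jump at offset 2 goes backward, yet execution runs 0, 2, 1 and then leaves, so nothing repeats. A detector based purely on jump direction would report a loop here.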
Before considering any loop start/exit instrumentation, you should look into the definitions of what entry, exit and successors are. Although a loop will only have one entry point, it may have multiple exit points and/or multiple successors, typically caused by break statements (sometimes with labels), return statements and/or exceptions (explicitly caught or not). While you haven't given details regarding the kind of instrumentation you're investigating, it's certainly worth considering where you want to insert code (if that's what you want to do). Typically, some instrumentation may have to be done before each exit statement or instead of each successor statement (in which case you'll have to move the original statement).

Soot is a good framework to do this. It has a number of intermediate representations that make bytecode analysis more convenient (e.g. Jimple).

You can build a BlockGraph based on your method body, for example an ExceptionalBlockGraph. Once you've decomposed the control flow graph into such a block graph, you should be able to use the dominators to identify the blocks that have an arrow coming back to them from a block they dominate. Each such block gives you the start of a loop.
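To make the dominator idea concrete, here is a small stand-alone sketch in plain Java. It is not Soot code and does not use Soot's API: the graph is a hand-written array of successor lists (an assumption made for illustration, shaped like two nested for loops, the same shape as the block list printed later in this answer), and the class and method names are hypothetical. It uses the textbook iterative dominator computation and reports an edge u->v as a back edge when v dominates u.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class LoopHeaders {
    // succs[b] lists the successors of block b; block 0 is the entry block.
    static final int[][] NESTED_FOR = {{5}, {3}, {3}, {4, 2}, {5}, {6, 1}, {}};

    /** Iterative dominator sets; assumes every block is reachable from block 0. */
    static BitSet[] dominators(int[][] succs) {
        int n = succs.length;
        List<List<Integer>> preds = new ArrayList<>();
        for (int b = 0; b < n; b++) {
            preds.add(new ArrayList<>());
        }
        for (int b = 0; b < n; b++) {
            for (int s : succs[b]) {
                preds.get(s).add(b);
            }
        }
        BitSet[] dom = new BitSet[n];
        for (int b = 0; b < n; b++) {
            dom[b] = new BitSet(n);
            if (b == 0) {
                dom[b].set(0);      // the entry is dominated only by itself
            } else {
                dom[b].set(0, n);   // start from "everything" and shrink
            }
        }
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int b = 1; b < n; b++) {
                BitSet next = new BitSet(n);
                next.set(0, n);
                for (int p : preds.get(b)) {
                    next.and(dom[p]); // intersect over all predecessors
                }
                next.set(b);
                if (!next.equals(dom[b])) {
                    dom[b] = next;
                    changed = true;
                }
            }
        }
        return dom;
    }

    /** Edges u->v where v dominates u; each target v is a loop header. */
    static List<String> backEdges(int[][] succs) {
        BitSet[] dom = dominators(succs);
        List<String> edges = new ArrayList<>();
        for (int u = 0; u < succs.length; u++) {
            for (int v : succs[u]) {
                if (dom[u].get(v)) {
                    edges.add(u + "->" + v);
                }
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        System.out.println(backEdges(NESTED_FOR)); // [2->3, 4->5]
    }
}
```

Blocks 3 and 5 come out as the loop headers. Note that the edge 3->2 is not reported even though it points to a lower-numbered block, because block 2 does not dominate block 3.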

You may find something similar done in sections 4.3 to 4.7 of this dissertation.

EDIT:

Following the discussion with @Peter in the comments on his answer, taking the same example:

public int foo(int i, int j) {
    while (true) {
        try {
            while (i < j)
                i = j++ / i;
        } catch (RuntimeException re) {
            i = 10;
            continue;
        }
        break;
    }
    return j;
}

This time, compiled with the Eclipse compiler (no specific option: simply autocompilation from within the IDE).
This code hasn't been obfuscated (apart from being bad code, but that's a different matter).
Here is the result (from javap -c):

public int foo(int, int);
  Code:
   0:   goto    10
   3:   iload_2
   4:   iinc    2, 1
   7:   iload_1
   8:   idiv
   9:   istore_1
   10:  iload_1
   11:  iload_2
   12:  if_icmplt   3
   15:  goto    25
   18:  astore_3
   19:  bipush  10
   21:  istore_1
   22:  goto    10
   25:  iload_2
   26:  ireturn
  Exception table:
   from   to  target type
     0    15    18   Class java/lang/RuntimeException

There is a loop between 3 and 12 (entered by jumping to 10) and another loop caused by the exception handler: a division by zero at 8 transfers control to 18, and the goto at 22 jumps back to 10.
Unlike the javac compiler result, where one could make a guess that there was an outer loop between 0 and 22 and an inner loop between 0 and 12, the nesting is less obvious here.

EDIT 2:

To illustrate the kind of problems you may run into, here is a less awkward example, a relatively simple loop:

public void foo2() {
    for (int i = 0; i < 5; i++) {
        System.out.println(i);
    }
}

After (normal) compilation within Eclipse, javap -c gives this:

public void foo2();
  Code:
   0:   iconst_0
   1:   istore_1
   2:   goto    15
   5:   getstatic   #25; //Field java/lang/System.out:Ljava/io/PrintStream;
   8:   iload_1
   9:   invokevirtual   #31; //Method java/io/PrintStream.println:(I)V
   12:  iinc    1, 1
   15:  iload_1
   16:  iconst_5
   17:  if_icmplt   5
   20:  return

Before doing anything within the loop, you jump straight from 2 to 15. Block 15 to 17 is the header of the loop (the "entry point"). Sometimes, the header block could contain far more instructions, especially if the exit condition involves more evaluation, or if it's a do {} while() loop.
The concept of "entry" and "exit" of a loop may not always reflect what you'd write sensibly as Java source code (including the fact that you can rewrite for loops as while loops, for example). Using break can also lead to multiple exit points.

By the way, by "block", I mean a sequence of bytecode into which you can't jump and out of which you can't jump in the middle: they're only entered from the first line (not necessarily from the previous line, possibly from a jump from somewhere else) and exited from the last (not necessarily to the following line, it can jump somewhere else too).
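The definition of a block above can be sketched as a "leader" computation: a block starts at the first instruction, at every jump target, and at every instruction that follows a jump. The following plain-Java sketch applies this to the foo2() bytecode listed above. The class name, the `{offset, jumpTarget}` row encoding and the `FOO2` table are assumptions made for illustration; the sketch ignores exception handlers and treats only jumps (not return or athrow) as block terminators, which is enough for this listing because the return is the last instruction.

```java
import java.util.SortedSet;
import java.util.TreeSet;

class BlockLeaders {
    // foo2() as listed above: {offset, jumpTarget}, with -1 for "does not jump".
    static final int[][] FOO2 = {
        {0, -1}, {1, -1}, {2, 15}, {5, -1}, {8, -1}, {9, -1},
        {12, -1}, {15, -1}, {16, -1}, {17, 5}, {20, -1}
    };

    /** Offsets at which a basic block starts. */
    static SortedSet<Integer> leaders(int[][] code) {
        SortedSet<Integer> leaders = new TreeSet<>();
        if (code.length > 0) {
            leaders.add(code[0][0]); // the first instruction always starts a block
        }
        for (int i = 0; i < code.length; i++) {
            int target = code[i][1];
            if (target >= 0) {
                leaders.add(target);               // a jump target starts a block
                if (i + 1 < code.length) {
                    leaders.add(code[i + 1][0]);   // so does whatever follows a jump
                }
            }
        }
        return leaders;
    }

    public static void main(String[] args) {
        System.out.println(leaders(FOO2)); // [0, 5, 15, 20]
    }
}
```

This yields four blocks: 0 to 2 (initialisation), 5 to 12 (loop body and increment), 15 to 17 (the header with the exit test) and 20 (the return).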

EDIT 3:

It seems that new classes/methods to analyse loops have been added since I last looked at Soot, which makes it a bit more convenient.

Here is a complete example.

The class/method to analyse (TestLoop.foo())

public class TestLoop {
    public void foo() {
        for (int j = 0; j < 2; j++) {
            for (int i = 0; i < 5; i++) {
                System.out.println(i);
            }
        }
    }
}

When compiled by the Eclipse compiler, this produces this bytecode (javap -c):

public void foo();
  Code:
   0:   iconst_0
   1:   istore_1
   2:   goto    28
   5:   iconst_0
   6:   istore_2
   7:   goto    20
   10:  getstatic   #25; //Field java/lang/System.out:Ljava/io/PrintStream;
   13:  iload_2
   14:  invokevirtual   #31; //Method java/io/PrintStream.println:(I)V
   17:  iinc    2, 1
   20:  iload_2
   21:  iconst_5
   22:  if_icmplt   10
   25:  iinc    1, 1
   28:  iload_1
   29:  iconst_2
   30:  if_icmplt   5
   33:  return

Here is a program that loads the class (assuming it's on the classpath here) using Soot and displays its blocks and loops:

import soot.Body;
import soot.Scene;
import soot.SootClass;
import soot.SootMethod;
import soot.jimple.toolkits.annotation.logic.Loop;
import soot.toolkits.graph.Block;
import soot.toolkits.graph.BlockGraph;
import soot.toolkits.graph.ExceptionalBlockGraph;
import soot.toolkits.graph.LoopNestTree;

public class DisplayLoops {
    public static void main(String[] args) throws Exception {
        SootClass sootClass = Scene.v().loadClassAndSupport("TestLoop");
        sootClass.setApplicationClass();

        Body body = null;
        for (SootMethod method : sootClass.getMethods()) {
            if (method.getName().equals("foo")) {
                if (method.isConcrete()) {
                    body = method.retrieveActiveBody();
                    break;
                }
            }
        }

        System.out.println("**** Body ****");
        System.out.println(body);
        System.out.println();

        System.out.println("**** Blocks ****");
        BlockGraph blockGraph = new ExceptionalBlockGraph(body);
        for (Block block : blockGraph.getBlocks()) {
            System.out.println(block);
        }
        System.out.println();

        System.out.println("**** Loops ****");
        LoopNestTree loopNestTree = new LoopNestTree(body);
        for (Loop loop : loopNestTree) {
            System.out.println("Found a loop with head: " + loop.getHead());
        }
    }
}

Check the Soot documentation for more details on how to load classes. The Body is a model for the body of the method, i.e. all the statements made from the bytecode. This uses the intermediate Jimple representation, which is equivalent to the bytecode, but easier to analyse and process.

Here is the output of this program:

Body:

    public void foo()
    {
        TestLoop r0;
        int i0, i1;
        java.io.PrintStream $r1;

        r0 := @this: TestLoop;
        i0 = 0;
        goto label3;

     label0:
        i1 = 0;
        goto label2;

     label1:
        $r1 = <java.lang.System: java.io.PrintStream out>;
        virtualinvoke $r1.<java.io.PrintStream: void println(int)>(i1);
        i1 = i1 + 1;

     label2:
        if i1 < 5 goto label1;

        i0 = i0 + 1;

     label3:
        if i0 < 2 goto label0;

        return;
    }

Blocks:

Block 0:
[preds: ] [succs: 5 ]
r0 := @this: TestLoop;
i0 = 0;
goto [?= (branch)];

Block 1:
[preds: 5 ] [succs: 3 ]
i1 = 0;
goto [?= (branch)];

Block 2:
[preds: 3 ] [succs: 3 ]
$r1 = <java.lang.System: java.io.PrintStream out>;
virtualinvoke $r1.<java.io.PrintStream: void println(int)>(i1);
i1 = i1 + 1;

Block 3:
[preds: 1 2 ] [succs: 4 2 ]
if i1 < 5 goto $r1 = <java.lang.System: java.io.PrintStream out>;

Block 4:
[preds: 3 ] [succs: 5 ]
i0 = i0 + 1;

Block 5:
[preds: 0 4 ] [succs: 6 1 ]
if i0 < 2 goto i1 = 0;

Block 6:
[preds: 5 ] [succs: ]
return;

Loops:

Found a loop with head: if i1 < 5 goto $r1 = <java.lang.System: java.io.PrintStream out>
Found a loop with head: if i0 < 2 goto i1 = 0

LoopNestTree uses LoopFinder, which uses an ExceptionalBlockGraph to build the list of blocks.
The Loop class will give you the entry statement and the exit statements. You should then be able to add extra statements if you wish. Jimple is quite convenient for this (it's close enough to the bytecode, but has a slightly higher level so as not to deal with everything manually). You can then output your modified .class file if needed. (See the Soot documentation for this.)
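To show what "entry" and "exits" mean on the block list printed above, here is a stand-alone sketch in plain Java. It does not use Soot and is not a description of how LoopFinder works internally: it is the textbook natural-loop construction, with hypothetical class and method names. The successor lists are copied from the `succs` lines of Block 0 to Block 6, and the two back edges (Block 2 to Block 3, Block 4 to Block 5) are assumed as input, read off from the loop heads reported above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

class NaturalLoop {
    // Successor lists copied from the block output above (Block 0 .. Block 6).
    static final int[][] SUCCS = {{5}, {3}, {3}, {4, 2}, {5}, {6, 1}, {}};

    /** Blocks of the loop for back edge tail->head: head plus everything reaching tail without passing head. */
    static SortedSet<Integer> body(int[][] succs, int tail, int head) {
        SortedSet<Integer> loop = new TreeSet<>();
        loop.add(head);
        Deque<Integer> work = new ArrayDeque<>();
        if (loop.add(tail)) {
            work.push(tail);
        }
        while (!work.isEmpty()) {
            int b = work.pop();
            // Walk backwards: every predecessor of b belongs to the loop too.
            for (int p = 0; p < succs.length; p++) {
                for (int s : succs[p]) {
                    if (s == b && loop.add(p)) {
                        work.push(p);
                    }
                }
            }
        }
        return loop;
    }

    /** Edges leaving the loop, as "from->to". */
    static List<String> exits(int[][] succs, SortedSet<Integer> loop) {
        List<String> result = new ArrayList<>();
        for (int b : loop) {
            for (int s : succs[b]) {
                if (!loop.contains(s)) {
                    result.add(b + "->" + s);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        SortedSet<Integer> inner = body(SUCCS, 2, 3);
        SortedSet<Integer> outer = body(SUCCS, 4, 5);
        System.out.println(inner + " exits " + exits(SUCCS, inner)); // [2, 3] exits [3->4]
        System.out.println(outer + " exits " + exits(SUCCS, outer)); // [1, 2, 3, 4, 5] exits [5->6]
    }
}
```

The inner loop is contained in the outer one, which is how the nesting shows up. Each loop here has a single exit edge; a break or return inside the body would add more edges to the list.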

飘逸的'云 2024-12-02 20:38:54

The only way to jump backward in the code is via a loop. So you are looking for a goto, if_icmplt, etc. that jumps to a previous bytecode instruction. Once you have found one, that jump is the end of the loop and its target is the start of the loop.

Here is a complex example, from the document Bruno suggested.

public int foo(int i, int j) {
    while (true) {
        try {
            while (i < j)
                i = j++ / i;
        } catch (RuntimeException re) {
            i = 10;
            continue;
        }
        break;
    }
    return j;
}

The byte-code for this appears in javap -c as

public int foo(int, int);
  Code:
   0:   iload_1
   1:   iload_2
   2:   if_icmpge       15
   5:   iload_2
   6:   iinc    2, 1
   9:   iload_1
   10:  idiv
   11:  istore_1
   12:  goto    0
   15:  goto    25
   18:  astore_3
   19:  bipush  10
   21:  istore_1
   22:  goto    0
   25:  iload_2
   26:  ireturn
  Exception table:
   from   to  target type
     0    15    18   Class java/lang/RuntimeException

You can see there is an inner loop between 0 and 12, a try/catch block between 0 and 15 and an outer loop between 0 and 22.

來不及說愛妳 2024-12-02 20:38:54

Are you actually building your class byte by byte? That's pretty wild! The front page of ASM links to the Bytecode Outline plugin for Eclipse, which I assume you are using. If you click on the first image there, you will notice the code has a while loop, and you can see at least some of the bytecode used to implement that loop. For reference, here is that screenshot:

[Screenshot: Bytecode Outline plugin]

It looks like loops are simply implemented as GOTOs with a boundary check. I'm talking about this line:

L2 (173)
  GOTO L3

I'm sure the L3 marker has code for checking the index bound and deciding whether to jump. I think this is going to be quite hard for you if you want to instrument loops one bytecode at a time. ASM does have the option of using a template class as the basis for your instrumentation; have you tried using it?

雨后咖啡店 2024-12-02 20:38:54

I know this is an old question. However, there was specific interest in how this could be achieved with the ASM library, and this may be of use to future visitors. Bearing in mind the caveats in the other answers, which warn against generalized assumptions about the "goto" statement, there is a way to do it. (This assumes that any grouping of code within a given method that can "loop" should be detected. Usually this is an actual loop construct, but other (rare, but present) examples have been provided of how this can occur.)

The main thing you'd need to do is keep track of the "labels" (locations in the byte code) that ASM visits prior to what it terms a "jump instruction" - if the label it jumps to has already been encountered in the context of the same method, then you have a potential for looping code.

A notable difference I saw between the answers here and how ASM behaved is that it read the "looping" jump commands for a simple file as opcodes other than "goto" - this may be just changes in Java compilation in the time since this was asked, but seemed worth noting.

The basic example code for ASM is this (this is entered via the checkForLoops method):

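// JDK classes used by the code below
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.Label;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;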

public void checkForLoops(Path classFile) {
    LoopClassVisitor classVisitor = new LoopClassVisitor();

    try (InputStream inputStream = Files.newInputStream(classFile)) {
        ClassReader cr = new ClassReader(inputStream);

        cr.accept(classVisitor, 0);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}

public class LoopClassVisitor extends ClassVisitor {

    public LoopClassVisitor() {
        super(Opcodes.ASM7);
    }

    @Override
    public MethodVisitor visitMethod(int access, String name, String descriptor, String signature,
            String[] exceptions) {
        return new LoopMethodVisitor();
    }

}

public class LoopMethodVisitor extends MethodVisitor {

    private List<Label> visitedLabels;

    public LoopMethodVisitor() {
        super(Opcodes.ASM7);

        visitedLabels = new ArrayList<>();
    }

    @Override
    public void visitLineNumber(final int line, final Label start) {
        System.out.println("lnLabel: " + start.toString());

        visitedLabels.add(start);
    }

    @Override
    public void visitLabel(final Label label) {
        System.out.println("vLabel: " + label.toString());

        visitedLabels.add(label);
    }

    @Override
    public void visitJumpInsn(final int opcode, final Label label) {
        System.out.println("Label: " + label.toString());

        if (visitedLabels.contains(label)) {
            System.out.println("Op: " + opcode + ", GOTO to previous command - possible looped execution");
        }
    }

}

You could additionally attach line number information to the labels when it is available, and track that within the method visitor, to output where the detected loops start and end in the source.
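One way the line-number tracking could look is sketched below. To keep it runnable without the ASM jar, this is a stand-alone model rather than a real MethodVisitor: the three methods only mirror the names of the callbacks used above, labels are replaced by plain strings, and the event sequence in `demo()` is a hypothetical one of the kind a compiler might emit for a for loop spanning source lines 3 and 4. All names here are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LoopLineTracker {
    private final List<Integer> lines = new ArrayList<>();            // line numbers in visiting order
    private final Map<String, Integer> lineIndexAt = new HashMap<>(); // label -> index into lines
    private final List<String> loops = new ArrayList<>();

    void visitLabel(String label) {
        // The line in effect at a label is the most recent one seen so far, if any.
        lineIndexAt.put(label, Math.max(0, lines.size() - 1));
    }

    void visitLineNumber(int line, String start) {
        lines.add(line);
        lineIndexAt.put(start, lines.size() - 1);
    }

    void visitJumpInsn(String target) {
        Integer from = lineIndexAt.get(target);
        if (from == null || lines.isEmpty()) {
            return; // forward jump (label not visited yet) or no line information
        }
        // Every line seen since the target label lies inside the possible loop.
        List<Integer> span = lines.subList(from, lines.size());
        loops.add("lines " + Collections.min(span) + "-" + Collections.max(span));
    }

    List<String> loops() {
        return loops;
    }

    /** Replays a hypothetical event sequence for a simple for loop. */
    static List<String> demo() {
        LoopLineTracker t = new LoopLineTracker();
        t.visitLabel("L0"); t.visitLineNumber(3, "L0"); // int i = 0
        t.visitLabel("L1");                             // loop condition
        t.visitJumpInsn("L3");                          // exit test, forward jump
        t.visitLabel("L2"); t.visitLineNumber(4, "L2"); // loop body
        t.visitLabel("L4"); t.visitLineNumber(3, "L4"); // i++
        t.visitJumpInsn("L1");                          // jump back to the condition
        t.visitLabel("L3"); t.visitLineNumber(6, "L3"); // return
        return t.loops();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [lines 3-4]
    }
}
```

Taking the minimum and maximum of the lines seen since the target label, rather than just the line of the jump itself, matters because the increment of a for loop is usually attributed to the line of the for statement, not to the last line of the body.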
