A fail-safe way to round-trip JVM byte-code to a text representation and back
I'm looking for a fail-safe way to round-trip between a JVM class file and a text representation and back again.
One strict requirement is that the resulting round-tripped JVM class file is exactly functionally equivalent to the original JVM class file as long as the text representation is left unchanged.
Furthermore, the text representation must be human-readable and editable. It should be possible to make small changes to the text representation (such as changing a text string or a class name), and those changes should be reflected in the resulting class file.
The simplest solution would be to use a Java decompiler such as JAD to generate the text representation, which in this case would simply be the re-created Java source code, and then use javac to generate the byte-code. However, given the state of the free Java decompilers, this approach does not work under all circumstances. It is rather easy to create obfuscated byte-code that does not survive a full class-file/Java-source/class-file round trip (in part because there simply isn't a 1:1 mapping between JVM byte-code and Java source code).
Is there a fail-safe way to achieve JVM class-file/text-representation/class-file round-tripping given the requirements above?
Update: Before answering - save time and effort by reading all the requirements above, and note specifically:
- "Text-representation of JVM bytecode" does not necessarily mean "Java source-code".
Comments (5)
The BCEL project provides a JasminVisitor which will convert class files into Jasmin assembly.
This can be modified and then reassembled into class files. If no edits are made and the versions are kept compatible, the round trip should result in identical class files, except that line number mapping may be lost. If you require a bit-for-bit identical copy in the round-trip case, you will likely need to alter the tool so that it also carries over the aspects of the code which are pure metadata.
Jasmin is rather old and is not designed to make writing full-blown programs in assembly easy, but for modifying string constant tables and constants it should be more than adequate.
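Whether a given tool really reproduces an identical class file can be checked by comparing the original and the reassembled file byte for byte. The following is only a minimal sketch of such a check; the helper names first_difference and compare_class_files are made up for illustration and are not part of BCEL, Jasmin, or any other tool mentioned here.

```python
import hashlib
from pathlib import Path


def first_difference(a: bytes, b: bytes):
    """Return the offset of the first differing byte, or None if a == b."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # One input is a prefix of the other: they differ where the shorter ends.
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


def compare_class_files(original: Path, rebuilt: Path) -> dict:
    """Compare two class files byte for byte (hypothetical helper)."""
    a = original.read_bytes()
    b = rebuilt.read_bytes()
    return {
        "identical": a == b,
        "first_difference": first_difference(a, b),
        "original_sha256": hashlib.sha256(a).hexdigest(),
        "rebuilt_sha256": hashlib.sha256(b).hexdigest(),
    }

# Example use (paths are placeholders):
#   compare_class_files(Path("Foo.class"), Path("out/Foo.class"))
```

Note that byte identity is stricter than what the question asks for. A mismatch here does not by itself mean the round-tripped class behaves differently, since metadata such as line number tables can change without affecting functionality.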
Jasmin and Kimera?
I've written a tool that's designed for exactly this.
The Krakatau disassembler and assembler is designed to handle any valid classfile, no matter how bizarre. It uses an assembly format based on the Jasmin format, but extended to support all the classfile features that Jasmin can't handle. It even supports some of the obscure or undocumented 'features' of Hotspot, such as pre-45.3 classfiles using smaller widths for the Code attribute fields.
It can roundtrip any classfile I know of. The result won't be identical binary-wise, but it will have the same functionality (constant pool entries may be rearranged, for instance).
Update: Krakatau now supports exact binary roundtripping of classfiles. Passing the -roundtrip flag will preserve the order of constant pool entries, etc.
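To see which class files fall into the pre-45.3 case mentioned above, it is enough to read the fixed-size header at the start of the file: a 4-byte magic number, then the minor version, major version, and constant pool count as 2-byte big-endian values. The sketch below does only that. The function name read_class_header and the pre_45_3 field are illustrative assumptions and have nothing to do with Krakatau's own code.

```python
import struct

CLASS_MAGIC = 0xCAFEBABE


def read_class_header(data: bytes) -> dict:
    """Parse the first 10 bytes of a class file (hypothetical helper)."""
    if len(data) < 10:
        raise ValueError("too short to be a class file")
    # u4 magic, u2 minor_version, u2 major_version, u2 constant_pool_count
    magic, minor, major, cp_count = struct.unpack(">IHHH", data[:10])
    if magic != CLASS_MAGIC:
        raise ValueError("bad magic number: 0x%08X" % magic)
    return {
        "major": major,
        "minor": minor,
        "constant_pool_count": cp_count,
        # Versions below 45.3, following the description in the answer above.
        "pre_45_3": (major, minor) < (45, 3),
    }

# Example use (path is a placeholder):
#   read_class_header(open("Foo.class", "rb").read())
```

The header alone says nothing about constant pool ordering. Checking that would require walking the pool entries, which is left out here to keep the sketch short.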
Looks like ASM does this. (This is the same sort of answer as ShuggyCoUk's, but with a different tool.) Jarjar says it uses ASM for exactly the sort of thing you're talking about.
No. There exists valid byte-code without a corresponding Java program.
The Soot project has a quite sophisticated decompiler (http://www.sable.mcgill.ca/dava/) which may be useful for byte-code produced by a Java compiler. It is, however, not perfect.
Your best bet is still to get the source code for the class files.