Can Duff's Device speed up Java code?
Using the stock Sun 1.6 compiler and JRE/JIT, is it a good idea to use the sort of extensive unroll exemplified by Duff's Device to unroll a loop? Or does it end up as code obfuscation with no performance benefit?
The Java profiling tools I've used are less informative about line-by-line CPU usage than, say, valgrind, so I was looking to augment measurement with other people's experience.
Note that, of course, you can't exactly code Duff's Device, but you can do the basic unroll, and that's what I'm wondering about.
short stateType = data.getShort(ptr);
switch (stateType) {
    // No break statements: each case intentionally falls through to the next.
    case SEARCH_TYPE_DISPATCH + 16:
        if (c > data.getChar(ptr + (3 << 16) - 4)) {
            ptr += 3 << 16;
        }
    case SEARCH_TYPE_DISPATCH + 15:
        if (c > data.getChar(ptr + (3 << 15) - 4)) {
            ptr += 3 << 15;
        }
    ...
down through many other values.
2 Answers
It doesn't much matter whether it's a good idea (it's not), because it won't compile.
EDIT: The JLS mentions this explicitly: case labels may only appear immediately inside the switch block, so they cannot be interleaved with the body of an enclosing loop the way Duff's Device does it in C.
EDIT: To answer your more (too) general question, usually no. You should generally rely on the JIT.
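Not from the original answer, but as a concrete illustration of the "basic unroll" that does compile: the sketch below peels off the remainder and then processes four elements per pass. The class name and the array-summing workload are placeholders, since the question's dispatch code isn't shown in full.

public class BasicUnroll {

    // Sum in chunks of four: a legal Java stand-in for the unrolling
    // effect Duff's Device achieves in C.
    static long sumUnrolled(int[] data) {
        long total = 0;
        int i = 0;
        int limit = data.length - (data.length % 4);
        // Main loop: four elements per iteration.
        for (; i < limit; i += 4) {
            total += data[i];
            total += data[i + 1];
            total += data[i + 2];
            total += data[i + 3];
        }
        // Remainder: at most three leftover elements.
        for (; i < data.length; i++) {
            total += data[i];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = new int[1003];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        System.out.println(sumUnrolled(data)); // 0 + 1 + ... + 1002 = 502503
    }
}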
You are ignoring the fact that Java compiles to bytecode for a stack-oriented virtual machine. Whatever low-level optimization trick you attempt at the Java source level is largely ineffective. The real optimization happens when the JIT compiler produces assembly for the target architecture, a process that, for the most part, you can neither control nor need to worry about.
You should instead optimize the bigger picture. Let the JIT compiler handle the low-level optimizations.
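As a sketch of that advice (again, not part of the original answer, and assuming a simple array-summing workload as a stand-in): write the obvious loop and leave unrolling to HotSpot. If you really need to see what the JIT produced, diagnostic flags such as -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly (with the hsdis disassembler plugin installed) will show the generated machine code.

public class PlainLoop {

    static long sum(int[] data) {
        long total = 0;
        // The straightforward form; HotSpot's optimizing compiler can
        // unroll hot counted loops like this on its own.
        for (int i = 0; i < data.length; i++) {
            total += data[i];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = new int[1000000];
        java.util.Arrays.fill(data, 1);
        // Warm up so the method gets JIT-compiled, then time it crudely.
        // For serious measurement a harness such as JMH is a better idea.
        for (int i = 0; i < 20; i++) {
            sum(data);
        }
        long start = System.nanoTime();
        long result = sum(data);
        long elapsed = System.nanoTime() - start;
        System.out.println(result + " in " + elapsed + " ns");
    }
}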