Java performance tips

Posted 2024-07-23 12:18:22 · 365 words · 3 views · 0 comments

I have a program I ported from C to Java. Both apps use quicksort to order some partitioned data (genomic coordinates).

The Java version runs fast, but I'd like to get it closer to the C version. I am using the Sun JDK v6u14.

Obviously I can't get parity with the C application, but I'd like to learn what I can do to eke out as much performance as reasonably possible (within the limits of the environment).

What sorts of things can I do to test performance of different parts of the application, memory usage, etc.? What would I do, specifically?

Also, what tricks can I implement (in general) to change the properties and organization of my classes and variables, reducing memory usage and improving speed?

EDIT: I am using Eclipse and would obviously prefer free options for any third-party tools. Thanks!

Comments (14)

柠檬心 2024-07-30 12:18:22

Do not try to outsmart the JVM.

In particular:

  • Don't try to avoid object creation for the sake of performance.

  • Use immutable objects where applicable.

  • Scope your objects correctly, so that the GC can do its job.

  • Use primitives where you mean primitives (e.g. a non-nullable int rather than a nullable Integer).

  • Use the built-in algorithms and data structures.

  • When handling concurrency, use the java.util.concurrent package.

  • Correctness over performance. First get it right, then measure, then measure with a profiler, then optimize.

隱形的亼 2024-07-30 12:18:22

Obviously, profile, profile, profile. For Eclipse there's TPTP. Here's an article on the TPTP plugin for Eclipse. NetBeans has its own profiler. jvisualvm is nice as a standalone tool. (The entire dev.java.net server seems to be down at the moment, but it is very much an active project.)

The first thing to do is use the library sorting routine, Collections.sort; this will require your data objects to be Comparable. This might be fast enough and will definitely provide a good baseline.
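
As a minimal sketch of that baseline: the class below is hypothetical (the question does not show its data type), and it assumes each record is a chromosome index plus a start and end position. It implements Comparable so that Collections.sort can order it directly.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical record type; the field names and the ordering are assumptions.
final class GenomicInterval implements Comparable<GenomicInterval> {
    final int chromosome; // assumed to be encoded as a small integer
    final int start;
    final int end;

    GenomicInterval(int chromosome, int start, int end) {
        this.chromosome = chromosome;
        this.start = start;
        this.end = end;
    }

    // Orders by chromosome, then start, then end.
    @Override
    public int compareTo(GenomicInterval other) {
        if (chromosome != other.chromosome) {
            return chromosome < other.chromosome ? -1 : 1;
        }
        if (start != other.start) {
            return start < other.start ? -1 : 1;
        }
        if (end != other.end) {
            return end < other.end ? -1 : 1;
        }
        return 0;
    }

    // Returns a sorted copy and leaves the input list untouched.
    static List<GenomicInterval> sortedCopy(List<GenomicInterval> input) {
        List<GenomicInterval> copy = new ArrayList<GenomicInterval>(input);
        Collections.sort(copy);
        return copy;
    }
}
```

Timing this against the hand-ported quicksort on the same input gives a reference point before any tuning. A production class would normally also override equals and hashCode to stay consistent with compareTo.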

General tips:

  • Avoid locks you don't need (your JVM may have already optimized these away)
  • Use StringBuilder (not StringBuffer because of that lock thing I just mentioned) instead of concatenating String objects
  • Make anything you can final; if possible, make your classes completely immutable
  • If you aren't changing the value of a variable in a loop, try hoisting it out and see if it makes a difference (the JVM may have already done this for you)
  • Try to work on an ArrayList (or even an array) so the memory you're accessing is contiguous instead of potentially fragmented the way it might be with a LinkedList
  • Quicksort can be parallelized; consider doing that (see quicksort parallelization)
  • Reduce the visibility and live time of your data as much as possible (but don't contort your algorithm to do it unless profiling shows it is a big win)

[旋木] 2024-07-30 12:18:22

Use a profiler:

Use the latest version of the JVM from your provider. Incidentally, Sun's Java 6 update 14 does bring performance improvements.

Measure your GC throughput and pick the best garbage collector for your workload.
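
One rough way to see how much time goes to garbage collection from inside the program is the standard java.lang.management API. The sketch below is only illustrative: the class name GcStats and the placeholder workload are made up, and the numbers are cumulative totals since JVM start, so compare a reading before and after the code of interest.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper that sums the counters exposed by the JVM's collectors.
final class GcStats {
    // Total number of collections so far; collectors reporting -1 (undefined) are skipped.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds.
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long countBefore = totalCollections();
        long millisBefore = totalCollectionMillis();

        // Placeholder workload that produces garbage; replace with the real sort.
        long checksum = 0;
        for (int i = 0; i < 200000; i++) {
            int[] scratch = new int[256];
            scratch[i % 256] = i;
            checksum += scratch[i % 256];
        }

        System.out.println("checksum: " + checksum);
        System.out.println("collections: " + (totalCollections() - countBefore));
        System.out.println("gc millis: " + (totalCollectionMillis() - millisBefore));
    }
}
```

Running the same workload under different collector settings and comparing these figures with the total wall-clock time gives a first estimate of GC overhead.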

染墨丶若流云 2024-07-30 12:18:22

Don't optimize prematurely.

Measure performance, then optimize.
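
For a quick first measurement before reaching for a profiler, a small timing harness can help. This is a sketch under assumptions: the class name SortTimer is made up, Arrays.sort stands in for the routine under test, and the warm-up runs are there so the JIT has a chance to compile the code before timing starts.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical micro-benchmark helper; results are rough and vary between runs.
final class SortTimer {
    // Times one sort of a fresh copy of the data, in nanoseconds.
    static long timeSortNanos(int[] data) {
        int[] copy = data.clone();
        long start = System.nanoTime();
        Arrays.sort(copy); // stand-in for the routine being measured
        return System.nanoTime() - start;
    }

    // Runs untimed warm-up iterations, then returns the best of the measured runs.
    static long bestOfNanos(int[] data, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            timeSortNanos(data);
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < measuredRuns; i++) {
            best = Math.min(best, timeSortNanos(data));
        }
        return best;
    }

    public static void main(String[] args) {
        int[] data = new int[1000000];
        Random random = new Random(42); // fixed seed so runs are comparable
        for (int i = 0; i < data.length; i++) {
            data[i] = random.nextInt();
        }
        System.out.println("best ns: " + bestOfNanos(data, 10, 10));
    }
}
```

Each run sorts a copy, so the input stays unsorted between iterations and every measurement does the same amount of work.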

Use final variables whenever possible. It will not only allow the JVM to optimize more, but also make your code easier to read and maintain.

If you make your objects immutable, you don't have to clone them.

Optimize by changing the algorithm first, then by changing the implementation.

Sometimes you need to resort to old-style techniques, like loop unrolling or caching precalculated values. Keep them in mind: even if they don't look nice, they can be useful.

锦爱 2024-07-30 12:18:22

jvisualvm ships with JDK 6 now - that's the reason the link cited above doesn't work. Just type "jvisualvm <pid>", where <pid> is the ID of the process you want to track. You'll get to see how the heap is being used, but you won't see what's filling it up.

If it's a long-running process, you can turn on the -server option when you run. There are a lot of tuning options available to you; that's just one.

鸠书 2024-07-30 12:18:22

Also try tweaking the runtime arguments of the VM - the latest release of the VM for example includes the following flag which can improve performance in certain scenarios.

-XX:+DoEscapeAnalysis 

就此别过 2024-07-30 12:18:22

First caveat - make sure you have done appropriate profiling or benchmarking before embarking on any optimisation work. The results will often enlighten you, and nearly always save you a lot of wasted effort in optimising something that doesn't matter.

Assuming that you do need it, then you can get performance comparable to C in Java, but it takes some effort. You need to know where the JVM is doing "extra work" and avoid these.

In particular:

  • Avoid unnecessary object creation. While the JVM heap and GC are extremely fast and efficient (probably the best in the world, and almost certainly better than anything you could roll yourself in C), it is still heap allocation, and that will be beaten by avoiding the heap in the first place (stack or register allocation).
  • Avoid boxed primitives. You want to be using double and not Double.
  • Use primitive arrays for any big chunks of data. Java primitive arrays are basically as fast as C/C++ arrays (they do have an additional bounds check, but that is usually insignificant).
  • Avoid anything synchronized - Java threading is pretty decent, but it is still overhead that you may not need. Give each thread its own data to work on.
  • Exploit concurrency - Java's concurrency support is very good. You might as well use all your cores! This is a big topic, but there are plenty of good books / tutorials available.
  • Use specialised collection classes for certain types of data if you have some very specific requirements, e.g. supporting some specialised sorting/search algorithms. You may need to roll your own, but there are also some good libraries with high-performance collection classes available that may fit your needs - see e.g. Javolution.
  • Avoid big class hierarchies - this is a design smell in performance code. Every layer of abstraction is costing you overhead. Very fast Java code will often end up looking rather like C....
  • Use static methods - the JIT can optimise these extremely well. It will usually inline them.
  • Use final concrete classes - again, the JIT can optimise these very well by avoiding virtual function calls.
  • Generate your own bytecode - if all else fails, this can be a viable option if you want the absolute maximum performance out of the JVM. Particularly useful if you need to compile your own DSL. Use something like ASM.
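
To illustrate the points about boxed primitives and primitive arrays, here is one possible sketch. It assumes each coordinate is a non-negative chromosome index plus a non-negative position (the question does not say how its data is laid out) and packs both into a single long, so the whole data set can be sorted with Arrays.sort on a long[] without per-record objects or a comparator. The class name PackedCoordinates is hypothetical.

```java
import java.util.Arrays;

// Hypothetical packing scheme: chromosome in the high 32 bits, position in the low 32 bits.
final class PackedCoordinates {
    // Both arguments are assumed to be non-negative, so the natural order of the
    // packed long matches (chromosome, position) order.
    static long pack(int chromosome, int position) {
        return ((long) chromosome << 32) | (position & 0xFFFFFFFFL);
    }

    static int chromosome(long packed) {
        return (int) (packed >>> 32);
    }

    static int position(long packed) {
        return (int) packed;
    }

    // Builds the packed keys and sorts them in place as a primitive array.
    static long[] sortedPacked(int[] chromosomes, int[] positions) {
        long[] keys = new long[chromosomes.length];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = pack(chromosomes[i], positions[i]);
        }
        Arrays.sort(keys);
        return keys;
    }
}
```

Whether this layout fits depends on how much extra data travels with each coordinate; if records carry more fields, the packed key could be paired with an index into parallel arrays instead.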

丢了幸福的猪 2024-07-30 12:18:22

If your algorithm is CPU-heavy, you may want to consider taking advantage of parallelisation. You may be able to sort in multiple threads and merge the results back later.

This is however not a decision to be taken lightly, as writing concurrent code is hard.
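
A minimal sketch of the sort-in-threads-then-merge idea, assuming the data is an int[] and two worker threads are enough; the class name TwoWayParallelSort is made up. Each thread gets its own copy of one half, so no locking is needed, and the halves are merged once both tasks have finished.

```java
import java.util.Arrays;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical two-thread sort: each half is sorted independently, then merged.
final class TwoWayParallelSort {
    static int[] sort(int[] input) {
        int mid = input.length / 2;
        final int[] left = Arrays.copyOfRange(input, 0, mid);
        final int[] right = Arrays.copyOfRange(input, mid, input.length);

        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<?> leftDone = pool.submit(new Runnable() {
                public void run() { Arrays.sort(left); }
            });
            Future<?> rightDone = pool.submit(new Runnable() {
                public void run() { Arrays.sort(right); }
            });
            // get() blocks until each half is sorted and makes the results visible here.
            leftDone.get();
            rightDone.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
        return merge(left, right);
    }

    // Standard merge of two sorted arrays into a new sorted array.
    static int[] merge(int[] a, int[] b) {
        int[] out = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length) {
            out[k++] = a[i] <= b[j] ? a[i++] : b[j++];
        }
        while (i < a.length) {
            out[k++] = a[i++];
        }
        while (j < b.length) {
            out[k++] = b[j++];
        }
        return out;
    }
}
```

For small inputs the thread start-up and copying costs can outweigh the gain, so it is worth measuring against the single-threaded sort before adopting something like this.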

像你 2024-07-30 12:18:22

Can't you use the sort functions that are included in the Java library?

You could at least look at the speed difference between the two sorting functions.

梦里南柯 2024-07-30 12:18:22

Methodologically, you have to profile the application to get an idea of which components of your program are time- and memory-intensive, then take a closer look at those components in order to improve their performance (see Amdahl's law).

From a purely technological point of view, you can use a Java-to-native-code compiler such as Excelsior JET, but I have to note that recent JVMs are really fast, so the VM should not have a significant impact.

深爱不及久伴 2024-07-30 12:18:22

Is your sorting code executing only once, e.g. in a commandline utility that just sorts, or multiple times, e.g. a webapp that sorts in response to some user input?

Chances are that performance would increase significantly after the code has been executed a few times because the HotSpot VM may optimize aggressively if it decides your code is a hotspot.

This is a big advantage compared to C/C++.

The VM, at runtime, optimizes code that is used often, and it does that quite well. Performance can actually rise beyond that of C/C++ because of this. Really. ;)

Your custom Comparator could be a place for optimization, though.

Try to check inexpensive stuff first (e.g. int comparison) before more expensive stuff (e.g. String comparison). I'm not sure if those tips apply because I don't know your Comparator.

Use either Collections.sort(list, comparator) or Arrays.sort(array, comparator). The array variant will be a bit faster, see the respective documentation.
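
As a sketch of the cheap-checks-first idea: the Feature class and its fields below are hypothetical, and the example assumes the desired order really is "by start position, then by name". Under that assumption the String comparison only runs when the int fields tie.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical element type with one cheap key (int) and one expensive key (String).
final class Feature {
    final int start;
    final String name;

    Feature(int start, String name) {
        this.start = start;
        this.name = name;
    }
}

// Compares the int key first and falls back to the String key only on a tie.
final class CheapFirstComparator implements Comparator<Feature> {
    public int compare(Feature a, Feature b) {
        if (a.start != b.start) {
            return a.start < b.start ? -1 : 1;
        }
        return a.name.compareTo(b.name);
    }

    public static void main(String[] args) {
        Feature[] features = {
            new Feature(20, "b"), new Feature(10, "z"), new Feature(20, "a")
        };
        Arrays.sort(features, new CheapFirstComparator());
        for (Feature f : features) {
            System.out.println(f.start + " " + f.name);
        }
    }
}
```

If the required order puts the String key first, this reordering is not valid; the trick only applies when the cheaper field is the primary key or when it can settle the comparison early.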

As Andreas said before: don't try to outsmart the VM.

你曾走过我的故事 2024-07-30 12:18:22

Perhaps there are other routes to performance enhancement besides micro-optimization of code. How about a different algorithm to achieve what you want your program to do? Maybe a different data structure?

Or trade some disk/ram space for speed, or if you can give up some time upfront during the loading of your program, you can precompute lookup tables instead of doing calculations - that way, the processing is fast. I.e., make some trade-offs of other resources available.
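
A small illustration of trading memory and start-up time for speed with a precomputed lookup table. The example is hypothetical and not taken from the question: it assumes sequence data stored as ASCII bytes and builds a 256-entry table once, so that complementing a base becomes a single array read instead of a chain of comparisons.

```java
// Hypothetical lookup table: maps each ASCII base to its complement, 'N' for anything else.
final class ComplementTable {
    private static final byte[] TABLE = new byte[256];

    static {
        // Built once at class-load time; this is the up-front cost being traded for speed.
        for (int i = 0; i < TABLE.length; i++) {
            TABLE[i] = (byte) 'N';
        }
        TABLE['A'] = 'T';
        TABLE['T'] = 'A';
        TABLE['C'] = 'G';
        TABLE['G'] = 'C';
    }

    // One array read per base; the mask keeps the index in 0..255.
    static byte complement(byte base) {
        return TABLE[base & 0xFF];
    }

    // Writes the complement of each base into the mirrored position.
    static byte[] reverseComplement(byte[] sequence) {
        byte[] out = new byte[sequence.length];
        for (int i = 0; i < sequence.length; i++) {
            out[sequence.length - 1 - i] = complement(sequence[i]);
        }
        return out;
    }
}
```

The same pattern applies to any pure function over a small input domain; whether it actually beats the direct computation is something to confirm by measurement.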

软的没边 2024-07-30 12:18:22

Here's what I would do, in any language. If samples show that your sort-comparison routine is active a large percentage of the time, you might find a way to simplify it. But maybe the time is going elsewhere. Diagnose first, to see what's broken, before you fix anything. Chances are, if you fix the biggest thing, then something else will be the biggest thing, and so on, until you've really gotten a pretty good speedup.

花期渐远 2024-07-30 12:18:22

Profile and tune your Java program and host machine. Most code follows the 80/20 rule: 20% of the code accounts for 80% of the run time, so find that 20% and make it as fast as possible. For example, the article Tuning Java Servers (http://www.infoq.com/articles/Tuning-Java-Servers) describes how to drill down from the command line and then isolate the problem using tools like Java Flight Recorder, Eclipse Memory Analyzer, and JProfiler.
