Multi-core CPU utilization by a Java application
I have a program that sorts big files by splitting them into chunks, sorting the chunks, and merging them into a final sorted file. The application runs one thread for loading data from and saving data to the file; only that thread does I/O. Two more threads receive chunk data, sort it, and send the sorted data back to the I/O thread.
So in general there are 4 threads running: the main thread, the thread that loads/saves data, and two threads that sort data.
I expected that during execution I would see 1 sleeping thread (main) that takes no CPU time and 3 active threads that each use 1 CPU core.
When I run this program on a dual 6-core machine with hyper-threading (24 CPUs), I see that ALL 24 CPUs are loaded at 100%!
Initially I thought the sort algorithm was multithreaded, but after looking into the Java sources I found that it is not.
I'm using a simple Collections.sort(LinkedList) to sort the data...
Here are some details:
# java -version
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)

# uname -a
Linux 2.6.32-28-server #55-Ubuntu SMP Mon Jan 10 23:57:16 UTC 2011 x86_64 GNU/Linux
I was using nmon to monitor the processor load.
I would appreciate any explanation of this behavior and any advice on how to control the CPU load, as this particular task doesn't leave CPU time for other applications.
[UPDATE]
I used jvisualvm to count threads; it shows only the threads I know about. I also made a simple test program (see below) that runs only the main thread and got exactly the same result: all 24 processors are almost 100% busy during code execution.
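import java.util.LinkedList;
import java.util.Random;

public class Test {

    public void run() {
        Random r = new Random();
        // Roughly 5 million elements
        int len = r.nextInt(10) + 5000000;
        LinkedList<String> list = new LinkedList<String>();
        for (int i = 0; i < len; i++) {
            // Every iteration allocates new String objects
            list.add(new String("test" + r.nextInt(50000000)));
        }
        System.out.println("Inserted " + list.size() + " items");
        list.clear();
    }

    public static void main(String[] argv) {
        // Everything runs on the main thread; no other threads are started here
        Test t = new Test();
        t.run();
        System.out.println("Done");
    }
}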
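As a cross-check on the thread count reported by jvisualvm, the live Java threads can also be listed from inside the JVM. The following is only a minimal sketch: the class name ThreadLister and the method countByState are made up for illustration. It relies on Thread.getAllStackTraces(), which reports Java-level threads; as far as I know, threads that the VM creates internally (for example collector worker threads) are not Java threads and would not show up in this list.

```java
import java.util.Map;
import java.util.TreeMap;

public class ThreadLister {

    /** Counts the live Java threads visible to this JVM, grouped by thread state. */
    public static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new TreeMap<Thread.State, Integer>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            Thread.State state = t.getState();
            Integer old = counts.get(state);
            counts.put(state, old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Number of logical CPUs the JVM sees (24 on the machine described above)
        System.out.println("Available processors: "
                + Runtime.getRuntime().availableProcessors());
        for (Map.Entry<Thread.State, Integer> e : countByState().entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

If the counts here match what jvisualvm shows while nmon still reports all CPUs busy, the extra load is presumably coming from something other than application-level Java threads.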
[UPDATE]
Here is the screenshot I made while running the program above (using nmon):
http://imageshack.us/photo/my-images/716/cpuload.png/
Comments (1)
I would suggest that this is an nmon question rather than a Java question. To solve it, I would take a peek at the top command, which provides info about CPU usage per process. I predict the following result: you will see one Java thread using nearly 100% CPU time (which is OK, as the per-process percentage in top is relative to one (virtual) core), and maybe a second and third Java thread with much less CPU usage (the I/O threads). Depending on the choice of GC, you might even spot one or more GC threads, though far fewer than 20. HotSpot, however, will not (and, to my knowledge, cannot) parallelize a sequential task on its own.
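To complement what top shows from the outside, the same comparison can be roughly approximated from inside the JVM using the standard java.lang.management beans. The sketch below is an illustration under assumptions, not a definitive diagnosis: the class name CpuUsageProbe, its method names, and the element count of 1,000,000 are all made up here, and the workload is only modelled on the Test class from the question. It prints the wall-clock time, the CPU time consumed by the main thread alone, and the collection time reported by the garbage collectors. Note that getCollectionTime() is an approximate elapsed time, not CPU time summed over collector threads.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.LinkedList;
import java.util.Random;

public class CpuUsageProbe {

    /** Sum of the approximate elapsed collection times (ms) reported by all collectors. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** CPU time (ns) used by the calling thread, or -1 if the JVM cannot measure it. */
    public static long currentThreadCpuNanos() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return threads.isCurrentThreadCpuTimeSupported()
                ? threads.getCurrentThreadCpuTime()
                : -1;
    }

    /** Allocation-heavy loop modelled on the question's Test class; returns the list size. */
    public static int allocate(int count) {
        Random r = new Random();
        LinkedList<String> list = new LinkedList<String>();
        for (int i = 0; i < count; i++) {
            list.add("test" + r.nextInt(50000000));
        }
        return list.size();
    }

    public static void main(String[] args) {
        long wallStart = System.nanoTime();
        long cpuStart = currentThreadCpuNanos();
        long gcStart = totalGcMillis();

        int inserted = allocate(1000000); // arbitrary size chosen for this sketch

        long wallMillis = (System.nanoTime() - wallStart) / 1000000L;
        long gcMillis = totalGcMillis() - gcStart;
        System.out.println("Inserted " + inserted + " items");
        System.out.println("Wall-clock time (ms): " + wallMillis);
        if (cpuStart >= 0) {
            long cpuMillis = (currentThreadCpuNanos() - cpuStart) / 1000000L;
            System.out.println("Main thread CPU time (ms): " + cpuMillis);
        } else {
            System.out.println("Main thread CPU time: not available on this JVM");
        }
        System.out.println("Reported GC time (ms): " + gcMillis);
    }
}
```

If the main thread's CPU time is close to the wall-clock time while top reports far more than 100% for the whole java process, the difference is being consumed by threads other than main, and the reported GC time gives a first hint as to whether the collector is involved.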