How many threads should I run in Java?
I had this brilliant idea to speed up the time needed for generating 36 files: use 36 threads!! Unfortunately, if I start one connection (one j2ssh connection object) with 36 threads/sessions, everything lags far more than if I execute the threads one at a time.
Now if I try to create 36 new connections (36 j2ssh connection objects), so that each thread has a separate connection to the server, I get an out-of-memory exception (somehow the program still runs and successfully finishes its work, but more slowly than when I execute one thread after another).
So what should I do? How do I find the optimal number of threads to use? And why is Thread.activeCount() already 3 before I start my 36 threads?! I'm using a Lenovo laptop with an Intel Core i5.
Comments (9)
You could narrow it down to a more reasonable number of threads with an ExecutorService. You probably want to use something near the number of processor cores available.
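A minimal sketch of that suggestion, assuming the pool is sized with `Runtime.getRuntime().availableProcessors()`. The `generateFile` method is a hypothetical stand-in for the real per-file work over SSH and is not taken from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FixedPoolSketch {

    // Hypothetical placeholder for the real job (generating one file over SSH).
    static String generateFile(int index) {
        return "file-" + index;
    }

    // Runs jobCount jobs on a pool sized to the available cores and
    // returns the results in submission order.
    static List<String> runJobs(int jobCount) throws Exception {
        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> jobs = new ArrayList<>();
            for (int i = 0; i < jobCount; i++) {
                final int index = i;
                jobs.add(() -> generateFile(index));
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every submitted job has completed.
            for (Future<String> future : pool.invokeAll(jobs)) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown(); // let the worker threads exit
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runJobs(36).size() + " files generated");
    }
}
```

All 36 jobs are still submitted, but only as many run at once as the pool has threads; the rest wait in the executor's queue.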
A good practice would be to spawn as many threads as there are cores in your processor. I normally use an Executors.newFixedThreadPool(numOfCores) executor service and keep feeding it the jobs from my job queue, simple. :-)
Your Intel i5 has two cores; hyperthreading makes them look like four. So you only get four cores' worth of parallelization; the rest of your threads are time sliced.
Assume 1MB RAM per thread just for thread creation, then add the memory that each thread requires to process the file. That will give you an idea about why you're getting out of memory errors. How big are the files you're dealing with? You'll have a problem if they're too large to hold in memory all at the same time.
I'll assume that the server receiving the files can accept multiple connections, so there's value in trying this.
I'd benchmark with 1 thread and then increase them until I found that the performance curve was flattening out.
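One way to run that kind of benchmark is sketched below. It is only an illustration: `simulatedJob` is a hypothetical workload that sleeps to imitate waiting on the network, so the numbers it prints say nothing about the real j2ssh jobs. Swap in the actual per-file work to get a meaningful curve.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadCountBenchmark {

    // Hypothetical stand-in workload: sleeps to imitate a blocking transfer.
    static int simulatedJob(int index) throws InterruptedException {
        Thread.sleep(20);
        return index;
    }

    // Runs jobCount jobs on a pool of the given size and returns elapsed milliseconds.
    static long timeRun(int threads, int jobCount) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> jobs = new ArrayList<>();
            for (int i = 0; i < jobCount; i++) {
                final int index = i;
                jobs.add(() -> simulatedJob(index));
            }
            long start = System.nanoTime();
            for (Future<Integer> future : pool.invokeAll(jobs)) {
                future.get(); // rethrows if a job failed
            }
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Increase the thread count step by step and watch where the gains stop.
        for (int threads : new int[] {1, 2, 4, 6, 12, 36}) {
            System.out.println(threads + " threads: " + timeRun(threads, 36) + " ms");
        }
    }
}
```

Because the placeholder job only sleeps, it keeps getting faster with more threads. Real work that competes for CPU, memory, or a single SSH connection should flatten out much earlier, and that flattening point is the number to look for.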
Using more threads than the number of cores on your machine is only going to slow down the whole process. It will speed up until you reach that number.
Brute force: profile incrementally. Increase the number of threads gradually and check the performance. As the number of connections is just 36, it should be easy.
You need to understand that if you create 36 threads you still have one or two processors, and they would be switching between threads most of the time.
I would say you increase the number of threads a little bit, let's say to 6, and see the behavior. And then go from there.
One way to tune the number of threads to the size of the machine is to use
First you have to find out where the bottleneck is.
If it is the SSH connection, it usually does not help to open multiple connections in parallel. Better to use multiple channels on one connection, if needed.
If it is the disk IO, creating multiple threads for writing (or reading) only helps if they are accessing different disks (which is seldom the case). But you could have another thread doing CPU-bound things while you are waiting on your disk IO in one thread.
If it is the CPU, and you have enough idle cores, more threads can help, even more so if they don't need to access common data. But still, more threads than cores (+ some threads doing IO) does not help. (Also keep in mind that there are usually other processes on your server, too.)
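A rough way to find out which of these cases applies is to time each phase of the work for a few files in a single thread. The sketch below is an assumption-laden illustration: the phase names and the `Thread.sleep` calls are hypothetical placeholders for the real transfer, processing, and disk-write steps.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BottleneckTimer {

    // A unit of work that may throw, e.g. an SSH transfer or a disk write.
    interface Phase {
        void run() throws Exception;
    }

    // Accumulated nanoseconds per phase name (not thread-safe; use from one thread).
    private final Map<String, Long> totals = new LinkedHashMap<>();

    // Runs the phase and adds its elapsed time to the total for that name.
    void time(String name, Phase phase) throws Exception {
        long start = System.nanoTime();
        try {
            phase.run();
        } finally {
            totals.merge(name, System.nanoTime() - start, Long::sum);
        }
    }

    // Returns the phase with the largest accumulated time, or null if nothing was timed.
    String slowest() {
        String slowestName = null;
        long slowestNanos = -1;
        for (Map.Entry<String, Long> entry : totals.entrySet()) {
            if (entry.getValue() > slowestNanos) {
                slowestName = entry.getKey();
                slowestNanos = entry.getValue();
            }
        }
        return slowestName;
    }

    public static void main(String[] args) throws Exception {
        BottleneckTimer timer = new BottleneckTimer();
        for (int file = 0; file < 3; file++) {
            // Placeholders: replace each sleep with the real step for one file.
            timer.time("ssh-transfer", () -> Thread.sleep(30));
            timer.time("cpu-processing", () -> Thread.sleep(5));
            timer.time("disk-write", () -> Thread.sleep(10));
        }
        System.out.println("Slowest phase: " + timer.slowest());
    }
}
```

Whichever phase dominates tells you which of the three paragraphs above is the relevant one before you start adding threads.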
Be sure you don't create more threads than you have processing units, or you are likely to create more overhead with context switching than you gain in concurrency. Also remember that you only have one HDD and one HDD controller; as a result, I doubt multithreading is going to help you at all here.