Reasonable number of threads for a thread pool running web service requests
When creating a FixedThreadPool Executor object in Java, you need to pass an argument describing the number of threads that the Executor can run concurrently. I'm building a service class whose responsibility is to process a large collection of phone numbers. For each phone number I need to call a web service (that's my bottleneck) and then save the response in a hash map.
To make this bottleneck less harmful to the performance of my service, I've decided to create a Worker class that fetches unprocessed elements and processes them. The Worker class implements the Runnable interface, and I run the Workers using an Executor.
The number of Workers that can run at the same time depends on the size of the Executor's FixedThreadPool. What is a safe size for a thread pool? What can happen when I create a FixedThreadPool with some big number as an argument?
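For readers who want something concrete, the setup described in the question might look roughly like the sketch below. It is only an illustration under assumptions: the class name `PhoneNumberService`, the `process` method, and the `webService` function (a stand-in for the real blocking web service call) are all hypothetical and do not come from the question.

```java
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

final class PhoneNumberService {

    // A Worker keeps taking unprocessed numbers until the queue is empty.
    static final class Worker implements Runnable {
        private final Queue<String> pending;
        private final Map<String, String> results;
        private final Function<String, String> webService; // hypothetical blocking call

        Worker(Queue<String> pending, Map<String, String> results,
               Function<String, String> webService) {
            this.pending = pending;
            this.results = results;
            this.webService = webService;
        }

        @Override
        public void run() {
            String number;
            while ((number = pending.poll()) != null) {
                // ConcurrentHashMap rejects null values, so the call must not return null.
                results.put(number, webService.apply(number));
            }
        }
    }

    // Runs poolSize Workers on a fixed thread pool and waits for them to finish.
    static Map<String, String> process(List<String> numbers, int poolSize,
                                       Function<String, String> webService) {
        Queue<String> pending = new ConcurrentLinkedQueue<>(numbers);
        Map<String, String> results = new ConcurrentHashMap<>();
        ExecutorService executor = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < poolSize; i++) {
            executor.execute(new Worker(pending, results, webService));
        }
        executor.shutdown(); // accept no new tasks; the running Workers drain the queue
        try {
            if (!executor.awaitTermination(1, TimeUnit.HOURS)) {
                executor.shutdownNow();
            }
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return results;
    }
}
```

Here `poolSize` is exactly the argument the question asks about: it is the upper bound on how many web service calls are in flight at once.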
Something that could be considered is looking at
which gives some direction on how many threads would make sense for the system.
If each worker thread needs to make a web service call, then the number of threads in your pool should be strongly influenced by how many simultaneous requests your web service can handle. Any more threads than that will do nothing more than overwhelm the web service.
I have read somewhere that the optimal number of threads is the number of cores * 25. It seems that .NET uses this as the default for its ThreadPool. However, if you have a large number of web service calls, you may be better off using a single thread that checks a list of outstanding web service calls for responses. When a response has arrived, just process the entry and remove it from the list.
If each computation is equivalent to a call to a web service, then you should consider how much load you are putting on that service, and how many concurrent connections the service will tolerate or its owners will allow. Most publicly accessible services would expect only one such connection from any single user at a time. If possible, contact the service's owners to ask about their usage policies. The number of such connections will determine the number of threads you may use.
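The simplest way to respect such a limit is to make the pool no larger than the number of connections the service allows. When the pool is shared with other work, or is larger for other reasons, a semaphore can enforce the cap around the call itself. The following is a minimal sketch of that idea; `ThrottledService`, `maxConcurrentCalls`, and `webService` are hypothetical names, and the actual limit has to come from the service's owners.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Function;

final class ThrottledService {
    private final Semaphore permits;
    private final Function<String, String> webService; // hypothetical blocking call

    ThrottledService(int maxConcurrentCalls, Function<String, String> webService) {
        this.permits = new Semaphore(maxConcurrentCalls);
        this.webService = webService;
    }

    // Blocks until one of the permitted "connections" is free, then makes the call.
    String call(String phoneNumber) {
        permits.acquireUninterruptibly();
        try {
            return webService.apply(phoneNumber);
        } finally {
            permits.release(); // always return the permit, even if the call throws
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

With this wrapper, threads beyond the cap simply wait for a permit instead of opening further connections to the service.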
If you have dev access to the web service, consider creating a batch function to check multiple phone numbers in one call.
In newer versions of .NET there is a ThreadPool that can grow and shrink based on its own performance profile. Unfortunately, Java's version is either fixed or grows up to a limit based on the incoming work.
We once had similar concerns. Our solution was to let the customer adjust the pool size and tune the performance as they please.
Some network and data properties can be considered when sizing a pool for I/O operations: network bandwidth, message sizes, processing time and style of the web service, and the number of local cores.
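One way to leave the pool size to whoever runs the service, as suggested above, is to read it from configuration and allow it to be changed at runtime. The sketch below assumes a system property named `phoneservice.pool.size` and a default of 10; both are made up for illustration.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class ConfigurablePool {
    static final String SIZE_PROPERTY = "phoneservice.pool.size"; // hypothetical property name
    static final int DEFAULT_SIZE = 10;                           // arbitrary default

    // Reads the size from -Dphoneservice.pool.size=N, falling back to the default.
    static int configuredSize() {
        return Math.max(1, Integer.getInteger(SIZE_PROPERTY, DEFAULT_SIZE));
    }

    static ThreadPoolExecutor create() {
        int size = configuredSize();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                size, size, 30L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        pool.allowCoreThreadTimeOut(true); // idle threads are released after 30 seconds
        return pool;
    }

    // Changes the size of a live pool; the order of the two setters matters
    // because the core size may never exceed the maximum size.
    static void resize(ThreadPoolExecutor pool, int newSize) {
        if (newSize > pool.getMaximumPoolSize()) {
            pool.setMaximumPoolSize(newSize);
            pool.setCorePoolSize(newSize);
        } else {
            pool.setCorePoolSize(newSize);
            pool.setMaximumPoolSize(newSize);
        }
    }
}
```

Starting the JVM with `-Dphoneservice.pool.size=25` would then give a pool of 25 threads without recompiling.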
Let's assume that the web service is infinitely scalable and that nobody is going to care that you are spamming it with requests. Let's also assume that the web service responds in about 1 second, while the local processing time is 5 milliseconds.
Throughput is maximized when you have the same number of busy threads as processing cores.
Under these assumptions, you are not going to be able to maximize throughput on a multi-core processor with any sane thread pool size. To achieve the maximum number of transactions per second, you have to break away from the thread-per-connection model. Look at non-blocking I/O (NIO), mentioned previously, or a Java implementation of the Asynchronous Completion Token pattern (I/O completion in Windows).
Note that the stack memory reserved for every created thread is actually just reserved address space, not allocated or committed memory. As the stack tries to grow, exceptions are thrown, which results in stack memory being committed on demand. The consequence is that stack size is only really a concern for 32-bit memory managers. With 64-bit addressing, you have a huge address space even though you only back a small part of that space with physical memory. At least, this is how I understand Windows works; I'm not sure about the Unix world.
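To put a number on "no sane size of thread pool": a commonly cited sizing heuristic is threads = cores * (1 + wait time / compute time), which estimates how many blocking threads are needed to keep every core busy. The sketch below applies it to the 1 second / 5 millisecond assumptions above; the class and method names are made up, and the formula is a rule of thumb rather than something this answer states.

```java
final class PoolSizing {
    // Estimated number of blocking threads needed to keep all cores busy when each
    // task waits waitMillis on I/O and then computes for computeMillis.
    static int threadsToSaturate(int cores, double waitMillis, double computeMillis) {
        return (int) Math.ceil(cores * (1.0 + waitMillis / computeMillis));
    }

    public static void main(String[] args) {
        // 1000 ms of waiting per 5 ms of processing: 201 threads per core.
        System.out.println(threadsToSaturate(4, 1000, 5)); // prints 804
    }
}
```

Around 800 threads for a 4-core machine is the kind of figure that makes the thread-per-connection model unattractive here.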
Don't forget that each thread you create will also make demands on memory for its stack. So creating a pool of threads will impact the memory footprint of your process (note that some pools don't create the threads until they're actually required, so at startup you won't see any memory increase).
This stack size is configurable via -Xss (similar to -Xmx etc.). I believe the default is 512 KB per thread. At the moment I can't find any authoritative source to confirm that.
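Besides the JVM-wide -Xss option (for example `java -Xss256k ...`), a stack size can be requested per thread through the four-argument `Thread` constructor, which a pool can use via a `ThreadFactory`. The sketch below shows this together with a back-of-the-envelope estimate of reserved stack space; the class and method names are invented, and the JVM is free to treat the requested size as a hint and ignore it on some platforms.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

final class SmallStackThreads {
    // Creates threads that request the given stack size (a hint to the JVM).
    static ThreadFactory withStackSize(long stackSizeBytes) {
        AtomicInteger counter = new AtomicInteger();
        return runnable -> new Thread(
                null, runnable, "worker-" + counter.incrementAndGet(), stackSizeBytes);
    }

    static ExecutorService newPool(int threads, long stackSizeBytes) {
        return Executors.newFixedThreadPool(threads, withStackSize(stackSizeBytes));
    }

    // Upper bound on the stack space reserved by a fully populated pool.
    static long reservedStackBytes(int threads, long stackSizeBytes) {
        return threads * stackSizeBytes;
    }
}
```

For instance, 200 threads at 512 KB each reserve at most 100 MB of stack address space, which is the footprint this answer is warning about.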
I wonder if you'd be better off using NIO rather than threads, since your limiting factor will be web service server + network bottleneck, not client CPU.
Otherwise, at most you should not exceed the number of concurrent connections that your web service can support.
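As a rough illustration of the non-blocking alternative, the sketch below issues all requests asynchronously and collects the responses as they complete, without dedicating a thread to each call. It uses `java.net.http.HttpClient` (Java 11 or later) rather than raw `java.nio` channels, and the endpoint URL, class name, and method names are placeholders, not anything from the question.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class AsyncLookup {
    // Hypothetical endpoint; replace with the real web service URL.
    static final String ENDPOINT = "https://example.invalid/lookup?number=";

    // Starts one non-blocking GET and returns a future for the response body.
    static CompletableFuture<String> httpCall(HttpClient client, String number) {
        HttpRequest request = HttpRequest.newBuilder(
                URI.create(ENDPOINT + URLEncoder.encode(number, StandardCharsets.UTF_8)))
                .build();
        return client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
                .thenApply(HttpResponse::body);
    }

    // Fires every call, stores each response as it arrives, and waits for all of them.
    static Map<String, String> lookupAll(List<String> numbers,
                                         Function<String, CompletableFuture<String>> call) {
        Map<String, String> results = new ConcurrentHashMap<>();
        CompletableFuture<?>[] pending = numbers.stream()
                .map(n -> call.apply(n).thenAccept(body -> results.put(n, body)))
                .toArray(CompletableFuture[]::new);
        CompletableFuture.allOf(pending).join();
        return results;
    }
}
```

A caller would pass something like `n -> AsyncLookup.httpCall(client, n)` as the `call` argument. Note that this sketch puts no cap on the number of requests in flight, so the limit on concurrent connections mentioned above still has to be enforced separately.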
If you are doing heavy computation, say parallel array manipulation, then the rule of thumb is to have one thread per processor.
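For the CPU-bound case, that rule of thumb could be applied as in the sketch below, which sizes the pool with `Runtime.getRuntime().availableProcessors()` and splits an array into one chunk per thread. The `ParallelSum` class is an invented example, not something from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelSum {
    // Sums the array using one worker thread per available processor.
    static long sum(int[] data) {
        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = (data.length + threads - 1) / threads; // ceiling division
            List<Future<Long>> parts = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int from = Math.min(data.length, t * chunk);
                final int to = Math.min(data.length, from + chunk);
                parts.add(pool.submit(() -> {
                    long s = 0;
                    for (int i = from; i < to; i++) {
                        s += data[i];
                    }
                    return s;
                }));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // waits for each chunk to finish
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

This sizing only suits work that keeps the CPU busy; for the web service calls in the question, the threads spend most of their time waiting, so the other considerations in this thread apply instead.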