Tomcat timeouts under load testing

Posted 2024-12-15 19:09:22


I'm currently investigating issues on the following system:

  • 3.2 GHz 8-core machine, 24 GB ram
  • Debian 6.0.2
    • ulimit -n 4096
    • ulimit -Sn 4096
    • ulimit -Hn 65535
  • Tomcat 6.0.28
    • -Xmx20g
  • MySQL 5.0.51a (through hibernate and a few manual JDBC queries)
    • also plenty of room for caching

I'm remotely load-testing the most common requests to the server at 2000 requests per minute. The testing tool is the latest jMeter. The average response time is around 65 ms, min is 35 ms and max is 4000 ms (in rare cases, but it has its reasons).

As far as I can tell from htop, the system specs are sufficient for at least 3 times more requests per minute. (Avg. CPU: 25%, RAM: 5 of 22 GB.) The server itself is accessible the whole time. (I ping it constantly while running the test.)

Importantly, each request results in 3 additional requests to the local Tomcat, where the second finally fetches the required data and the last one is for statistics:
jMeter(1) -> RESTeasy-Service(2) -> ?-Service(2) -> Data-Service(2) -(new Thread)> Statistic-Service(2)

(1) is my jMeter test server and distant from (2), which is the tomcat server. Yes, the architecture might be a little weird, but that's not my fault. ^^

I switched the thread management to a pool in server.xml, raising max threads from the default 200 to 1000 and idle threads from 4 to 10. What I noticed is that the number of concurrent threads almost never decreases; instead it steadily rises, up to Tomcat's maximum, it seems. htop reports 160 threads while Tomcat is stopped and about 460 when it has freshly started. (The services seem to start a few...) After a few hours (sometimes less) of hitting the server with 2000 requests per minute, htop says there are 1400 tasks. This seems to be the point when I start to get timeouts in jMeter. As this is extremely time-consuming, I did not watch it a thousand times and therefore can't guarantee this is the cause, but that's pretty much what happens.
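For reference, the switch described above would look roughly like this in server.xml (a sketch: the maxThreads/minSpareThreads values mirror the ones mentioned, while maxIdleTime and the Connector attributes are assumed examples, not copied from the real file):

```xml
<!-- Shared executor: 1000 max threads, 10 spare (values from the text);
     maxIdleTime is in ms and an assumed example -->
<Executor name="tomcatThreadPool" namePrefix="catalina-exec-"
          maxThreads="1000" minSpareThreads="10" maxIdleTime="60000"/>

<!-- Connector wired to the executor (port and timeout are examples) -->
<Connector port="8080" protocol="HTTP/1.1"
           executor="tomcatThreadPool"
           connectionTimeout="20000"
           redirectPort="8443"/>
```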

Primary questions:

  1. Math tells me that the concurrently used thread count should never exceed about 600 (34 requests/s × 4 local requests × 4 seconds = 544; usually even less, but an estimate of 600 should be safe). As far as I understand the idea of thread pooling, unused threads should be released and stopped when idle for too long. Is there still a way I could end up with a thousand idling(?) threads? And is this ok?

  2. Could a thread started manually in one of the request processors prevent the Tomcat thread from being released?

  3. Shouldn't there be a log message telling me that Tomcat could not create/fetch a thread for a request?

  4. Any other ideas? I've been working on this for far too long, and by now Tomcat exhausting its thread pool seems the only plausible cause for these weird timeouts. But maybe somebody has another hint.
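The back-of-the-envelope math in question 1 is essentially Little's law (concurrency ≈ arrival rate × residence time); a quick sketch with the numbers from the question:

```java
public class ThreadEstimate {
    public static void main(String[] args) {
        double reqPerSec = 2000.0 / 60.0; // ~33.3 external requests per second
        int fanOut = 4;                   // each external request fans out into 4 local ones
        double worstCaseSeconds = 4.0;    // observed worst-case response time
        // Little's law: concurrent threads ~= arrival rate * residence time
        double concurrent = reqPerSec * fanOut * worstCaseSeconds;
        System.out.println(Math.round(concurrent)); // ~533, close to the 544 estimate
    }
}
```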
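Regarding question 2: a raw `new Thread` per request for the Statistic-Service call never returns threads to any pool and makes leaks hard to spot. A sketch of the usual alternative, a small bounded ExecutorService (class and method names here are illustrative, not taken from the actual services):

```java
import java.util.concurrent.*;

public class AsyncStats {
    // Bounded pool so fire-and-forget statistics calls can't pile up
    // without limit; sizes are assumed examples.
    private static final ExecutorService STATS_POOL =
        new ThreadPoolExecutor(2, 4, 30, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(100),
            new ThreadPoolExecutor.CallerRunsPolicy());

    static Future<?> recordStats(String event) {
        // submit() queues the task and returns immediately, so the
        // request-handling (Tomcat) thread is released right away.
        return STATS_POOL.submit(() -> {
            // placeholder for the HTTP call to Statistic-Service
            System.out.println("stat: " + event);
        });
    }

    public static void main(String[] args) throws Exception {
        Future<?> f = recordStats("request-served");
        f.get(5, TimeUnit.SECONDS); // wait only for this demo
        STATS_POOL.shutdown();
    }
}
```

The Tomcat worker returns as soon as the task is queued, so however long the statistics call takes, it occupies a pool thread rather than a request thread.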

Thanks in advance especially if you can finally save me from this...


不知在何时 2024-12-22 19:09:22


After hours and days of mind-blowing testing I found that the timeouts happen when Tomcat reaches its thread limit while we're in the middle of those 3 local connection openings. I guess once it reaches that limit, one thread waits for another to open, which will not happen while the previous ones don't close. In German I'd call that a Teufelskreis. ^^
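The "Teufelskreis" described above can be reproduced in miniature: a pool whose only thread blocks waiting on a task submitted to the same exhausted pool. A sketch (the one-thread pool stands in for a saturated Tomcat; names and timeouts are illustrative):

```java
import java.util.concurrent.*;

public class NestedDeadlockDemo {
    // An "outer request" occupies the only pool thread and blocks on an
    // "inner request" submitted to the same pool -- the self-call pattern.
    static String simulate() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        Future<String> outer = pool.submit(() -> {
            Future<String> inner = pool.submit(() -> "inner done");
            try {
                // No free thread will ever run the inner task, so this
                // wait can only time out.
                return inner.get(1, TimeUnit.SECONDS);
            } catch (TimeoutException e) {
                return "starved";
            }
        });
        String result = outer.get();
        pool.shutdownNow();
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(simulate()); // prints "starved"
    }
}
```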

Anyway, the solution was to raise max threads to a ridiculously high number:

<Executor name="tomcatThreadPool" namePrefix="catalina-exec-" maxThreads="10000" minSpareThreads="10"/>

I know that this should not be the way to go, but unfortunately we all know that our architecture is somewhat impractical and nobody has the time to change it.

Hope it helps somebody. =)

心清如水 2024-12-22 19:09:22


I guess this issue requires an understanding of the underlying HTTP/1.1 keep-alive connections.

If you are using it for a REST web service, you probably want to set the maxKeepAliveRequests parameter in your Connector configuration to 1.

    <Connector port="8080" protocol="HTTP/1.1" 
           connectionTimeout="20000"
           maxKeepAliveRequests="1" 
           redirectPort="8443" />

This setting can be found in your $CATALINA_HOME/conf/server.xml.
