What causes spikes in JDBC calls to Oracle from within Websphere?

Posted 2024-12-09 05:09:58

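I was wondering if anyone could shed some light on the following problem:

We are seeing spikes in JDBC calls to Oracle 64-bit 10.2.0.5.0 from within a Spring 2.5.6 based web service running on Websphere 6.1 on AIX. The JDBC driver version is 10.2.0.3.0.

We use a single thread to hit the database. The average response time of the web service is 16 ms, but we are seeing 11 spikes of around 1 second or more (out of around 11,000 calls in 5 minutes). Introscope tells us that around half of those spikes are due to "select 1 from dual", which the Websphere connection pool uses to validate the connection.

On the database side, we traced the sessions created by the Websphere connection pool, and none of the traces indicate any spikes inside the database.

Any ideas/suggestions as to what could be causing these spikes?

EDIT:

Our connection pool is set up with 20 connections, and monitoring shows that only 1 connection is in use.

EDIT2:

We have upgraded the Oracle JDBC driver to 10.2.0.5, with no difference.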

泪冰清 2024-12-16 05:09:58

Perhaps the connection pool is improperly sized.

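11,000 calls in 5 minutes (300 seconds) works out to about 37 calls per second. At an average of 0.016 seconds per call, a single connection can serve roughly 62 calls per second (about 18,750 calls in 300 seconds), so a pool of 4-5 connections should handle the traffic comfortably. I wonder whether, when one of the queries runs a little long, requests end up waiting for a connection to become available.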
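The sizing arithmetic above can be checked with a few lines of Python. This is only a rough sketch: the function name `pool_load` is made up for illustration, and it assumes each call holds a connection for the full 16 ms average, which overstates the load if part of that time is spent outside the database.

```python
def pool_load(calls: int, window_s: float, avg_call_s: float) -> dict:
    """Estimate arrival rate, per-connection capacity, and average busy connections."""
    rate = calls / window_s        # calls arriving per second
    per_conn = 1.0 / avg_call_s    # calls one connection can serve per second
    busy = rate * avg_call_s       # average connections in use (Little's law)
    return {"rate": rate, "per_conn": per_conn, "busy": busy}


# Figures taken from the question: 11,000 calls in 300 s at 16 ms each.
load = pool_load(calls=11_000, window_s=300, avg_call_s=0.016)
print(f"{load['rate']:.1f} calls/s arriving, "
      f"{load['per_conn']:.1f} calls/s per connection, "
      f"{load['busy']:.2f} connections busy on average")
# prints: 36.7 calls/s arriving, 62.5 calls/s per connection, 0.59 connections busy on average
```

Under that assumption, well under one connection is busy on average, so any queueing would have to come from occasional long calls rather than steady load.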

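The pool runs the "SELECT 1 FROM DUAL" query to check that a connection is valid and usable.

You could try increasing the pool size, or look at the other parameters that control what the pool does to a connection to make sure it is alive.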

小姐丶请自重 2024-12-16 05:09:58

The answer to this problem turned out to have nothing to do with WebSphere or Oracle: it was a good old-fashioned network configuration problem that caused TCP retransmission timeouts between the WebSphere server and the Oracle RAC cluster.

To reach that diagnosis, I compared the output of netstat -p tcp before and after a test run and found that the "retransmit timeouts" counter was increasing. The retransmission timeout settings can be viewed with:

$ no -a
...
                 rto_high = 64
               rto_length = 13
                rto_limit = 7
                  rto_low = 1

This indicates that retransmission timeouts range from 1 to 64 seconds and back off progressively, which explains why we saw spikes of 1 second, 2 seconds, 4 seconds, 10 seconds and 22 seconds but nothing between those values (i.e. no 6-second spike).
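One way to spot this signature in your own measurements is to bucket the slow calls by duration and see whether they pile up at a few discrete values. The sketch below is illustrative only: the function name `spike_clusters`, the 0.9-second threshold, and the sample latencies are all made up, not taken from the actual test run.

```python
from collections import Counter


def spike_clusters(latencies_s, threshold_s=0.9):
    """Count slow calls by their duration rounded to the nearest whole second."""
    return Counter(round(t) for t in latencies_s if t >= threshold_s)


# Made-up sample latencies in seconds, for illustration only.
sample = [0.016, 0.015, 1.02, 0.017, 2.01, 1.01, 4.03, 0.016, 10.02, 22.01]

for seconds, count in sorted(spike_clusters(sample).items()):
    print(f"~{seconds}s: {count} spike(s)")
```

If the buckets are sparse and sit at a handful of fixed durations, a timer-driven cause such as retransmission is worth checking; delays caused by load would more likely be spread across many durations.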

Once the network config was fixed, the problem went away.
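The before-and-after comparison of netstat -p tcp described above could be automated along these lines. This is a minimal sketch that assumes each counter is printed as a number followed by a description on its own line; the function names and the sample text are hypothetical, not real captures.

```python
import re


def parse_counters(netstat_text):
    """Map each '<count> <description>' line to its integer count."""
    counters = {}
    for line in netstat_text.splitlines():
        match = re.match(r"\s*(\d+)\s+(.+?)\s*$", line)
        if match:
            # Drop a trailing "(... bytes)" detail so the key stays stable between runs.
            name = re.sub(r"\s*\(.*\)\s*$", "", match.group(2))
            counters[name] = int(match.group(1))
    return counters


def increased(before_text, after_text):
    """Return the counters that grew between two captures, with their deltas."""
    before = parse_counters(before_text)
    after = parse_counters(after_text)
    return {name: value - before.get(name, 0)
            for name, value in after.items()
            if value > before.get(name, 0)}


# Illustrative snippets in the assumed format, not real output.
before = "tcp:\n        120 retransmit timeouts\n        4 keepalive timeouts\n"
after = "tcp:\n        131 retransmit timeouts\n        4 keepalive timeouts\n"
print(increased(before, after))  # prints: {'retransmit timeouts': 11}
```

Capture the command output to a file before and after the test run and pass both texts to `increased`; a growing "retransmit timeouts" delta points at the network rather than the database.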
