Starting processes simultaneously is slower than staggering them; why?
I'm evaluating the performance of an experimental system setup on an 8-core machine with 16GB RAM. I have two main-memory Java RDBMSs (hsqldb) running, and against each of these I run a TPCC client (derived from jTPCC/BenchmarkSQL).
I have scripts to launch things, so e.g. the hsqldb instances are started with:
./hsqld.bash 0 &
./hsqld.bash 1 &
If I start the clients at nearly the same time:
./hsql-tpcc.bash 0 &
./hsql-tpcc.bash 1 &
then each of those clients spikes to an initial rate of around 500-1000 tpmC (tpmC is essentially transactions per minute), then quickly (in less than a second) settles to a rate of around 200-250 tpmC. OTOH, if I wait for a second or two before starting the second client:
./hsql-tpcc.bash 0 &
sleep 1
./hsql-tpcc.bash 1 &
then each of the clients runs at 2500+ tpmC. Waiting for more than a second makes no further difference.
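If it helps to sweep that delay rather than hard-code the sleep, here's a minimal launcher sketch in Java; the script names are the ones above, while the class name and default delay are just placeholders:

// StaggerLauncher.java -- wraps the same two client scripts and makes the
// stagger between them a command-line parameter, to probe for the threshold.
public class StaggerLauncher {
    public static void main(String[] args) throws Exception {
        long delayMs = args.length > 0 ? Long.parseLong(args[0]) : 1000L;

        Process c0 = new ProcessBuilder("./hsql-tpcc.bash", "0")
                .inheritIO()   // pass the client's output through to this console
                .start();
        Thread.sleep(delayMs); // the stagger under test
        Process c1 = new ProcessBuilder("./hsql-tpcc.bash", "1")
                .inheritIO()
                .start();

        c0.waitFor();
        c1.waitFor();
    }
}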
This is strange because client 0 just talks to server 0 and client 1 just talks to server 1. It's unclear why there's such a dramatic performance interference.
I thought this might be due to CPU scheduler affinity of the clients, but they take only about 1-3% of a single core when running slowly (20-25% when running quickly). Another suspect was the clients' NUMA bindings (memory contention on the same memory node), but the machine apparently has just one memory node (there's only /sys/devices/system/node/node0), and furthermore each client takes just 0.8% of memory.
It also doesn't seem to be due to CPU bindings for the hsqldb instances, since both the fast and slow behaviors can be reproduced just by restarting the clients (waiting or not waiting a second) while leaving the same hsqldb instances running throughout (i.e. hsqldb doesn't have to be restarted). hsqldb takes 4-8% CPU when slow, 80% CPU when fast, and 4.3% of memory.
Any other ideas why this could be happening? There's no disk IO involved, and I'm not close to exhausting the system's memory. Thanks in advance. Other relevant info follows:
$ uname -a
Linux hammer.csail.mit.edu 2.6.27.35-170.2.94.fc10.x86_64 #1 SMP Thu Oct 1 14:41:38 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
1 Answer
How long had your "two main-memory Java RDBMSs (hsqldb)" been running before the test? If you start them right before the test, try warming them up a bit first. Let HotSpot do its thing, and get through all of the
if (first_time) { do_initialization(); }
code in the DBs so the garbage collector can settle down. Also, starting two things (no matter what they are) at the same time means that, minimally, both are trying to do all of their init work at the same time (allocating memory, allocating pages in swap, finding and loading libraries, etc.). So both programs spend the first milliseconds of their lives in I/O contention.
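For example, a minimal warm-up sketch, where runTransaction() is a hypothetical stand-in for one TPCC transaction against the database; the warm-up results are thrown away and only the measured phase is counted:

// WarmupSketch.java -- run a fixed number of throwaway transactions before
// the measured phase so HotSpot can JIT the hot paths and the GC can settle.
public class WarmupSketch {
    static final int WARMUP_TXNS = 5000;   // arbitrary; tune for your setup
    static final long MEASURE_MS = 60000;  // measure for one minute

    // Hypothetical stand-in for one TPCC transaction against hsqldb;
    // the real JDBC work would go here.
    static void runTransaction() {
        // ... execute one New-Order (or mixed) transaction ...
    }

    public static void main(String[] args) {
        for (int i = 0; i < WARMUP_TXNS; i++) {
            runTransaction();               // warm-up: results discarded
        }
        long start = System.currentTimeMillis();
        long done = 0;
        while (System.currentTimeMillis() - start < MEASURE_MS) {
            runTransaction();
            done++;
        }
        System.out.println("tpmC ~ " + done * 60000.0 / MEASURE_MS);
    }
}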