Tokyo Cabinet - inserts slow down after reaching 1 million records

Posted 2024-07-14 01:35:36 · 277 words · 12 views · 0 comments

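I am evaluating the Tokyo Cabinet table engine. Insert speed drops noticeably once the table reaches 1 million records. The batch size is 100,000 and each batch is done inside a transaction. I tried setting xmsiz, but it did not help. Has anyone run into this problem with Tokyo Cabinet?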

Details

Tokyo Cabinet - 1.4.3
Perl bindings - 1.23
OS: Ubuntu 7.10 (VMware Player on top of Windows XP)

卸妝后依然美 2024-07-21 01:35:36

I hit a brick wall around 1 million records per shard as well (sharding on the client side, nothing fancy). I tried various ttserver options and they seemed to make no difference, so I looked at the kernel side and found that

echo 80 > /proc/sys/vm/dirty_ratio

(previous value was 10) gave a big improvement. The following is the total size of the data (across 8 shards, each on its own node), printed every minute:

total:  14238792  records,  27.5881 GB size
total:  14263546  records,  27.6415 GB size
total:  14288997  records,  27.6824 GB size
total:  14309739  records,  27.7144 GB size
total:  14323563  records,  27.7438 GB size
(here I changed the dirty_ratio setting for all shards)
total:  14394007  records,  27.8996 GB size
total:  14486489  records,  28.0758 GB size
total:  14571409  records,  28.2898 GB size
total:  14663636  records,  28.4929 GB size
total:  14802109  records,  28.7366 GB size
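To make the change in the log above easier to read, here is a minimal Python sketch that turns the cumulative totals into per-minute deltas. The names `parse_totals` and `per_minute_deltas` are made up for illustration and are not part of Tokyo Cabinet or ttserver. The log lines are copied from the output above, and the one interval that spans the settings change is left out of both averages.

```python
import re

# Totals copied from the output above, split at the point where
# dirty_ratio was changed on all shards.
BEFORE = """\
total:  14238792  records,  27.5881 GB size
total:  14263546  records,  27.6415 GB size
total:  14288997  records,  27.6824 GB size
total:  14309739  records,  27.7144 GB size
total:  14323563  records,  27.7438 GB size
"""
AFTER = """\
total:  14394007  records,  27.8996 GB size
total:  14486489  records,  28.0758 GB size
total:  14571409  records,  28.2898 GB size
total:  14663636  records,  28.4929 GB size
total:  14802109  records,  28.7366 GB size
"""

LINE_RE = re.compile(r"total:\s+(\d+)\s+records,\s+([\d.]+)\s+GB size")


def parse_totals(text):
    """Return one (records, gigabytes) tuple per log line."""
    return [(int(m.group(1)), float(m.group(2))) for m in LINE_RE.finditer(text)]


def per_minute_deltas(totals):
    """Differences between consecutive samples, i.e. growth per minute."""
    return [(b[0] - a[0], b[1] - a[1]) for a, b in zip(totals, totals[1:])]


def mean(values):
    return sum(values) / len(values)


before = per_minute_deltas(parse_totals(BEFORE))
after = per_minute_deltas(parse_totals(AFTER))

rec_before = mean([d[0] for d in before])
rec_after = mean([d[0] for d in after])

print(f"records/min before: {rec_before:.0f}, after: {rec_after:.0f}")
print(f"speed-up vs. average before: {rec_after / rec_before:.1f}x")
print(f"speed-up vs. last minute before: {rec_after / before[-1][0]:.1f}x")
```

On these numbers the average is roughly 21,000 records per minute before the change and roughly 102,000 after. Measured against the last, slowest minute before the change (13,824 records), that is a bit over 7x.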

So you can see that the improvement was on the order of 7-8 times. Database size was around 4.5 GB per node at that point (including indexes) and the nodes have 8 GB of RAM, so a dirty_ratio of 10 meant that the kernel tried to keep less than about 800 MB of dirty pages.
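The 800 MB figure is just the percentage applied to RAM. Below is a small Python sketch of that arithmetic, assuming the threshold is taken against the full 8 GB. The kernel actually applies dirty_ratio to available memory (free plus reclaimable pages), so the real limit is somewhat lower. `dirty_limit_gb` is a hypothetical helper name.

```python
def dirty_limit_gb(ram_gb, dirty_ratio_percent):
    """Approximate ceiling on dirty page cache, treating all RAM as available."""
    return ram_gb * dirty_ratio_percent / 100


RAM_GB = 8.0  # per node, as stated above
DB_GB = 4.5   # per-node database size including indexes, as stated above

for ratio in (10, 80):
    limit = dirty_limit_gb(RAM_GB, ratio)
    verdict = "fits" if DB_GB <= limit else "does not fit"
    print(f"dirty_ratio={ratio}: ~{limit:.1f} GB dirty allowed; "
          f"a {DB_GB} GB database {verdict} under it")
```

With the ratio at 10 the limit (about 0.8 GB) is far below the 4.5 GB database. At 80 (about 6.4 GB) the whole database could in principle stay dirty in the page cache, which may be why the writes stopped stalling.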

The next thing I'll try is ext2 (currently ext3) with noatime, and also keeping everything on a ramdisk (that would probably use twice the memory, but might be worth it).
