在一个巨大的表上按 RAND() 排序会导致它在 MySQL 中崩溃。为什么?

发布于 2024-10-01 12:53:31 字数 662 浏览 6 评论 0原文

美好的一天,朋友们。我在 32b 机器上运行 Fedora 13。

我有一个巨大的 1.5B ip 地址表(当我有资源时,它会被分割,不用担心;))。当我运行此查询时:

SELECT ip FROM ips ORDER BY RAND() LIMIT 500000;

有时表会崩溃(我收到错误“MySQL 表被标记为崩溃并应该修复”),有时则不会。我的问题是;哪些因素会导致 MyISAM 或 InnoDB 表崩溃?记忆力有影响吗?因为我的系统监视器显示它只使用了大约 30%。

这是 my.cnf (默认值,从元包安装):

[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

另外;我听说 PostGreSQL“更健壮”并且不会像 MySQL 在这些情况下那样崩溃。这是真的还是妻子的故事?

Good day, friends. I'm running Fedora 13 on a 32b machine.

I have a huge table of 1.5B ip addresses (which will be split up when I have the resources, don't worry ;)). When I run this query:

SELECT ip FROM ips ORDER BY RAND() LIMIT 500000;

Sometimes the table crashes (I get the error "MySQL Table is Marked as Crashed and Should be Repaired"), sometimes it doesn't. My question is; what sorts of things cause MyISAM or InnoDB tables to crash? Does memory have an affect? Because my system monitor shows it only uses about 30%.

Here is my.cnf (the default, installing from meta-package):

[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

Also; I hear PostGreSQL is 'more robust' and doesn't crash like MySQL does in these situations. Is this true or is it a wives tale?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

[浮城] 2024-10-08 12:53:31

1500M IP 地址进行排序并不是最好的主意。

如果您想要 500K 个随机 IP 地址,请使用以下方法:

不过,它也不会太高效,因为它需要两次遍历表格。

您可以粗略估计 COUNT(*) 而不是子查询。这将加快查询速度,但代价是有可能会丢失一些地址或获得的结果少于所需的结果。

Sorting 1500M of IP addresses is not the best idea.

If you want 500K random IP addresses, use this approach:

It won't be too efficient as well, though, since it needs two passes over the table.

You may put a rough estimate of the COUNT(*) instead of the subquery. This will speed the query up, at cost of a slight opportunity of missing some addresses or getting fewer results than needed.

水中月 2024-10-08 12:53:31

order by rand() 将导致使用文件排序的临时表,我怀疑您的表崩溃是由于内存限制。您可以尝试调整conf文件以允许mysql将更多内存用于临时表和排序缓冲区。

order by rand() will cause a temporary table with filesort, I suspect your table crash is due to a memory limitation. You can try adjusting your conf file to allow mysql to utilize more memory for temporary tables and sort buffers.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文