在一个巨大的表上按 RAND() 排序会导致它在 MySQL 中崩溃。为什么?
美好的一天,朋友们。我在 32b 机器上运行 Fedora 13。
我有一个巨大的 1.5B ip 地址表(当我有资源时,它会被分割,不用担心;))。当我运行此查询时:
SELECT ip FROM ips ORDER BY RAND() LIMIT 500000;
有时表会崩溃(我收到错误“MySQL 表被标记为崩溃并应该修复”),有时则不会。我的问题是;哪些因素会导致 MyISAM 或 InnoDB 表崩溃?记忆力有影响吗?因为我的系统监视器显示它只使用了大约 30%。
这是 my.cnf (默认值,从元包安装):
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
另外;我听说 PostGreSQL“更健壮”并且不会像 MySQL 在这些情况下那样崩溃。这是真的还是妻子的故事?
Good day, friends. I'm running Fedora 13 on a 32b machine.
I have a huge table of 1.5B ip addresses (which will be split up when I have the resources, don't worry ;)). When I run this query:
SELECT ip FROM ips ORDER BY RAND() LIMIT 500000;
Sometimes the table crashes (I get the error "MySQL Table is Marked as Crashed and Should be Repaired"), sometimes it doesn't. My question is; what sorts of things cause MyISAM or InnoDB tables to crash? Does memory have an affect? Because my system monitor shows it only uses about 30%.
Here is my.cnf (the default, installing from meta-package):
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
Also; I hear PostGreSQL is 'more robust' and doesn't crash like MySQL does in these situations. Is this true or is it a wives tale?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
对
1500M
IP 地址进行排序并不是最好的主意。如果您想要
500K
个随机 IP 地址,请使用以下方法:不过,它也不会太高效,因为它需要两次遍历表格。
您可以粗略估计
COUNT(*)
而不是子查询。这将加快查询速度,但代价是有可能会丢失一些地址或获得的结果少于所需的结果。Sorting
1500M
of IP addresses is not the best idea.If you want
500K
random IP addresses, use this approach:It won't be too efficient as well, though, since it needs two passes over the table.
You may put a rough estimate of the
COUNT(*)
instead of the subquery. This will speed the query up, at cost of a slight opportunity of missing some addresses or getting fewer results than needed.order by rand() 将导致使用文件排序的临时表,我怀疑您的表崩溃是由于内存限制。您可以尝试调整conf文件以允许mysql将更多内存用于临时表和排序缓冲区。
order by rand() will cause a temporary table with filesort, I suspect your table crash is due to a memory limitation. You can try adjusting your conf file to allow mysql to utilize more memory for temporary tables and sort buffers.