Reliable backup for a large MySQL database?
I have a 200GB / 400M-row MySQL/InnoDB database - far beyond what's reasonable, as I've found out.
One surprising problem is restoring backups. mysqldump generates huge SQL files, and they take about a week to import back into a fresh database. Attempts to make it faster (bigger/smaller transactions, turning off keys during import, network compression, etc.) have failed so far; importing into MyISAM seems about 2x faster, but then there would be no transactions.
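For illustration, tweaks of that kind usually take a form something like the following (the database name and dump path are placeholders, not my actual setup):

# relax durability for the duration of the import (requires SUPER privilege)
mysql -e "SET GLOBAL innodb_flush_log_at_trx_commit = 2"
# skip constraint checks and binary logging while replaying the dump
( echo "SET unique_checks=0; SET foreign_key_checks=0; SET sql_log_bin=0;"; cat dump.sql ) | mysql mydb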
What's worse - and I hope to get some help with this - a network connection which transfers over 200GB over the course of a week has a non-trivial chance of breaking, and the SQL import process cannot be resumed in any trivial way.
What would be the best way of dealing with it? Right now, if I notice a broken connection, I manually try to figure out where it stopped by checking the highest primary key of the last imported table, and then use a Perl one-liner which basically does this:
perl -nle 'BEGIN{open F, "prelude.txt"; @a=<F>; print @a; close F;}; print if $x; $x++ if /INSERT.*last-table-name.*highest-primary-key/'
This really isn't the way to go, so what would be the best way?
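One partial workaround is to dump every table into its own file, so a broken import only has to redo the table it died in instead of starting over. A rough sketch (the database name and backup path are placeholders, and note that per-table dumps give up a single consistent snapshot across tables):

# dump each table separately; a failed restore can then resume at the next table
for t in $(mysql -N -B -e "SHOW TABLES" mydb); do
  mysqldump --single-transaction mydb "$t" | gzip > "/backups/$t.sql.gz"
done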
3 Answers
Does your MySQL box have enough hard drive space to hold all the data twice over? Local storage would be best here, but if that's not an option, you could also try some sort of NAS device using iSCSI. It still goes over the network, but in this case you get more throughput and reliability, because you're only relying on a NAS, which runs a pretty slim OS and almost never needs to be rebooted.
You can't use mysqldump to back up large databases - 200GB is feasible, but it gets worse and worse as they grow.
Your best bet is to take a volume snapshot of the database directory and zip that up somehow - that's what we've generally done - or rsync it somewhere else.
If your filesystem or block device does not support snapshots, then you're basically in trouble. You can shut the DB down to take a backup, but I don't imagine you want to do that.
To restore it, just do the opposite, then restart and wait (possibly for some time) for InnoDB recovery to fix things up.
The Maatkit mk-parallel-dump and mk-parallel-restore tools are a bit better than mysqldump speed-wise, but I'm not 100% convinced of their correctness.
Edit: re-reading the question, I think filesystem snapshot + rsync is probably the best way to go; you can do this without impacting the live system too much (you'll only need to transfer what changed since the last backup), and if the connection fails you can resume the rsync and it will continue where it left off.
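For concreteness, on LVM the snapshot + rsync cycle might look roughly like this (a sketch only - the volume group, snapshot size, mount point and destination host are assumptions, not a tested recipe):

# take a point-in-time snapshot of the volume holding the MySQL datadir
lvcreate --snapshot --size 10G --name mysql-snap /dev/vg0/mysql
mount -o ro /dev/vg0/mysql-snap /mnt/mysql-snap
# --partial keeps partially transferred files, so a broken transfer can be resumed
rsync -az --partial /mnt/mysql-snap/ backuphost:/backups/mysql/
umount /mnt/mysql-snap
lvremove -f /dev/vg0/mysql-snap

On restore you copy the files back into the datadir, start mysqld, and let InnoDB crash recovery fix things up, as described above.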
Do you need everything in the database?
Can you push some of the information to an archive database, and add something to your application that would allow people to view records in the archive?
Obviously this depends a lot on your application and setup, but it may be a solution - your DB is probably only going to get bigger...
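A rough sketch of that archiving idea (schema, table and column names are invented purely for illustration):

# move old rows to an archive schema; in practice you'd do this in batches inside transactions
mysql -e "INSERT INTO archive_db.events SELECT * FROM live_db.events WHERE created_at < NOW() - INTERVAL 1 YEAR"
mysql -e "DELETE FROM live_db.events WHERE created_at < NOW() - INTERVAL 1 YEAR"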