MySql - 同时更改几个表
我有一个带有 innoDB 表的 MySql Db。
我需要更改几个大表(约 50M 记录),因为更改会锁定表,我希望使该过程尽可能快。
就速度而言最好的是: 1. 一次修改一张表 2.同时更改两个表
有什么想法吗?
I have a MySql Db with innoDB tables.
I need to alter a couple of big tables (~50M records), since altering locks the tables I want to make the process as fast as possible.
What is best in term of speed:
1. altering one table at a time
2. alter both tables on the same time (simultaneously)
any ideas?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
我做了一个测试。
我创建了一个包含 400 万行的表。非常简单的表格,一列,所有行的所有值都是“花花公子”。然后,我将该表复制到包含完全相同数据的 big_2 中。
我的电脑是 2010 年中期的 MacBook Pro 13.3",所以一切都与此相关。
然后我做了三件事。
这证实了我的怀疑,即并行不会更快。最可能的原因是这几乎完全是磁盘上的操作,根本无法并行。
I did a test.
I created a table with 4 million rows. Very simple table, one column and all values are "dude" for all rows. I then duplicated that table into big_2 containing the exact same data.
My computer is a macbook pro 13.3" from mid 2010 so everything is related to that.
I then did three things.
This confirms my suspicion that it won't be any faster in parallel. The most likely reason is that this is almost entirely an operation on disk, and that cannot be paralleled at all.
实际上取决于您的服务器中有多少内存。
当您执行 ALTER TABLE 时,您确实希望表及其最大的二级索引(记住 innodb 集群主键,因此 PK 与行一起存储)适合内存。如果没有,速度会很慢(注意:本讨论假设表未分区)。
由于您的表只有 50M 行,很可能它及其所有二级索引都可以轻松装入 RAM(您的服务器上有 32G+,对吧?)。
如果 innodb 缓冲池可以容纳所有内容,则并行执行它们。如果它不连续进行。
在与生产具有相同规格的开发服务器上尝试一下(显然将它们配置为相同大小的 innodb_buffer_pool )
Really depends on how much memory you have in your server.
When you're doing ALTER TABLE, you really want the table and its largest secondary index (remember innodb clusters the primary key, so PK is stored with the rows) to fit into memory. If it doesn't, it's going to be slow (NB: This discussion assumes the table is not partitioned).
As your table has a tiny 50M rows, the chances are it fits in RAM trivially (you have 32G+ on your server, right?) with all its secondary indexes.
If it all fits in the innodb buffer pool, do them in parallel. If it doesn't do them in series.
Try it on your development server which has the same spec as production (obviously configure them with the same size innodb_buffer_pool)
同时做这件事不会给你带来太多收获。还得等到第一个完成后才能做第二个。
您可能更喜欢在查询之间有短暂的延迟来运行查询,以便自第一次更新开始以来一直在等待锁定的其他查询也不必等待第二次更新。例如,如果您的数据库为网站提供服务,则两次 15 秒的挂起比一次 30 秒的挂起要好。
Doing it simultaneously won't give you much gain. It still has to wait until the first is finished to do the second one.
You may prefer to run the queries with a short delay between them so other queries that have been waiting for the lock since the beginning of the first update don't have to wait for the second as well. It your database is serving a website for example, two 15 seconds hangs is better than a single 30 seconds one.