Creating an index on a huge MySQL production table without table locking

Posted 2024-10-03 13:47:38, 147 words, 8 views, 0 comments

I need to create an index on a MySQL table with ~5M rows. It is a production table, and I fear that running a CREATE INDEX statement will block everything completely...

Is there a way to create that index without blocking inserts and selects?

I'd just like to know that I don't have to stop my system, create the index, and restart it!

Comments (4)

又怨 2024-10-10 13:47:38

[2017] Update: MySQL 5.6 has support for online index updates

https://dev.mysql.com/doc/refman/8.0/en/innodb-online-ddl-operations.html#online-ddl-index-syntax-notes

In MySQL 5.6 and higher, the table remains available for read and write operations while the index is being created or dropped. The CREATE INDEX or DROP INDEX statement only finishes after all transactions that are accessing the table are completed, so that the initial state of the index reflects the most recent contents of the table. Previously, modifying the table while an index is being created or dropped typically resulted in a deadlock that cancelled the INSERT, UPDATE, or DELETE statement on the table.

[2015] Updating table indices blocks writes in MySQL 5.5

From the answer above:

"If your using a version greater than 5.1 indices are created while the database is online. So not to worry you won't interrupt production system use."

This is FALSE (at least for MyISAM / InnoDB tables, which is what 99.999% of people out there use. Clustered Edition is different.)

Doing UPDATE operations on a table will BLOCK while the index is being created. MySQL is really, really stupid about this (and a few other things).

Test Script:

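# Run 50 timed single-row statements in the background, numbering each result.
(
  for n in {1..50}; do
    # To time reads instead of writes, comment out the UPDATE line and uncomment the SELECT line.
    #(time mysql -uroot -e 'select  * from website_development.users where id = 41225\G'>/dev/null) 2>&1 | grep real;
    (time mysql -uroot -e 'update website_development.users set bio="" where id = 41225\G'>/dev/null) 2>&1 | grep real;
  done
) | cat -n &
PID=$!
sleep 0.05
# Build the index while the loop above is still issuing statements.
echo "Index Update - START"
mysql -uroot website_development -e 'alter table users add index ddopsonfu (last_name, email, first_name, confirmation_token, current_sign_in_ip);'
echo "Index Update - FINISH"
sleep 0.05
# End the background run ($! is the PID of the last pipeline stage, cat -n), then drop the test index.
kill $PID
time mysql -uroot website_development -e 'drop index ddopsonfu on users;'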

My Server (InnoDB):

Server version: 5.5.25a Source distribution

Output (notice how the 6th operation blocks for the ~400ms it takes to finish the index update):

 1  real    0m0.009s
 2  real    0m0.009s
 3  real    0m0.009s
 4  real    0m0.012s
 5  real    0m0.009s
Index Update - START
Index Update - FINISH
 6  real    0m0.388s
 7  real    0m0.009s
 8  real    0m0.009s
 9  real    0m0.009s
10  real    0m0.009s
11  real    0m0.009s

Compare with read operations, which don't block (swap the commented line in the script):

 1  real    0m0.010s
 2  real    0m0.009s
 3  real    0m0.009s
 4  real    0m0.010s
 5  real    0m0.009s
Index Update - START
 6  real    0m0.010s
 7  real    0m0.010s
 8  real    0m0.011s
 9  real    0m0.010s
...
41  real    0m0.009s
42  real    0m0.010s
43  real    0m0.009s
Index Update - FINISH
44  real    0m0.012s
45  real    0m0.009s
46  real    0m0.009s
47  real    0m0.010s
48  real    0m0.009s

Updating MySQL's Schema without downtime

Thus far, there's only one method I know of to update a MySQL schema and not suffer an availability outage. Circular masters:

  • Master A has your MySQL database running on it
  • Bring Master B into service and have it replicate writes from Master A (B is a slave of A).
  • Perform the schema update on Master B. It will fall behind during the upgrade.
  • Let Master B catch up. Invariant: your schema change MUST be capable of processing commands replicated from a downversion schema. Indexing changes qualify. Simple column additions usually qualify. Removing a column? Probably not.
  • ATOMICALLY swap all clients from Master A to Master B. If you want to be safe (trust me, you do), you should ensure that the last write to A is replicated to B BEFORE B takes its first write. If you allow concurrent writes to 2+ masters, you had better understand MySQL replication at a DEEP level or you are headed for a world of pain. Extreme pain. For example, do you have an AUTO_INCREMENT column? You are screwed (unless you use even numbers on one master and odds on the other). Do NOT trust MySQL replication to "do the right thing". It is NOT smart and will not save you. It's just slightly less safe than copying binary transaction logs from the command-line and replaying them by hand. Still, disconnecting all clients from the old master and flipping them to the new master can be done in a matter of seconds, vastly faster than waiting for a multi-hour schema upgrade.
  • Now Master B is your new master. You have the new schema. Life is good. Have a beer; the worst is over.
  • Repeat the process with Master A, upgrading his schema so that he becomes your new secondary master, ready to take over in the event that your primary master (master B now) loses power or just up and dies on you.
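The "let Master B catch up" step can be checked rather than guessed. Below is a minimal sketch, assuming B is a classic replica that answers SHOW SLAVE STATUS and that writes to A have already been stopped before the check. The function names and the host name master-b are hypothetical; Seconds_Behind_Master is the lag field reported by SHOW SLAVE STATUS.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `SHOW SLAVE STATUS\G` output on stdin and prints
# the Seconds_Behind_Master value (a number, or NULL if replication is stopped).
seconds_behind_master() {
  awk -F': ' '/Seconds_Behind_Master:/ { print $2; exit }'
}

# Hypothetical gate: succeeds only when the replica reports zero lag.
replica_caught_up() {
  local lag
  lag="$(seconds_behind_master)"
  [ "$lag" = "0" ]
}

# Usage sketch (the host name master-b is an assumption):
#   mysql -h master-b -uroot -e 'SHOW SLAVE STATUS\G' | replica_caught_up \
#     && echo "B has caught up; safe to cut over"
```

A value of NULL (replication not running) or any non-zero number makes the gate fail, which is the conservative choice before swapping clients.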

An easy way to update schema this isn't. Workable in a serious production environment; yes, it is. Please, please, please, if there is an easier way to add an index to a MySQL table without blocking writes, let me know.

Googling led me to this article, which describes a similar technique. Even better, they advise drinking at the same point in the procedure (note that I wrote my answer before reading the article)!

Percona's pt-online-schema-change

The article I linked above talks about a tool, pt-online-schema-change, that works as follows:

  • Create new table with same structure as original.
  • Update schema on new table.
  • Add a trigger on the original table so that changes are kept in-sync with the copy
  • Copy rows in batches from original table.
  • Move original table out of the way and replace with new table.
  • Drop old table.
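For reference, a hedged sketch of what an invocation might look like. The helper name pt_osc_command is made up, and the database, table and index names are borrowed from the test script earlier in this answer purely for illustration. The --alter, --dry-run and --execute options and the D=...,t=... DSN form are part of the tool's command line; connection options (host, user, password) are omitted here and would be needed in practice.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a pt-online-schema-change command line.
# Arguments: mode (dry-run or execute), database, table, ALTER clause.
pt_osc_command() {
  local mode="$1" db="$2" table="$3" alter="$4"
  printf 'pt-online-schema-change --alter "%s" D=%s,t=%s --%s\n' \
    "$alter" "$db" "$table" "$mode"
}

# Rehearse first with --dry-run, then repeat with --execute.
pt_osc_command dry-run website_development users 'ADD INDEX ddopsonfu (last_name, email)'
pt_osc_command execute website_development users 'ADD INDEX ddopsonfu (last_name, email)'
```

Printing the command instead of running it keeps the sketch safe to try; the printed line can be reviewed and then pasted into a shell.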

I've never tried the tool myself. YMMV

RDS

I'm currently using MySQL through Amazon's RDS. It's a really nifty service that wraps up and manages MySQL, letting you add new read replicas with a single button and transparently upgrade the database across hardware SKUs. It's really convenient. You don't get SUPER access to the database, so you can't screw with replication directly (is this a blessing or a curse?). However, you can use Read Replica Promotion to make your schema changes on a read-only slave, then promote that slave to become your new master. It's exactly the same trick as I described above, just vastly easier to execute. They still don't do much to help you with the cut-over: you have to reconfigure and restart your app.

昔梦 2024-10-10 13:47:38

As this blog post outlines, the InnoDB ALTER TABLE mechanism has been completely redesigned for MySQL 5.6.

(For an exhaustive overview of this topic, the MySQL documentation can provide an afternoon's worth of reading.)

To add an index to a table without locking it against UPDATE/INSERT, the following statement format can be used:

ALTER TABLE my_table ADD INDEX my_table__idx (my_column), ALGORITHM=INPLACE, LOCK=NONE;
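
A minimal sketch of wrapping that statement in a shell helper so the same clauses are applied every time. The function name online_add_index_sql and the sample table, index and column names are invented for illustration; only the ALTER TABLE syntax comes from the statement above. With LOCK=NONE, MySQL is documented to fail the statement with an error if the operation cannot run without blocking writes, rather than quietly taking a stronger lock, so a failure here is a signal and not a hazard.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints an online ADD INDEX statement for MySQL 5.6+.
# Arguments: table name, index name, comma-separated column list.
online_add_index_sql() {
  local table="$1" index="$2" columns="$3"
  printf 'ALTER TABLE %s ADD INDEX %s (%s), ALGORITHM=INPLACE, LOCK=NONE;\n' \
    "$table" "$index" "$columns"
}

# Print the statement for review. To run it, pipe it into the mysql client, e.g.
#   online_add_index_sql users users__last_name_idx last_name | mysql -uroot website_development
online_add_index_sql users users__last_name_idx last_name
```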

冷血 2024-10-10 13:47:38

MySQL 5.6 update (Feb 2013): You can now perform read and write operations while an index is being created, even with InnoDB tables - http://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-overview.html

In MySQL 5.6 and higher, the table remains available for read and write operations while the index is being created or dropped. The CREATE INDEX or DROP INDEX statement only finishes after all transactions that are accessing the table are completed, so that the initial state of the index reflects the most recent contents of the table. Previously, modifying the table while an index is being created or dropped typically resulted in a deadlock that cancelled the INSERT, UPDATE, or DELETE statement on the table.

and:

In MySQL 5.6, this feature becomes more general: you can read and write to tables while an index is being created, and many more kinds of ALTER TABLE operations can be performed without copying the table, without blocking DML operations, or both. Thus in MySQL 5.6 and higher, we typically refer to this set of features as online DDL rather than Fast Index Creation.

from http://dev.mysql.com/doc/refman/5.6/en/glossary.html#glos_fast_index_creation

谷夏 2024-10-10 13:47:38

pt-online-schema-change is the way to go if you really want to make sure that the migration will not bring down the site.

As I wrote in the above comment, I have some experience using pt-online-schema-change in production. We have a main table of 20M+ records and a master -> 2 read-only replication slaves. I've done at least a dozen migrations with pt-online-schema-change, from adding a new column and changing the charset to adding several indices. We serve tons of traffic during the migrations as well, and we have not had a single hiccup. Of course, you have to test all the scripts very thoroughly before running them on production.

I try to batch up the changes into 1 script so that pt-online-schema-change only has to copy the data once. Be very careful when changing a column name, since you will lose your data. Adding an index, however, should be fine.
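
One way to batch changes, sketched under assumptions: the tool's --alter option takes a comma-separated list of ALTER clauses, so several changes can share a single table copy. The helper name join_alter_clauses and the column and index names below are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: joins several ALTER clauses into one comma-separated
# string, suitable for a single --alter argument.
join_alter_clauses() {
  local joined="" clause
  for clause in "$@"; do
    # Prefix with ", " only once there is already something to join to.
    joined="${joined:+$joined, }$clause"
  done
  printf '%s\n' "$joined"
}

join_alter_clauses 'ADD COLUMN nickname VARCHAR(64)' 'ADD INDEX users__email_idx (email)'
```

The printed string can then be passed as the value of --alter in one pt-online-schema-change run instead of running the tool once per change.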
