当仍需要实时 SELECT 查询时 MySQL 锁定处理大型 LOAD DATA INFILE
请从 Super Guru MySQL/PHP 专业人士那里寻求一些帮助和建议,他们可以抽出一点时间。
我有一个 PHP/MySQL 的 Web 应用程序,多年来它一直在发展,并获得了大量的搜索。当使用 MySQL LOAD DATA INFILE 处理新行的各种日常数据转储时,它现在遇到了瓶颈。
它是一个大型 MyISAM 表,约有 150 万行,所有 SELECT 查询都发生在其上。当这些发生在大约 600k 行的 LOAD DATA INFILE 期间(并删除过时的数据)时,它们只会被备份并需要大约 30 多分钟才能释放,从而使任何搜索都毫无结果。
我需要想出一种方法来更新该表,同时保留在合理的时间范围内提供 SELECT 结果的能力。
我完全没有想法,也无法自己想出解决方案,因为这是我第一次遇到此类问题。
任何有用的建议、解决方案或来自类似过去经验的指示将不胜感激,因为我很想学习解决此类问题。
非常感谢大家的宝贵时间! J
Looking for some help and advice please from Super Guru MySQL/PHP pros who can spare a moment of their time.
I have a web application in PHP/MySQL which has grown over the years and gets alot of searches on it. Its hitting bottlenecks now when the various daily data dumps of new rows get processed using MySQL LOAD DATA INFILE.
Its a large MyISAM table with about 1.5 million rows and all the SELECT queries occur on it. When these take place during the LOAD DATA INFILE of about 600k rows (and deletion of out dated data) they just get backed up and take about 30+ minutes to be freed up making any of those searches fruitless.
I need to come up with a way to get that table updated while retaining the ability to provide SELECT results in a reasonable timeframe.
Im completely out of ideas and have not been able to come up with a solution myself as its the first time ive encountered this sort of issue.
Any helpful advice, solutions or pointers from similar past experiences would be greatly appreciated as I would love to learn to resolve this sort of problem.
Many thanks everyone for your time! J
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
data:image/s3,"s3://crabby-images/d5906/d59060df4059a6cc364216c4d63ceec29ef7fe66" alt="扫码二维码加入Web技术交流群"
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
您可以使用 CONCURRENT 关键字进行 LOAD DATA INFILE。这样,当您加载数据时,表仍然能够提供 SELECT 服务。
关于删除,这更复杂。我个人会添加一个名为“status”INT(1) 的列,该列将定义该行是否处于活动状态(= 已删除),然后使用基于此列状态的规则对我的表进行分区。
这样,删除 status=0 的所有行会更容易:PI 尚未测试最后一个解决方案,我可能会在不久的将来这样做。
如果您的表经过优化,则 CONCURRENT 关键字将起作用。如果有任何FREE_SPACE,那么LOAD DATA INFILE就会锁定表。
You can use the CONCURRENT keywords for LOAD DATA INFILE. This way, when you load the data, the table is still able to server SELECTs.
Concerning the delete, this is more complicated. I would personally add a column called 'status' INT(1), who will define if the line is active or not( = deleted), and then partition my table with a rule based on this column status.
This way, it will be easier to delete all rows where status=0 :P I haven;t tested this last solution, I may do that in a near future.
The CONCURRENT keywords will work if your table is optimized. If there is any FREE_SPACE, then the LOAD DATA INFILE will lock the table.
MyISAM 不支持行级锁定,因此像 mysqldump 这样的操作被迫锁定整个表以保证一致的转储。您唯一实用的选择是切换到另一个支持行级锁定的表(例如 InnoDB),和/或将转储分成更小的部分。小转储在转储/重新加载时仍会锁定表,但锁定周期会更短。
一个更复杂的选择是拥有“实时”和“备份”表。对备份表执行转储/加载操作。当它们完成时,将其替换为实时表(重命名表,或者让您的代码动态更改它们正在使用的表)。如果您可以忍受一小段时间的潜在陈旧数据,这可能是一个更好的选择选项。
MyISAM doesn't support row-level locking, so operations like mysqldump are forced to lock the entire table to guarantee a consistent dump. Your only practical options are to switch to another table like (like InnoDB) that supports row-level locking, and/or split your dump up into smaller pieces. The small dumps will still lock the table while they're dumping/reloading, but the lock periods would be shorter.
A hairier option would be to have "live" and "backup" tables. Do the dump/load operations on the backup table. When they're copmlete, swap it out for the live table (rename tables, or have your code dynamically change which table they're using).. If you can live with a short window of potential stale data, this could be a better option.
您应该将表存储引擎从 MyISAM 切换到 InnoDB。 InnoDB 提供行锁定(与 MyISAM 的表锁定相反),这意味着当一个查询忙于更新或插入一行时,另一个查询可以同时更新不同的行。
You should switch your table storage engine from MyISAM to InnoDB. InnoDB provides row-locking (as opposed to MyISAM's table-locking) meaning while one query is busy updating or inserting a row, another query can update a different row at the same time.