两个相同查询(不同参数)的 MySQL InnoDB 死锁问题

发布于 2024-12-02 10:00:21 字数 4872 浏览 1 评论 0原文

我有下表

CREATE TABLE IF NOT EXISTS `task` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `job_id` int(10) unsigned NOT NULL COMMENT 'The id of the related job',
  `server_id` tinyint(4) NOT NULL DEFAULT '0' COMMENT 'job/task owner',
  `jobtype_id` int(10) unsigned NOT NULL DEFAULT '0',
  `node_id` int(10) unsigned NOT NULL COMMENT 'The id of the user currently executing this task',
  `status` enum('QUEUED','EXECUTING','COMPLETED','CANCELED','TERMINATED','PAUSED','FAILED') NOT NULL COMMENT 'Current status of the task',
  `last_updated` int(11) NOT NULL COMMENT 'When was the last status change of this task. Used in requeueing hung tasks',
  `data_in` blob NOT NULL COMMENT 'An input data to the task. Sets when the task is created.',
  `data_out` blob NOT NULL COMMENT 'An output data of the task. Sets upon task completion. Can be absent.',
  `speed` bigint(20) unsigned NOT NULL DEFAULT '0',
  `time_spent` int(11) NOT NULL DEFAULT '0',
  `has_data_out` tinyint(1) NOT NULL COMMENT 'Shows if the task has any output data. Serves caching purposes. Used by Summarizers to quickly find out what tasks of the job yielded data.',
  `comment` varchar(200) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `fk_task_job_id` (`job_id`),
  KEY `index_has_data_out` (`has_data_out`),
  KEY `index_last_updated` (`last_updated`),
  KEY `index_status` (`status`),
  KEY `fk_task_userid` (`node_id`),
  KEY `jobtype_id` (`jobtype_id`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 COMMENT='This table holds all subjobs - tasks' AUTO_INCREMENT=1081595 ;

--
-- Constraints for dumped tables
--

--
-- Constraints for table `task`
--
ALTER TABLE `task`
  ADD CONSTRAINT `task_ibfk_5` FOREIGN KEY (`job_id`) REFERENCES `job` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
  ADD CONSTRAINT `task_ibfk_7` FOREIGN KEY (`jobtype_id`) REFERENCES `jobtype` (`id`),
  ADD CONSTRAINT `task_ibfk_8` FOREIGN KEY (`node_id`) REFERENCES `node` (`id`);

和以下死锁问题:

------------------------
LATEST DETECTED DEADLOCK
------------------------
110831 14:23:56
*** (1) TRANSACTION:
TRANSACTION 102B4D2, ACTIVE 0 sec, OS thread id 5480
mysql tables in use 2, locked 1
LOCK WAIT 7 lock struct(s), heap size 1248, 4 row lock(s), undo log entries 3
MySQL thread id 74315, query id 2364347 192.168.1.120 usr_sl3 Sending data
select `usr_sl3`.`task`.`id` from `usr_sl3`.`task` where (`usr_sl3`.`task`.`node_id` = 103 and `usr_sl3`.`task`.`status` = 'EXECUTING') for update
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 38 page no 2303 n bits 576 index `index_status` of table `usr_sl3`.`task` trx id 102B4D2 lock_mode X locks rec but not gap waiting
Record lock, heap no 471 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
0: len 1; hex 02; asc  ;;
1: len 4; hex 00107dac; asc   } ;;

*** (2) TRANSACTION:
TRANSACTION 102B4D3, ACTIVE 0 sec, OS thread id 5692 starting index read, thread declared inside InnoDB 500
mysql tables in use 2, locked 1
7 lock struct(s), heap size 1248, 4 row lock(s), undo log entries 3
MySQL thread id 74354, query id 2364348 192.168.1.120 usr_sl3 Sending data
select `usr_sl3`.`task`.`id` from `usr_sl3`.`task` where (`usr_sl3`.`task`.`node_id` = 95 and `usr_sl3`.`task`.`status` = 'EXECUTING') for update
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 38 page no 2303 n bits 576 index `index_status` of table `usr_sl3`.`task` trx id 102B4D3 lock_mode X locks rec but not gap
Record lock, heap no 471 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
0: len 1; hex 02; asc  ;;
1: len 4; hex 00107dac; asc   } ;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 38 page no 2303 n bits 576 index `index_status` of table `usr_sl3`.`task` trx id 102B4D3 lock_mode X locks rec but not gap waiting
Record lock, heap no 481 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
0: len 1; hex 02; asc  ;;
1: len 4; hex 00107dab; asc   } ;;

*** WE ROLL BACK TRANSACTION (2)

你能帮我理解这个死锁的机制吗?

这两个查询是从不同的线程发出的。每个线程在查询中都有自己的node_id。没有两个查询具有相同的node_id。

我怀疑,我可以通过在字段(node_id,status)上创建复合索引来解决这种情况,但这不是一个好的解决方案,我认为。我需要了解问题的本质。

同一查询上的这些死锁会定期发生,而不是一次或两次。

对受影响的查询进行 EXPLAIN 给出了有趣的结果:

id  select_type     table   type    possible_keys   key     key_len     ref     rows    Extra
1   SIMPLE  task    index_merge     index_status,fk_task_userid     index_status,fk_task_userid     1,4     NULL    1   Using intersect(index_status,fk_task_userid); Using where; Using index

MySQL 版本是 5.5。

此外,在死锁发生时,表不包含与受影响查询的条件匹配的行(例如 select usr_sl3.task.id from <代码>usr_sl3.task 其中 (usr_sl3.task.node_id = 95 并且usr_sl3.task.status = 'EXECUTING') 用于更新 根本不产生任何行)。

提前致谢。

I have the following table

CREATE TABLE IF NOT EXISTS `task` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `job_id` int(10) unsigned NOT NULL COMMENT 'The id of the related job',
  `server_id` tinyint(4) NOT NULL DEFAULT '0' COMMENT 'job/task owner',
  `jobtype_id` int(10) unsigned NOT NULL DEFAULT '0',
  `node_id` int(10) unsigned NOT NULL COMMENT 'The id of the user currently executing this task',
  `status` enum('QUEUED','EXECUTING','COMPLETED','CANCELED','TERMINATED','PAUSED','FAILED') NOT NULL COMMENT 'Current status of the task',
  `last_updated` int(11) NOT NULL COMMENT 'When was the last status change of this task. Used in requeueing hung tasks',
  `data_in` blob NOT NULL COMMENT 'An input data to the task. Sets when the task is created.',
  `data_out` blob NOT NULL COMMENT 'An output data of the task. Sets upon task completion. Can be absent.',
  `speed` bigint(20) unsigned NOT NULL DEFAULT '0',
  `time_spent` int(11) NOT NULL DEFAULT '0',
  `has_data_out` tinyint(1) NOT NULL COMMENT 'Shows if the task has any output data. Serves caching purposes. Used by Summarizers to quickly find out what tasks of the job yielded data.',
  `comment` varchar(200) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `fk_task_job_id` (`job_id`),
  KEY `index_has_data_out` (`has_data_out`),
  KEY `index_last_updated` (`last_updated`),
  KEY `index_status` (`status`),
  KEY `fk_task_userid` (`node_id`),
  KEY `jobtype_id` (`jobtype_id`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 COMMENT='This table holds all subjobs - tasks' AUTO_INCREMENT=1081595 ;

--
-- Constraints for dumped tables
--

--
-- Constraints for table `task`
--
ALTER TABLE `task`
  ADD CONSTRAINT `task_ibfk_5` FOREIGN KEY (`job_id`) REFERENCES `job` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
  ADD CONSTRAINT `task_ibfk_7` FOREIGN KEY (`jobtype_id`) REFERENCES `jobtype` (`id`),
  ADD CONSTRAINT `task_ibfk_8` FOREIGN KEY (`node_id`) REFERENCES `node` (`id`);

And the following deadlock problem:

------------------------
LATEST DETECTED DEADLOCK
------------------------
110831 14:23:56
*** (1) TRANSACTION:
TRANSACTION 102B4D2, ACTIVE 0 sec, OS thread id 5480
mysql tables in use 2, locked 1
LOCK WAIT 7 lock struct(s), heap size 1248, 4 row lock(s), undo log entries 3
MySQL thread id 74315, query id 2364347 192.168.1.120 usr_sl3 Sending data
select `usr_sl3`.`task`.`id` from `usr_sl3`.`task` where (`usr_sl3`.`task`.`node_id` = 103 and `usr_sl3`.`task`.`status` = 'EXECUTING') for update
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 38 page no 2303 n bits 576 index `index_status` of table `usr_sl3`.`task` trx id 102B4D2 lock_mode X locks rec but not gap waiting
Record lock, heap no 471 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
0: len 1; hex 02; asc  ;;
1: len 4; hex 00107dac; asc   } ;;

*** (2) TRANSACTION:
TRANSACTION 102B4D3, ACTIVE 0 sec, OS thread id 5692 starting index read, thread declared inside InnoDB 500
mysql tables in use 2, locked 1
7 lock struct(s), heap size 1248, 4 row lock(s), undo log entries 3
MySQL thread id 74354, query id 2364348 192.168.1.120 usr_sl3 Sending data
select `usr_sl3`.`task`.`id` from `usr_sl3`.`task` where (`usr_sl3`.`task`.`node_id` = 95 and `usr_sl3`.`task`.`status` = 'EXECUTING') for update
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 38 page no 2303 n bits 576 index `index_status` of table `usr_sl3`.`task` trx id 102B4D3 lock_mode X locks rec but not gap
Record lock, heap no 471 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
0: len 1; hex 02; asc  ;;
1: len 4; hex 00107dac; asc   } ;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 38 page no 2303 n bits 576 index `index_status` of table `usr_sl3`.`task` trx id 102B4D3 lock_mode X locks rec but not gap waiting
Record lock, heap no 481 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
0: len 1; hex 02; asc  ;;
1: len 4; hex 00107dab; asc   } ;;

*** WE ROLL BACK TRANSACTION (2)

Can you please help me to understand the mechanism of this deadlock?

Those two queries are issued from different threads. Each thread has it's own node_id in query. No two queries have same node_id.

I suspect, that I can solve the situation by creating a compound index on fields (node_id, status), but this isn't a good solution, I think. I need to understand the nature of the problem.

These deadlocks on the same query occure periodically, not once or twice.

EXPLAIN on affected queries gives interesting results:

id  select_type     table   type    possible_keys   key     key_len     ref     rows    Extra
1   SIMPLE  task    index_merge     index_status,fk_task_userid     index_status,fk_task_userid     1,4     NULL    1   Using intersect(index_status,fk_task_userid); Using where; Using index

MySQL version is 5.5.

Also, at the moment of the deadlock, tables contain no rows matching condition of the affected query (e.g. select usr_sl3.task.id from usr_sl3.task where (usr_sl3.task.node_id = 95 and usr_sl3.task.status = 'EXECUTING') for update
yields no rows at all).

Thanks in advance.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

幽蝶幻影 2024-12-09 10:00:21

该查询使用index_status索引而不是fk_task_userid(node_id上​​的索引)。这就是它用其他node_id 锁定记录的原因。

您可以对查询运行说明以查看实际锁定的记录数量(检查的行中)与需要锁定的记录数量(返回的行)

我怀疑,我可以通过在字段(node_id,status)上创建复合索引来解决这种情况,但这不是一个好的解决方案,我认为。我需要了解问题的本质。

为什么?我觉得你的索引无论如何都不是最佳的...在node_id、status上创建索引,它应该可以解决问题

The query uses the index_status index instead of fk_task_userid (the index on node_id). This is the reason it locks records with other node_ids.

You can run explain on your query to see how much records are actually locked (in rows examined) vs how much of them you need to lock (rows returned)

I suspect, that I can solve the situation by creating a compound index on fields (node_id, status), but this isn't a good solution, I think. I need to understand the nature of the problem.

Why? I feel your indexes are not optimal anyway... Create index on node_id,status, it should resolve the problem

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文