MySQL 返回错误数据?
最近 Django 社区出现了一个关于 MySQL 测试(使用 MyISAM)的问题。
这是 django 票证: http://code.djangoproject.com/ticket/14661
其中之一Django 核心开发人员提出了这个测试,我们中的许多人都能够复制它。有人猜测我们在这里遇到了什么吗?这只是 MySQL 中的一个错误还是我遗漏了一些东西?
这是测试代码和查询:
DROP TABLE IF EXISTS `testapp_tag`;
CREATE TABLE `testapp_tag` (
`id` integer AUTO_INCREMENT NOT NULL PRIMARY KEY,
`name` varchar(10) NOT NULL,
`parent_id` integer
);
INSERT INTO `testapp_tag` (`name`, `parent_id`) VALUES ("t1", NULL);
INSERT INTO `testapp_tag` (`name`, `parent_id`) VALUES ("t2", 1);
INSERT INTO `testapp_tag` (`name`, `parent_id`) VALUES ("t3", 1);
INSERT INTO `testapp_tag` (`name`, `parent_id`) VALUES ("t4", 3);
INSERT INTO `testapp_tag` (`name`, `parent_id`) VALUES ("t5", 3);
SELECT `testapp_tag`.`id`, `testapp_tag`.`name`, `testapp_tag`.`parent_id` FROM `testapp_tag` WHERE NOT ((`testapp_tag`.`id` IN (SELECT U0.`id` FROM `testapp_tag` U0 LEFT OUTER JOIN `testapp_tag` U1 ON (U0.`id` = U1.`parent_id`) WHERE U1.`id` IS NULL) AND `testapp_tag`.`id` IS NOT NULL)) ORDER BY `testapp_tag`.`name` ASC;
SELECT `testapp_tag`.`id`, `testapp_tag`.`name`, `testapp_tag`.`parent_id` FROM `testapp_tag` WHERE NOT ((`testapp_tag`.`id` IN (SELECT U0.`id` FROM `testapp_tag` U0 LEFT OUTER JOIN `testapp_tag` U1 ON (U0.`id` = U1.`parent_id`) WHERE U1.`id` IS NULL) AND `testapp_tag`.`id` IS NOT NULL)) ORDER BY `testapp_tag`.`name` ASC;
这是输出:
mysql> SELECT `testapp_tag`.`id`, `testapp_tag`.`name`, `testapp_tag`.`parent_id` FROM `testapp_tag` WHERE NOT ((`testapp_tag` .`id` IN (SELECT U0.`id` FROM `testapp_tag` U0 LEFT OUTER JOIN `testapp_tag` U1 ON (U0.`id` = U1.`parent_id`) WHERE U1.`id` IS NULL) AND `testapp_tag`.`id` IS NOT NULL)) ORDER BY `testapp_tag`.`name` ASC;
+----+------+-----------+
| id | name | parent_id |
+----+------+-----------+
| 1 | t1 | NULL |
| 3 | t3 | 1 |
| 5 | t5 | 3 |
+----+------+-----------+
3 rows in set (0.00 sec)
mysql> SELECT `testapp_tag`.`id`, `testapp_tag`.`name`, `testapp_tag`.`parent_id` FROM `testapp_tag` WHERE NOT ((`testapp_tag` .`id` IN (SELECT U0.`id` FROM `testapp_tag` U0 LEFT OUTER JOIN `testapp_tag` U1 ON (U0.`id` = U1.`parent_id`) WHERE U1.`id` IS NULL) AND `testapp_tag`.`id` IS NOT NULL)) ORDER BY `testapp_tag`.`name` ASC;
+----+------+-----------+
| id | name | parent_id |
+----+------+-----------+
| 1 | t1 | NULL |
| 3 | t3 | 1 |
+----+------+-----------+
2 rows in set (0.01 sec)
Recently an issue came up in the Django community regarding testing of MySQL (using MyISAM).
Here's the django ticket: http://code.djangoproject.com/ticket/14661
One of the Django core developers came up with this test and many of us have been able to replicate it. Anyone have a guess as to what we are running into here? Is it simply a bug in MySQL or am I missing something?
Here's the test code and queries:
DROP TABLE IF EXISTS `testapp_tag`;
CREATE TABLE `testapp_tag` (
`id` integer AUTO_INCREMENT NOT NULL PRIMARY KEY,
`name` varchar(10) NOT NULL,
`parent_id` integer
);
INSERT INTO `testapp_tag` (`name`, `parent_id`) VALUES ("t1", NULL);
INSERT INTO `testapp_tag` (`name`, `parent_id`) VALUES ("t2", 1);
INSERT INTO `testapp_tag` (`name`, `parent_id`) VALUES ("t3", 1);
INSERT INTO `testapp_tag` (`name`, `parent_id`) VALUES ("t4", 3);
INSERT INTO `testapp_tag` (`name`, `parent_id`) VALUES ("t5", 3);
SELECT `testapp_tag`.`id`, `testapp_tag`.`name`, `testapp_tag`.`parent_id` FROM `testapp_tag` WHERE NOT ((`testapp_tag`.`id` IN (SELECT U0.`id` FROM `testapp_tag` U0 LEFT OUTER JOIN `testapp_tag` U1 ON (U0.`id` = U1.`parent_id`) WHERE U1.`id` IS NULL) AND `testapp_tag`.`id` IS NOT NULL)) ORDER BY `testapp_tag`.`name` ASC;
SELECT `testapp_tag`.`id`, `testapp_tag`.`name`, `testapp_tag`.`parent_id` FROM `testapp_tag` WHERE NOT ((`testapp_tag`.`id` IN (SELECT U0.`id` FROM `testapp_tag` U0 LEFT OUTER JOIN `testapp_tag` U1 ON (U0.`id` = U1.`parent_id`) WHERE U1.`id` IS NULL) AND `testapp_tag`.`id` IS NOT NULL)) ORDER BY `testapp_tag`.`name` ASC;
Here's the output:
mysql> SELECT `testapp_tag`.`id`, `testapp_tag`.`name`, `testapp_tag`.`parent_id` FROM `testapp_tag` WHERE NOT ((`testapp_tag` .`id` IN (SELECT U0.`id` FROM `testapp_tag` U0 LEFT OUTER JOIN `testapp_tag` U1 ON (U0.`id` = U1.`parent_id`) WHERE U1.`id` IS NULL) AND `testapp_tag`.`id` IS NOT NULL)) ORDER BY `testapp_tag`.`name` ASC;
+----+------+-----------+
| id | name | parent_id |
+----+------+-----------+
| 1 | t1 | NULL |
| 3 | t3 | 1 |
| 5 | t5 | 3 |
+----+------+-----------+
3 rows in set (0.00 sec)
mysql> SELECT `testapp_tag`.`id`, `testapp_tag`.`name`, `testapp_tag`.`parent_id` FROM `testapp_tag` WHERE NOT ((`testapp_tag` .`id` IN (SELECT U0.`id` FROM `testapp_tag` U0 LEFT OUTER JOIN `testapp_tag` U1 ON (U0.`id` = U1.`parent_id`) WHERE U1.`id` IS NULL) AND `testapp_tag`.`id` IS NOT NULL)) ORDER BY `testapp_tag`.`name` ASC;
+----+------+-----------+
| id | name | parent_id |
+----+------+-----------+
| 1 | t1 | NULL |
| 3 | t3 | 1 |
+----+------+-----------+
2 rows in set (0.01 sec)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
看起来很有趣,看起来像是 MySql 查询优化器中的一个错误。
如果您运行此命令而不是普通选择:
然后,比较
EXPLAIN EXTENDED
警告的输出,您可以看到第一次,优化器添加到选择:另外,请注意删除来自
WHERE
的AND testapp_tag.id IS NOT NULL
,由于该字段被标记为NOT NULL
,所以什么也不做,似乎解决了这个问题。Seems very interesting, and looks like a bug in the MySql query optimizer.
If you run this instead of the plain selects:
Then, comparing the output from the
EXPLAIN EXTENDED
warnings, You can see that the first time around, the optimizer adds to the select:Also, note that removing the
AND testapp_tag.id IS NOT NULL
from theWHERE
, which does nothing since the field is marked asNOT NULL
, seems to take away the issue.这种形式工作可靠:
NOT + IN + 附加过滤器组合似乎将 MySQL 排除在外。这绝对是一个错误。
NOT() 中的测试查找 2 个部分。如果第一部分为真,则第二部分不可能为真,无论该字段是否可以为空。这是一个多余的子句,似乎是导致该错误的原因。
从 ScrumMeister 的回答中得到提示,我确认该错误是由于针对 AUTO_INCRMENT 最后插入的 ID 进行某种缓存造成的。
生成此计划
如果插入停止于 t6,并且删除也是 t6,则错误会被掩盖,因为添加的子句是 or (test.t.id = 6),我们已经在标记为 #### 的行中删除了该子句###
This form works reliably:
The NOT + IN + additional filter mix seems to throw MySQL out. It is definitely a bug.
The test in the NOT() looks for 2 parts. If the first part is true, the 2nd cannot possibly be true, regardless of whether the field can be null or not. It is a redundant clause that seems to be the cause of the bug.
Taking a cue from ScrumMeister's answer, I confirm that the bug is due to some sort of caching against the last inserted ID against the AUTO_INCREMENT.
Produces this plan
If the insert stops at t6, and the delete is also of t6, the bug is masked because the clause added is or (test.t.id = 6) which we have already deleted in the line marked #######