MySQL: joining subqueries

Posted on 2024-10-25 02:59:16

I have the following tables:

CREATE TABLE `data` (
  `date_time` decimal(26,6) NOT NULL,
  `channel_id` mediumint(8) unsigned NOT NULL,
  `value` varchar(40) DEFAULT NULL,
  `status` tinyint(3) unsigned DEFAULT NULL,
  `connected` tinyint(1) unsigned NOT NULL,
  PRIMARY KEY (`channel_id`,`date_time`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

CREATE TABLE `channels` (
  `channel_id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
  `channel_name` varchar(40) NOT NULL,
  PRIMARY KEY (`channel_id`),
  UNIQUE KEY `channel_name` (`channel_name`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1;

I was wondering if anyone could give me some advice on how to optimize or rewrite the following query:

SELECT channel_name, t0.date_time, t0.value, t0.status, t0.connected, t1.date_time, t1.value, t1.status, t1.connected FROM channels,
    (SELECT MAX(date_time) AS date_time, channel_id, value, status, connected FROM data
        WHERE date_time <= 1300818330
        GROUP BY channel_id) AS t0
    RIGHT JOIN
    (SELECT MAX(date_time) AS date_time, channel_id, value, status, connected FROM data
        WHERE date_time <= 1300818334
        GROUP BY channel_id) AS t1
ON t0.channel_id = t1.channel_id
WHERE channels.channel_id = t1.channel_id

Basically I am getting the value, status and connected fields for each channel_name at two different times. Since t0 is always <= t1, the fields could exist for t1, but not t0, and I want that to be shown. That is why I am using the RIGHT JOIN. If it does not exist for t1, then it won't exist for t0, so no row should be returned.

The problem seems to be that, since I am joining subqueries, no index can be used. I tried rewriting it to do a self join on the channel_id of the data table first, but that is millions of rows.

It would also be nice to be able to add a boolean field to each of the final rows that is true when t0.value = t1.value AND t0.status = t1.status AND t0.connected = t1.connected.

Thank you very much for your time.

Comments (1)

﹎☆浅夏丿初晴 2024-11-01 02:59:16

You can reduce the two subqueries to one:

SELECT channel_id,
   MAX(date_time) AS t1_date_time,
   MAX(case when date_time <= {$p1} then date_time end) AS t0_date_time
FROM data
WHERE date_time <= {$p2}
GROUP BY channel_id
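
To see what the conditional MAX() does, here is a small sketch on made-up rows. The scratch table data_demo and its sample timestamps are assumptions for illustration only; the two cut-offs are the ones from the question.

```sql
-- Hypothetical scratch table with the same key columns as `data`
CREATE TEMPORARY TABLE data_demo (
  date_time  DECIMAL(26,6) NOT NULL,
  channel_id MEDIUMINT UNSIGNED NOT NULL,
  PRIMARY KEY (channel_id, date_time)
);

INSERT INTO data_demo (channel_id, date_time) VALUES
  (1, 1300818320),  -- before both cut-offs
  (1, 1300818332),  -- between the two cut-offs
  (2, 1300818333),  -- between the two cut-offs
  (2, 1300818340);  -- after both cut-offs, removed by the WHERE clause

SELECT channel_id,
       MAX(date_time) AS t1_date_time,
       -- CASE yields NULL for rows after the first cut-off, and MAX() ignores NULLs
       MAX(CASE WHEN date_time <= 1300818330 THEN date_time END) AS t0_date_time
FROM data_demo
WHERE date_time <= 1300818334
GROUP BY channel_id;
```

Channel 1 gets t1_date_time 1300818332 and t0_date_time 1300818320. Channel 2 gets t1_date_time 1300818333 and a NULL t0_date_time, which is the "exists for t1 but not t0" case from the question.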

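GROUP BY is notoriously misleading in MySQL. When ONLY_FULL_GROUP_BY is not enabled, MySQL lets you select columns that are neither aggregated nor listed in GROUP BY, and it takes their values from an arbitrary row of each group, not from the row that holds MAX(date_time). Imagine you had MIN() and MAX() in the same select: which row should the non-grouped columns come from? Once you understand this, you will see why the value, status and connected columns in your subqueries are not deterministic.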
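
A small sketch of the effect, using a hypothetical scratch table (group_demo and its two rows are made up for illustration). Note that the SET statements below replace the session's whole sql_mode, so try this in a throwaway session.

```sql
-- Hypothetical scratch table: one channel, two readings
CREATE TEMPORARY TABLE group_demo (
  channel_id INT NOT NULL,
  date_time  DECIMAL(26,6) NOT NULL,
  value      VARCHAR(40)
);
INSERT INTO group_demo VALUES (1, 100, 'old'), (1, 200, 'new');

-- Permissive mode: the query runs, but `value` comes from an arbitrary
-- row of the group, so 'old' may appear next to MAX(date_time) = 200.
SET SESSION sql_mode = '';
SELECT channel_id, MAX(date_time) AS date_time, value
FROM group_demo
GROUP BY channel_id;

-- With ONLY_FULL_GROUP_BY (part of the default sql_mode since MySQL 5.7.5)
-- the same statement is rejected instead of returning a misleading row.
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';
SELECT channel_id, MAX(date_time) AS date_time, value
FROM group_demo
GROUP BY channel_id;   -- fails: `value` is not aggregated or grouped
```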

To get the full t0 and t1 rows, join the result back to data on the primary key:

SELECT x.channel_id,
       t0.date_time, t0.value, t0.status, t0.connected,
       t1.date_time, t1.value, t1.status, t1.connected
FROM (
    SELECT channel_id,
       MAX(date_time) AS t1_date_time,
       MAX(case when date_time <= {$p1} then date_time end) AS t0_date_time
    FROM data
    WHERE date_time <= {$p2}
    GROUP BY channel_id
) x
INNER JOIN data t1 on t1.channel_id = x.channel_id and t1.date_time = x.t1_date_time
LEFT JOIN data t0 on t0.channel_id = x.channel_id and t0.date_time = x.t0_date_time
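
To check whether the two joins back to data use the primary key (channel_id, date_time), you can prefix the query with EXPLAIN. This is only a sketch: the literal cut-offs are the ones from the question, and what the plan shows depends on your MySQL version and data.

```sql
EXPLAIN
SELECT x.channel_id, t0.date_time, t1.date_time
FROM (
    SELECT channel_id,
           MAX(date_time) AS t1_date_time,
           MAX(CASE WHEN date_time <= 1300818330 THEN date_time END) AS t0_date_time
    FROM data
    WHERE date_time <= 1300818334
    GROUP BY channel_id
) x
INNER JOIN data t1 ON t1.channel_id = x.channel_id AND t1.date_time = x.t1_date_time
LEFT JOIN data t0 ON t0.channel_id = x.channel_id AND t0.date_time = x.t0_date_time;
```

In the output, look at the key column of the t1 and t0 rows. PRIMARY there means each lookup is an index probe. NULL means the join is scanning the table.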

And finally, a join to get the channel name, plus the boolean column:

SELECT c.channel_name,
       t0.date_time, t0.value, t0.status, t0.connected,
       t1.date_time, t1.value, t1.status, t1.connected,
       t0.value=t1.value AND t1.status=t0.status
                         AND t0.connected=t1.connected name_me
FROM (
    SELECT channel_id,
       MAX(date_time) AS t1_date_time,
       MAX(case when date_time <= {$p1} then date_time end) AS t0_date_time
    FROM data
    WHERE date_time <= {$p2}
    GROUP BY channel_id
) x
INNER JOIN channels c on c.channel_id = x.channel_id
INNER JOIN data t1 on t1.channel_id = x.channel_id and t1.date_time = x.t1_date_time
LEFT JOIN data t0 on t0.channel_id = x.channel_id and t0.date_time = x.t0_date_time
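
One caveat with the comparison column: value and status are nullable, and every t0 column is NULL when the LEFT JOIN finds no earlier row, so the plain = comparisons produce NULL rather than 0 in those cases. If you want a strict 0/1 flag, a variant using MySQL's NULL-safe equality operator <=> might look like this. The alias unchanged and the literal cut-offs (taken from the question) are my own choices.

```sql
SELECT c.channel_name,
       t0.date_time, t0.value, t0.status, t0.connected,
       t1.date_time, t1.value, t1.status, t1.connected,
       -- <=> treats NULL <=> NULL as 1 and NULL <=> non-NULL as 0,
       -- so the flag is never NULL
       (t0.value <=> t1.value
        AND t0.status <=> t1.status
        AND t0.connected <=> t1.connected) AS unchanged
FROM (
    SELECT channel_id,
           MAX(date_time) AS t1_date_time,
           MAX(CASE WHEN date_time <= 1300818330 THEN date_time END) AS t0_date_time
    FROM data
    WHERE date_time <= 1300818334
    GROUP BY channel_id
) x
INNER JOIN channels c ON c.channel_id = x.channel_id
INNER JOIN data t1 ON t1.channel_id = x.channel_id AND t1.date_time = x.t1_date_time
LEFT JOIN data t0 ON t0.channel_id = x.channel_id AND t0.date_time = x.t0_date_time;
```

Because connected is NOT NULL in t1, a channel with no t0 row always gets unchanged = 0 here.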

EDIT

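To perform an RLIKE on the channel name, it looks simple enough to add a WHERE clause on c.channel_name at the end of the query. It may, however, perform better to filter inside the subquery, so that only the rows of matching channels are aggregated. This relies on MySQL reading channels before data; keep in mind that the optimizer is free to choose the join order for comma-notation joins, so check the plan.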

SELECT x.channel_name,
       t0.date_time, t0.value, t0.status, t0.connected,
       t1.date_time, t1.value, t1.status, t1.connected,
       t0.value=t1.value AND t1.status=t0.status
                         AND t0.connected=t1.connected name_me
FROM (
    SELECT c.channel_id, c.channel_name,
       MAX(d.date_time) AS t1_date_time,
       MAX(case when d.date_time <= {$p1} then d.date_time end) AS t0_date_time
    FROM channels c, data d
    WHERE c.channel_name RLIKE {$expr}
      AND c.channel_id = d.channel_id
      AND d.date_time <= {$p2}
    GROUP BY c.channel_id
) x
INNER JOIN data t1 on t1.channel_id = x.channel_id and t1.date_time = x.t1_date_time
LEFT JOIN data t0 on t0.channel_id = x.channel_id and t0.date_time = x.t0_date_time
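
If the optimizer does not read channels first on its own, MySQL's STRAIGHT_JOIN forces the left table to be read before the right one. Below is a sketch of just the inner subquery written that way. The pattern '^temp' and the literal cut-offs are placeholders, and whether this is actually faster is something to measure with EXPLAIN rather than assume.

```sql
SELECT c.channel_id, c.channel_name,
       MAX(d.date_time) AS t1_date_time,
       MAX(CASE WHEN d.date_time <= 1300818330 THEN d.date_time END) AS t0_date_time
FROM channels c
STRAIGHT_JOIN data d              -- read channels first, then probe data by channel_id
   ON d.channel_id = c.channel_id
WHERE c.channel_name RLIKE '^temp'   -- hypothetical pattern
  AND d.date_time <= 1300818334
GROUP BY c.channel_id, c.channel_name;
```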