MySQL: select the most recent row per id before date x, only if the id has no rows after date x and is not in another table

Posted on 2024-12-20 07:37:14


I have table t:

id, timestamp

There are multiple id values, and multiple rows might share a given id.

For each id, I want to select the most recent row before date x, but only if that id has no rows after date x and the id does not appear in table y.

I can select everything before date x (in this example, :date = 5):

SELECT * FROM t WHERE timestamp < :date

I tried to get only the most recent row per id. This returns one row per id, but not necessarily the most recent one:

SELECT * FROM t WHERE timestamp < :date GROUP BY id ORDER BY timestamp DESC

I'm concerned that GROUP BY will slow things down with a lot of data.

Here is some sample db data:

CREATE TABLE IF NOT EXISTS `t` (
  `id` int(2) NOT NULL,
  `timestamp` int(2) NOT NULL
);
INSERT INTO `t` (`id`, `timestamp`) VALUES
(1, 1),
(1, 4),
(2, 3),
(2, 1),
(2, 6),
(3, 4),
(3, 2);

CREATE TABLE IF NOT EXISTS `y` (
  `id` int(2) NOT NULL,
  `timestamp` int(2) NOT NULL
);
INSERT INTO `y` (`id`, `timestamp`) VALUES
(3, 1);

With :date = 5, I'm looking to return only the row (1, 4).

Thanks!


回眸一遍 2024-12-27 07:37:14


Select MAX(timestamp) to get the latest time for each id (rather than sorting), LEFT JOIN table y to check whether the id appears there, and add a HAVING clause after the GROUP BY so that only the appropriate groups are kept.

SELECT t.id, MAX(t.timestamp) AS latest_timestamp
FROM t
LEFT JOIN y on t.id = y.id
WHERE y.id IS NULL
GROUP BY t.id
HAVING latest_timestamp <= :date

When you GROUP BY, you can select aggregate functions. Here MAX returns the maximum value of that column across all rows in the group; since you are grouping by id, it returns the latest timestamp for each id. You only want ids that have no timestamp after :date, and that is where HAVING comes in: HAVING is essentially a WHERE that is applied to the groups after aggregation. A plain WHERE timestamp < :date would not work for this, because it discards the later rows before grouping, so an id with a row after :date would still be returned. If a row falling exactly on :date should not count as "before", use < instead of <= in the HAVING condition. Finally, you don't want ids that are in table y. So you LEFT JOIN table y and keep only rows that have no matching row in y (i.e. that id doesn't exist in table y); this part is done with a regular WHERE y.id IS NULL.
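As a quick check, the query can be run against the sample data from the question. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained); the helper names make_sample_db and latest_rows are hypothetical. It writes MAX(t.timestamp) directly in HAVING instead of the alias, and adds ORDER BY t.id so the result order is stable.

```python
import sqlite3

# Same logic as the query above; sqlite3 accepts the :date named placeholder.
QUERY = """
SELECT t.id, MAX(t.timestamp) AS latest_timestamp
FROM t
LEFT JOIN y ON t.id = y.id
WHERE y.id IS NULL
GROUP BY t.id
HAVING MAX(t.timestamp) <= :date
ORDER BY t.id
"""


def make_sample_db():
    """Build an in-memory database holding the question's sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER NOT NULL, timestamp INTEGER NOT NULL)")
    conn.execute("CREATE TABLE y (id INTEGER NOT NULL, timestamp INTEGER NOT NULL)")
    conn.executemany(
        "INSERT INTO t (id, timestamp) VALUES (?, ?)",
        [(1, 1), (1, 4), (2, 3), (2, 1), (2, 6), (3, 4), (3, 2)],
    )
    conn.executemany("INSERT INTO y (id, timestamp) VALUES (?, ?)", [(3, 1)])
    return conn


def latest_rows(conn, date):
    """Return (id, latest_timestamp) for ids with no row after `date` and no row in y."""
    return conn.execute(QUERY, {"date": date}).fetchall()


conn = make_sample_db()
rows = latest_rows(conn, 5)
print(rows)  # [(1, 4)]
```

With :date = 5 only id 1 qualifies: id 2 has a row at timestamp 6, and id 3 appears in y.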

UPDATE: To make this efficient, add indexes on the columns used for joining, grouping, and filtering. In this case, that means indexes on t.id, t.timestamp, and y.id. See dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html.
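One possible way to set up such indexes is sketched below, again using Python's sqlite3 as a stand-in for MySQL. The index names are hypothetical, and combining t.id and t.timestamp into a single composite index is an assumption of this sketch rather than something the answer prescribes; on a real MySQL server, check the plan with EXPLAIN before settling on an index layout. The sketch only confirms that the indexes exist and that the query result is unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t (id INTEGER NOT NULL, timestamp INTEGER NOT NULL);
    CREATE TABLE y (id INTEGER NOT NULL, timestamp INTEGER NOT NULL);
    INSERT INTO t (id, timestamp) VALUES
        (1, 1), (1, 4), (2, 3), (2, 1), (2, 6), (3, 4), (3, 2);
    INSERT INTO y (id, timestamp) VALUES (3, 1);

    -- Hypothetical index names. The composite index covers both the
    -- grouping column (id) and the aggregated column (timestamp).
    CREATE INDEX idx_t_id_timestamp ON t (id, timestamp);
    CREATE INDEX idx_y_id ON y (id);
    """
)

# List the indexes that now exist in the database.
indexes = sorted(
    name for (name,) in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
)

# Indexes affect only how the rows are found, not which rows come back.
rows = conn.execute(
    """
    SELECT t.id, MAX(t.timestamp) AS latest_timestamp
    FROM t
    LEFT JOIN y ON t.id = y.id
    WHERE y.id IS NULL
    GROUP BY t.id
    HAVING MAX(t.timestamp) <= :date
    ORDER BY t.id
    """,
    {"date": 5},
).fetchall()

print(indexes)
print(rows)  # [(1, 4)]
```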
