MySQL - Joining two tables with a datetime column and the three most recent entries before that date

Posted 2024-12-02 21:29:27

I have two tables in MySQL.

For example Table1 - ItemPrice:

DATETIME   |  ITEM | PRICE
2011-08-28 |  ABC  | 123
2011-09-01 |  ABC  | 125
2011-09-02 |  ABC  | 124
2011-09-03 |  ABC  | 127
2011-09-04 |  ABC  | 126

Table2 - DayScore:

DATETIME   |  ITEM | SCORE
2011-08-28 |  ABC  | 1
2011-08-29 |  ABC  | 8
2011-09-01 |  ABC  | 4
2011-09-02 |  ABC  | 2
2011-09-03 |  ABC  | 7
2011-09-04 |  ABC  | 3

I want to write a query which, given an item ID (e.g. ABC), will return the price at that date from ItemPrice (if there is no price for that date then the query should not return anything). If a valid price is found for the query date, the query should also return (in 9 columns):

  • the price of the item from ItemPrice for the past three days (i.e. the most recent 3 prices before the date queried),
  • in the next three columns, the matching score from DayScore for those 3 dates selected from ItemPrice,
  • finally, the dates (t-1 to t-3) selected.

In other words, the result of this query for just date='2011-09-03' and item='ABC', as an example, would be:

DATE      |  ITEM  |  PRICE  |  SCR  | PRC_t-1 | PRC_t-2 | PRC_t-3 | SCR_t-1 | SCR_t-2 | SCR_t-3 | DATE_t-1  | DATE_t-2  | DATE_t-3 
2011-09-03|  ABC   |  127    |  7    | 124     | 125     | 123     | 2       | 4       | 1       | 2011-09-02| 2011-09-01| 2011-08-28
....

And so on for each date that appears in the ItemPrice table.

What is the neatest and most efficient way to run this query (as it's something that will be run over many millions of rows)?

Cheers!

Comments (1)

つ可否回来 2024-12-09 21:29:27

It isn't pretty, but it does produce the results. You could probably get rid of some subselects and make it a bit less SQL, but I tried to build it up in steps so you can deduce what it is doing.

The core part is this select:

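-- Each subquery also matches on Item so the chain of dates never crosses items.
SELECT 
  Sub2.*
  -- T_3: latest price date before T_2
, (SELECT MAX(IP3.DateTime) FROM ItemPrice IP3 WHERE IP3.Item = Sub2.Item AND IP3.DateTime < Sub2.T_2) AS T_3
FROM
   (SELECT 
        Sub1.*
        -- T_2: latest price date before T_1
      , (SELECT MAX(IP2.DateTime) FROM ItemPrice IP2 WHERE IP2.Item = Sub1.Item AND IP2.DateTime < Sub1.T_1) AS T_2
    FROM
       (SELECT 
            ItemPrice.Item
          , ItemPrice.DateTime
            -- T_1: latest price date before the row's own date
          , (SELECT MAX(IP.DateTime) FROM ItemPrice IP WHERE IP.Item = ItemPrice.Item AND IP.DateTime < ItemPrice.DateTime) AS T_1 
        FROM ItemPrice) Sub1
   ) Sub2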

This returns a table with the dates (now, t-1, t-2, t-3). From there it is a simple matter of joining with the price and score for each of those dates. The whole thing, including test data, then becomes this bulk of SQL:

/*
CREATE TABLE ItemPrice (datetime Date, item varchar(3), price int);
CREATE TABLE DayScore ( datetime Date, item varchar(3), score int);

INSERT INTO ItemPrice VALUES ('20110828', 'ABC', 123);
INSERT INTO ItemPrice VALUES ('20110901', 'ABC', 125);
INSERT INTO ItemPrice VALUES ('20110902', 'ABC', 124);
INSERT INTO ItemPrice VALUES ('20110903', 'ABC', 127);
INSERT INTO ItemPrice VALUES ('20110904', 'ABC', 126);

INSERT INTO DayScore VALUES ('20110828', 'ABC', 1);
INSERT INTO DayScore VALUES ('20110829', 'ABC', 8);
INSERT INTO DayScore VALUES ('20110901', 'ABC', 4);
INSERT INTO DayScore VALUES ('20110902', 'ABC', 2);
INSERT INTO DayScore VALUES ('20110903', 'ABC', 7);
INSERT INTO DayScore VALUES ('20110904', 'ABC', 3);
*/

-- Alias casing is kept consistent (Current, Minus1..Minus3) because MySQL table
-- aliases are case-sensitive on case-sensitive file systems.
SELECT Hist.*, Current.Price, Current.Score
, Minus1.Price AS PRC_1, Minus1.Score AS SCR_1
, Minus2.Price AS PRC_2, Minus2.Score AS SCR_2
, Minus3.Price AS PRC_3, Minus3.Score AS SCR_3
FROM 
    -- Hist: one row per (Item, DateTime) with the three preceding price dates T_1..T_3
    (SELECT Sub2.*, (SELECT MAX(IP3.DateTime) FROM ItemPrice IP3 WHERE IP3.Item = Sub2.Item AND IP3.DateTime < Sub2.T_2) AS T_3
    FROM
        (SELECT Sub1.*, (SELECT MAX(IP2.DateTime) FROM ItemPrice IP2 WHERE IP2.Item = Sub1.Item AND IP2.DateTime < Sub1.T_1) AS T_2
        FROM
            (SELECT ItemPrice.Item, ItemPrice.DateTime, (SELECT MAX(IP.DateTime) FROM ItemPrice IP WHERE IP.Item = ItemPrice.Item AND IP.DateTime < ItemPrice.DateTime) AS T_1 FROM ItemPrice) Sub1) Sub2) Hist 
-- Current: price and score on the row's own date
INNER JOIN
    (SELECT ItemPrice.DateTime, ItemPrice.Item, ItemPrice.Price, DayScore.Score FROM ItemPrice INNER JOIN DayScore ON (ItemPrice.Item = DayScore.Item AND ItemPrice.DateTime = DayScore.DateTime)) Current
ON (Current.Item = Hist.Item AND Current.DateTime = Hist.DateTime)
-- Minus1..Minus3: price and score on T_1..T_3 (NULL when there is no earlier date)
LEFT JOIN 
    (SELECT ItemPrice.DateTime, ItemPrice.Item, ItemPrice.Price, DayScore.Score FROM ItemPrice INNER JOIN DayScore ON (ItemPrice.Item = DayScore.Item AND ItemPrice.DateTime = DayScore.DateTime)) Minus1
ON (Minus1.Item = Hist.Item AND Minus1.DateTime = Hist.T_1)
LEFT JOIN 
    (SELECT ItemPrice.DateTime, ItemPrice.Item, ItemPrice.Price, DayScore.Score FROM ItemPrice INNER JOIN DayScore ON (ItemPrice.Item = DayScore.Item AND ItemPrice.DateTime = DayScore.DateTime)) Minus2
ON (Minus2.Item = Hist.Item AND Minus2.DateTime = Hist.T_2)
LEFT JOIN 
    (SELECT ItemPrice.DateTime, ItemPrice.Item, ItemPrice.Price, DayScore.Score FROM ItemPrice INNER JOIN DayScore ON (ItemPrice.Item = DayScore.Item AND ItemPrice.DateTime = DayScore.DateTime)) Minus3
ON (Minus3.Item = Hist.Item AND Minus3.DateTime = Hist.T_3)
WHERE Hist.Item = 'ABC'
;

/*
DROP TABLE ItemPrice;
DROP TABLE DayScore;
*/
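
To make the intermediate step concrete, here is a minimal sketch that loads the sample prices into an in-memory SQLite database and runs the date-chaining select on its own. SQLite is used only as a convenient stand-in for MySQL, and dates are stored as ISO strings so that `<` and `MAX` order them chronologically. The Python wrapper and the `chain_dates` helper name are assumptions for illustration, not part of the original answer. The subqueries here match on Item as well as on date, so each chain stays within one item.

```python
import sqlite3

# Sample price rows from the question (ISO date strings sort chronologically).
PRICES = [
    ("2011-08-28", "ABC", 123),
    ("2011-09-01", "ABC", 125),
    ("2011-09-02", "ABC", 124),
    ("2011-09-03", "ABC", 127),
    ("2011-09-04", "ABC", 126),
]

# Each level looks up the latest price date strictly before the previous level.
CHAIN_SQL = """
SELECT Sub2.Item, Sub2.DateTime, Sub2.T_1, Sub2.T_2,
       (SELECT MAX(IP3.DateTime) FROM ItemPrice IP3
         WHERE IP3.Item = Sub2.Item AND IP3.DateTime < Sub2.T_2) AS T_3
FROM (SELECT Sub1.Item, Sub1.DateTime, Sub1.T_1,
             (SELECT MAX(IP2.DateTime) FROM ItemPrice IP2
               WHERE IP2.Item = Sub1.Item AND IP2.DateTime < Sub1.T_1) AS T_2
      FROM (SELECT P.Item, P.DateTime,
                   (SELECT MAX(IP.DateTime) FROM ItemPrice IP
                     WHERE IP.Item = P.Item AND IP.DateTime < P.DateTime) AS T_1
            FROM ItemPrice P) Sub1) Sub2
ORDER BY Sub2.Item, Sub2.DateTime
"""


def chain_dates(prices):
    """Return (Item, DateTime, T_1, T_2, T_3) tuples for the given price rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ItemPrice (DateTime TEXT, Item TEXT, Price INTEGER)")
    conn.executemany("INSERT INTO ItemPrice VALUES (?, ?, ?)", prices)
    rows = conn.execute(CHAIN_SQL).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in chain_dates(PRICES):
        print(row)
```

Rows near the start of an item's history get NULL (None in Python) for the dates that do not exist, which is why the full query uses LEFT JOINs for the t-1 to t-3 lookups.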

I'm curious about your explain plan when you do this on 1M rows :) It might not even be that horrible if you have the right indexes, which you probably do.
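
On the efficiency point, if MySQL 8.0 or later is available, window functions are a possible alternative to the nested MAX() subqueries: LAG() reads the previous rows within each item in one ordered pass. The following is a minimal sketch under stated assumptions, not a benchmarked solution. It runs through SQLite from Python only so that it is self-contained; the `LAG(...) OVER` and `WINDOW` syntax it uses is also accepted by MySQL 8.0. The index names and the `lagged_rows` helper are hypothetical. Note one behavioural difference: because of the inner join, a price date with no DayScore row drops out entirely here, rather than appearing as a t-1 date with NULL values.

```python
import sqlite3

# Schema and sample data from the question; dates are ISO strings so that
# ORDER BY sorts them chronologically in SQLite.
SETUP_SQL = """
CREATE TABLE ItemPrice (DateTime TEXT, Item TEXT, Price INTEGER);
CREATE TABLE DayScore  (DateTime TEXT, Item TEXT, Score INTEGER);
-- Hypothetical composite indexes (names are assumptions): they match the
-- join keys and the PARTITION BY / ORDER BY columns of the window.
CREATE INDEX idx_itemprice_item_dt ON ItemPrice (Item, DateTime);
CREATE INDEX idx_dayscore_item_dt  ON DayScore  (Item, DateTime);
INSERT INTO ItemPrice VALUES
  ('2011-08-28', 'ABC', 123), ('2011-09-01', 'ABC', 125),
  ('2011-09-02', 'ABC', 124), ('2011-09-03', 'ABC', 127),
  ('2011-09-04', 'ABC', 126);
INSERT INTO DayScore VALUES
  ('2011-08-28', 'ABC', 1), ('2011-08-29', 'ABC', 8),
  ('2011-09-01', 'ABC', 4), ('2011-09-02', 'ABC', 2),
  ('2011-09-03', 'ABC', 7), ('2011-09-04', 'ABC', 3);
"""

# Filtering by item in WHERE is safe because the window is partitioned by item.
# To fetch a single date, wrap this query in an outer SELECT and filter there;
# a date filter in this WHERE would remove the earlier rows LAG() needs.
LAG_SQL = """
SELECT p.DateTime, p.Item, p.Price, s.Score,
       LAG(p.Price, 1)    OVER w AS PRC_1,
       LAG(p.Price, 2)    OVER w AS PRC_2,
       LAG(p.Price, 3)    OVER w AS PRC_3,
       LAG(s.Score, 1)    OVER w AS SCR_1,
       LAG(s.Score, 2)    OVER w AS SCR_2,
       LAG(s.Score, 3)    OVER w AS SCR_3,
       LAG(p.DateTime, 1) OVER w AS DATE_1,
       LAG(p.DateTime, 2) OVER w AS DATE_2,
       LAG(p.DateTime, 3) OVER w AS DATE_3
FROM ItemPrice p
JOIN DayScore s ON s.Item = p.Item AND s.DateTime = p.DateTime
WHERE p.Item = ?
WINDOW w AS (PARTITION BY p.Item ORDER BY p.DateTime)
ORDER BY p.DateTime
"""


def lagged_rows(item):
    """Return one 13-column row per priced date of `item`, oldest first."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP_SQL)
    rows = conn.execute(LAG_SQL, (item,)).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in lagged_rows("ABC"):
        print(row)
```

The column order follows the result layout in the question (DATE, ITEM, PRICE, SCR, then the three prices, scores and dates). Whether this beats the subquery version on millions of rows depends on the data and the indexes, so comparing the EXPLAIN output of both is still worthwhile.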
