SQL 连接 + GROUPBY 从具有 MAX(Date) 的行中选择数据

发布于 2024-12-22 21:55:29 字数 1730 浏览 2 评论 0原文

我无法找到此 SQL 查询的解决方案。

架构

编辑:添加项目表

项目表

  • PK ItemID
  • 许多其他列

链接表

  • FK ItemID uniqueidentifier
  • FK TransactionID uniqueidentifier

交易表

  • PK ID uniqueidentifier
  • EntryDateTime DateTime
  • (其他几行 int、varchar...)

编辑 :我想我还没有弄清楚关系。每个项目(表未显示)可以有多个事务。多个项目可以共享同一个事务(因此有链接表)。

请参阅我当前查询的底部。我留下了这个删除线以显示问题的进展。

我想做这样的查询。诀窍是我希望 t.varchar 和 t.int 列是 MAX(t.EntryDateTime) 行中的任何值。我什至不知道 group by 是否是执行此查询的正确方法。

SELECT lt.ItemID, MAX(t.EntryDateTime), t.varchar, t.int 
FROM LinkingTable lt
LEFT JOIN Transactions t ON lt.TransactionID = t.ID
GROUP BY lt.ItemID 

该表将在此 SQL 查询中进行连接,因此请尝试为我提供性能最佳的解决方案。假设 Table1 将包含数百万条记录。

SELECT 
(many columns)
FROM Table1
LEFT JOIN Table2 ON Table1.Table2ID = Table2.ID
LEFT JOIN Table3 ON ....
LEFT JOIN Table4 ON (Table2.ID = Table4.Table2ID and Table4.LocaleID = 127 and Table4.Type = 0)
LEFT JOIN **the query above** AS vTable1 ON  vTable1.ItemID = Table1.ID
WHERE Table1.CheckID IN (SELECT ID FROM Checks WHERE ....)

编辑:这是我的查询正在工作,但我不确定它是最有效的。 LinkingTable 有大约 200k 条记录,运行需要 6 秒。

SELECT DISTINCT lt.ItemID, t.EntryDateTime,  t.varchar, t.int
FROM LinkingTable lt 
     LEFT JOIN Transactions t ON t.id = (SELECT Top 1 t2.id FROM LinkingTable lt2
LEFT JOIN Transactions t2 on lt2.TransactionID = t2.ID
where lt2.ItemID = lt.ItemID ORDER BY t2.PrintTime DESC)

I'm having trouble figuring out the solution to this SQL query.

Schema

Edit: Adding Item Table

Item Table

  • PK ItemID
  • lots of other columns

Linking Table

  • FK ItemID uniqueidentifier
  • FK TransactionID uniqueidentifier

Transaction Table

  • PK ID uniqueidentifier
  • EntryDateTime DateTime
  • (several other rows of int, varchar...)

Edit : I think I haven't made the relationships clear. Each ITEM (table not shown) can have multiple transactions. Multiple items can share the same transaction (hence the linking table).

Please see the bottom for my current Query. I have left this striked to show the progression of the question.


I want to do something like this query. The trick is I want the t.varchar and t.int columns to be whatever values are in the MAX(t.EntryDateTime) row. I don't even know if group by is the right way to do this query.

SELECT lt.ItemID, MAX(t.EntryDateTime), t.varchar, t.int 
FROM LinkingTable lt
LEFT JOIN Transactions t ON lt.TransactionID = t.ID
GROUP BY lt.ItemID 


This table is going to be joined against in this SQL query, so please try and give me the most performant solution . Assume Table1 will contain millions of records.

SELECT 
(many columns)
FROM Table1
LEFT JOIN Table2 ON Table1.Table2ID = Table2.ID
LEFT JOIN Table3 ON ....
LEFT JOIN Table4 ON (Table2.ID = Table4.Table2ID and Table4.LocaleID = 127 and Table4.Type = 0)
LEFT JOIN **the query above** AS vTable1 ON  vTable1.ItemID = Table1.ID
WHERE Table1.CheckID IN (SELECT ID FROM Checks WHERE ....)

Edit : This is the query I have that is working, but I'm not sure its the most efficient. LinkingTable has ~ 200k records and its taking 6 seconds to run.

SELECT DISTINCT lt.ItemID, t.EntryDateTime,  t.varchar, t.int
FROM LinkingTable lt 
     LEFT JOIN Transactions t ON t.id = (SELECT Top 1 t2.id FROM LinkingTable lt2
LEFT JOIN Transactions t2 on lt2.TransactionID = t2.ID
where lt2.ItemID = lt.ItemID ORDER BY t2.PrintTime DESC)

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

沐歌 2024-12-29 21:55:29

尝试一下,

SELECT i.*, outerT.EntryDateTime, outerT.varchar, outerT.int
FROM Item i
     LEFT JOIN
     (SELECT ItemId AS outerItemId, EntryDateTime, varchar, int
      FROM (SELECT ROW_NUMBER() OVER (PARTITION BY lt.ItemId ORDER BY t.EntryDateTime) AS RowNumber, lt.ItemId, t.EntryDateTime, t.varchar, t.int
            FROM Tranaction t INNER JOIN LinkingTable lt ON lt.TransactionId = t.ID) innerT
      WHERE RowNumber = 1) outerT ON outerT.outerItemId = Item.ID

希望这能解决您的问题

Try this,

SELECT i.*, outerT.EntryDateTime, outerT.varchar, outerT.int
FROM Item i
     LEFT JOIN
     (SELECT ItemId AS outerItemId, EntryDateTime, varchar, int
      FROM (SELECT ROW_NUMBER() OVER (PARTITION BY lt.ItemId ORDER BY t.EntryDateTime) AS RowNumber, lt.ItemId, t.EntryDateTime, t.varchar, t.int
            FROM Tranaction t INNER JOIN LinkingTable lt ON lt.TransactionId = t.ID) innerT
      WHERE RowNumber = 1) outerT ON outerT.outerItemId = Item.ID

Hope this solves your problem

栩栩如生 2024-12-29 21:55:29

即使有超过一百万条记录,您也会遇到一些性能问题,但我会确保基于( ItemID,主键)对事务表建立索引。原因是主键而不是日期——如果它是自动递增的,并且它是在事务发生时标记的日期/时间戳,那么它们本质上是一一的。文件中的最后一个条目始终具有最新日期。也就是说,ID 列使用索引应该比日期/时间更快。这也避免了查看最近日期的两个元素以及与该日期关联的交易 ID 的需要。这是我首先尝试查询的方式。

select 
      I.*,
      T2.*
   from
      Item I
         JOIN 
            ( select T.ItemID, MAX( T.PrimaryKey ) as LastEntryPerItem 
                 from Transactions T
                 group by T.ItemID ) MaxPerItem
            ON I.ItemID = T.ItemID

            JOIN Transactions T2
               on MaxPerItem.LastEntryPerItem = T2.PrimaryKey
   order by
      whatever

Even with a million plus records, you will have some performance hits, but I would ensure and index on the transaction table based on the ( ItemID, Primary Key ). The reason Primary Key and not the date -- if its auto-incremented, and it's date/time stamped at time the transaction occurs, they will be in-essence, one-in-the-same. The last entry in the file will always have the latest date. That said, an ID column should be faster with index than a date/time. This also prevents need of looking at BOTH elements of most recent date, and the transaction ID associated with that date. Here is how I would FIRST attempt the query.

select 
      I.*,
      T2.*
   from
      Item I
         JOIN 
            ( select T.ItemID, MAX( T.PrimaryKey ) as LastEntryPerItem 
                 from Transactions T
                 group by T.ItemID ) MaxPerItem
            ON I.ItemID = T.ItemID

            JOIN Transactions T2
               on MaxPerItem.LastEntryPerItem = T2.PrimaryKey
   order by
      whatever
魂牵梦绕锁你心扉 2024-12-29 21:55:29
select lt.ItemId, t.entrydatetime, t.varchar, t.int
from LinkingTable lt
left join transactions t 
     on lt.transactionId = t.id
        and t.entryDateTime = (select max(t.EntryDateTime)
                               from transactions t2
                               where t2.id = t.id)

我之前也有过类似的问题
( SQL 连接以获取属于最近日期的值< /a>)。 JNK 还有另一种解决方案,涉及两个连接,可能会更快。我已经在下面发帖了。您需要进行测试,看看哪个性能更好。

select lt.ItemId, t.entrydatetime, t.varchar, t.int
from LinkingTable lt
inner join transactions t 
   on lt.ItemId= t.ItemId
Inner join (SELECT ItemId, MAX(entrydatetime) entrydatetime
            FROM transactions t2
            GROUP BY ItemId) SubQ
ON SubQ.ItemId= t.ItemId
AND SubQ.entrydatetime= t.entrydatetime
select lt.ItemId, t.entrydatetime, t.varchar, t.int
from LinkingTable lt
left join transactions t 
     on lt.transactionId = t.id
        and t.entryDateTime = (select max(t.EntryDateTime)
                               from transactions t2
                               where t2.id = t.id)

I had a similar question before
( SQL Join to get value belong with most recent date). There's another solution by JNK involving two joins which may be faster. I've posted below. You'll need to test to see which performs better.

select lt.ItemId, t.entrydatetime, t.varchar, t.int
from LinkingTable lt
inner join transactions t 
   on lt.ItemId= t.ItemId
Inner join (SELECT ItemId, MAX(entrydatetime) entrydatetime
            FROM transactions t2
            GROUP BY ItemId) SubQ
ON SubQ.ItemId= t.ItemId
AND SubQ.entrydatetime= t.entrydatetime
不交电费瞎发啥光 2024-12-29 21:55:29

为什么不创建一个包含所有“多列”的视图,然后对该视图运行查询?

Why don't you create a view that has all your "many columns" and then run a query against that view?

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文