WHERE子句包含IN(子查询)的查询性能

发布于 2024-11-06 19:00:35 字数 1453 浏览 0 评论 0原文

SELECT Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate, 
       SUM(TradeLine.Notional) / 1000 AS Expr1
FROM   Trade INNER JOIN
             TradeLine ON Trade.TradeId = TradeLine.TradeId
WHERE  (TradeLine.Id IN
                      (SELECT     PairOffId
                        FROM          TradeLine AS TradeLine_1
                        WHERE      (TradeDate <= '2011-05-11')
                        GROUP BY PairOffId
                        HAVING      (SUM(Notional) <> 0)))
GROUP BY Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate
ORDER BY Trade.Type, Trade.TradeDate

当表开始增长时,我担心 WHERE 子句中 IN 的性能。对于这种查询,有人有更好的策略吗?子查询返回的记录数的增长速度比 TradeLine 表中的记录数增长慢得多。 TradeLine 表本身以 10/天的速度增长。

谢谢。

编辑: 我使用了将子查询从 WHERE 移至 FROM 的想法。我对促成这个新查询的所有答案都投票了。

   SELECT Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate,   
          PairOff.Notional / 1000 AS Expr1
   FROM         Trade INNER JOIN
                  TradeLine ON Trade.TradeId = TradeLine.TradeId INNER JOIN
                      (SELECT     PairOffId, SUM(Notional) AS Notional
                        FROM          TradeLine AS TradeLine_1
                        WHERE      (TradeDate <= '2011-05-11')
                        GROUP BY PairOffId
                   HAVING (SUM(Notional) <> 0)) AS PairOff ON TradeLine.Id = PairOff.PairOffId
   ORDER BY Trade.Type, Trade.TradeDate
SELECT Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate, 
       SUM(TradeLine.Notional) / 1000 AS Expr1
FROM   Trade INNER JOIN
             TradeLine ON Trade.TradeId = TradeLine.TradeId
WHERE  (TradeLine.Id IN
                      (SELECT     PairOffId
                        FROM          TradeLine AS TradeLine_1
                        WHERE      (TradeDate <= '2011-05-11')
                        GROUP BY PairOffId
                        HAVING      (SUM(Notional) <> 0)))
GROUP BY Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate
ORDER BY Trade.Type, Trade.TradeDate

I am concerned about the performance of the IN in the WHERE clause when the table starts to grow. Does anyone have a better strategy for this kind of query? The number of records returned by the subquery grows much slower than the number of records in the TradeLine table. The TradeLine table itself grows at a rate of 10/day.

Thank you.

EDIT:
I used the idea of moving the subquery from WHERE to FROM. I voted up on all answers that contributed to this new query.

   SELECT Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate,   
          PairOff.Notional / 1000 AS Expr1
   FROM         Trade INNER JOIN
                  TradeLine ON Trade.TradeId = TradeLine.TradeId INNER JOIN
                      (SELECT     PairOffId, SUM(Notional) AS Notional
                        FROM          TradeLine AS TradeLine_1
                        WHERE      (TradeDate <= '2011-05-11')
                        GROUP BY PairOffId
                   HAVING (SUM(Notional) <> 0)) AS PairOff ON TradeLine.Id = PairOff.PairOffId
   ORDER BY Trade.Type, Trade.TradeDate

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

幽蝶幻影 2024-11-13 19:00:36

我在 XXXXXX 数据库中的数十万条记录中遇到了同样的问题。在我的代码中,我想从所有节点检索层次结构(至少包含一个子节点)节点。

输入图片这里的描述

写入的初始查询非常慢。

  SELECT SUPPLIER_ID, PARENT_SUPPLIER_ID,
  FROM SUPPLIER
  WHERE 
    SUPPLIER_ID != PARENT_SUPPLIER_ID
    OR 
    SUPPLIER_ID   IN
      (SELECT DISTINCT PARENT_SUPPLIER_ID
       FROM SUPPLIER
       WHERE SUPPLIER_ID != PARENT_SUPPLIER_ID
      );

然后重新写入

  SELECT a.SUPPLIER_ID, a.PARENT_SUPPLIER_ID,
  FROM SUPPLIER a
  LEFT JOIN
  (SELECT DISTINCT PARENT_SUPPLIER_ID
  FROM SUPPLIER
  WHERE SUPPLIER_ID != PARENT_SUPPLIER_ID
  ) b
  ON a. SUPPLIER_ID     = b.PARENT_SUPPLIER_ID
  WHERE a. SUPPLIER_ID != a.PARENT_SUPPLIER_ID
     OR a. SUPPLIER_ID     = b.PARENT_SUPPLIER_ID;

I have faced same issue with hundreds of thousands of records in XXXXXX DB. In my code i want to retrieve hierarchy (node which contain at least one child) nodes from all nodes.

enter image description here

The initial query written which is very slow.

  SELECT SUPPLIER_ID, PARENT_SUPPLIER_ID,
  FROM SUPPLIER
  WHERE 
    SUPPLIER_ID != PARENT_SUPPLIER_ID
    OR 
    SUPPLIER_ID   IN
      (SELECT DISTINCT PARENT_SUPPLIER_ID
       FROM SUPPLIER
       WHERE SUPPLIER_ID != PARENT_SUPPLIER_ID
      );

Then re-written to

  SELECT a.SUPPLIER_ID, a.PARENT_SUPPLIER_ID,
  FROM SUPPLIER a
  LEFT JOIN
  (SELECT DISTINCT PARENT_SUPPLIER_ID
  FROM SUPPLIER
  WHERE SUPPLIER_ID != PARENT_SUPPLIER_ID
  ) b
  ON a. SUPPLIER_ID     = b.PARENT_SUPPLIER_ID
  WHERE a. SUPPLIER_ID != a.PARENT_SUPPLIER_ID
     OR a. SUPPLIER_ID     = b.PARENT_SUPPLIER_ID;
昵称有卵用 2024-11-13 19:00:36

使用 IN 实际上会强制您进行表扫描。当表增长时,执行时间也会增长。此外,您还对返回的每条记录运行该查询。使用标量选择作为表会更容易:

SELECT t.TradeId, t.Type, t.Symbol, t.TradeDate, 
       SUM(TradeLine.Notional) / 1000 AS Expr1
FROM   Trade t,
(SELECT     TradeId, PairOffID
                        FROM          TradeLine AS TradeLine_1
                        WHERE      (TradeDate <= '2011-05-11')
                        GROUP BY PairOffId
                        HAVING      (SUM(Notional) <> 0)) tl       
WHERE  t.TradeId = tl.TradeId
  and  t.id <> tl.PairOffID
GROUP BY Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate
ORDER BY Trade.Type, Trade.TradeDate

Using an IN will essentially force you to do a table scan. When your table grows, your execution time grows. Also you are running that query for each record returned. Would be easier to use a scalar select as a table:

SELECT t.TradeId, t.Type, t.Symbol, t.TradeDate, 
       SUM(TradeLine.Notional) / 1000 AS Expr1
FROM   Trade t,
(SELECT     TradeId, PairOffID
                        FROM          TradeLine AS TradeLine_1
                        WHERE      (TradeDate <= '2011-05-11')
                        GROUP BY PairOffId
                        HAVING      (SUM(Notional) <> 0)) tl       
WHERE  t.TradeId = tl.TradeId
  and  t.id <> tl.PairOffID
GROUP BY Trade.TradeId, Trade.Type, Trade.Symbol, Trade.TradeDate
ORDER BY Trade.Type, Trade.TradeDate
不再见 2024-11-13 19:00:35

IN 子句中的子查询不依赖于外部查询中的任何内容。您可以安全地将其移至 FROM 子句中;一个理智的查询计划构建器会自动完成它。

此外,必须对要在生产中使用的任何查询调用 EXPLAIN PLAN。执行此操作并查看 DBMS 对此查询计划的看法。

The subquery in the IN clause does not depend on anything in the outer query. You can safely move it into FROM clause; a sane query plan builder would do it automatically.

Also, calling EXPLAIN PLAN on any query you're going to use in production is a must. Do it and see what the DBMS thinks of the plan for this query.

小嗷兮 2024-11-13 19:00:35

当子查询开始返回太大的结果集时,我很喜欢临时表。

因此,您的 where 子句将

Where TradeLine.Id In (Select PairOffId From #tempResults)

被定义为 #tempResults (警告:语法来自内存,这意味着可能存在错误)

Select PairOffId Into #tempResults
From TradeLine
Where (TradeDate <= @TradeDate) 
  //I prefer params in case the query becomes a StoredProc
Group By PairOffId
Having (Sum(Notional) <> 0)

I'm a fan of temp tables when a sub-query starts returning too large a result set.

So your where clause would just be

Where TradeLine.Id In (Select PairOffId From #tempResults)

and #tempResults would be defined as (warning: syntax is from memory, which means there may be errors)

Select PairOffId Into #tempResults
From TradeLine
Where (TradeDate <= @TradeDate) 
  //I prefer params in case the query becomes a StoredProc
Group By PairOffId
Having (Sum(Notional) <> 0)
债姬 2024-11-13 19:00:35

我有两个建议你可以尝试:

1)。使用 Exists 因为您不需要从子查询获取数据,如下所示:

存在的地方(从 TradeLine 选择 1
AS TradeLine_1
其中 TradeLine.Id = TradeLine_1.PairOffId
-- 继续您的子查询...)

2).例如,主查询连接到您的子查询

...加入 (your_subquery)
your_subquery.PairOffId = TradeLine.Id

我相信这两种方式可以实现比“In”操作更好的性能。

I have 2 suggestion you can try:

1). use Exists since you don't need get data from subquery, like this:

where exists ( select 1 from TradeLine
AS TradeLine_1
where TradeLine.Id = TradeLine_1.PairOffId
-- continue with your subquery ... )

2). main query join to your subquery, for instance

... join ( your_subquery) on
your_subquery.PairOffId = TradeLine.Id

I believe these 2 ways can achieve better performance than "In" operation.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文