WHERE 子句中的子查询查询始终超时

发布于 2024-11-08 04:23:28 字数 1310 浏览 0 评论 0原文

我有以下查询（为清楚起见略有修改）：

CREATE PROCEDURE Kctc.CaseTasks_GetCaseTasks
@CaseNumber int
... other parameters
,@ChangedBefore datetime
,@ChangedAfter datetime
AS
SELECT Kctc.CaseTasks.CaseTaskId
  ...blah blah blah
  FROM Kctc.CaseTasks
  ... some joins here
  WHERE  
  ... some normal where clauses
  AND 
  (
    (@ChangedAfter IS NULL AND @ChangedBefore IS NULL)
    OR
    EXISTS (SELECT *
               FROM Kctc.FieldChanges
               WHERE Kctc.FieldChanges.RecordId = Kctc.CaseTasks.CaseTaskId AND 
                     Kctc.FieldChanges.TableName = 'CaseTasks' AND 
                     Kctc.FieldChanges.DateOfChange BETWEEN
                         ISNULL(@ChangedAfter, '2000/01/01') AND
                         ISNULL(@ChangedBefore, '2050/01/01'))
  )

每当用户指定 @ChangedBefore 或 @ChangedAfter 的值时，此查询就会超时，因此调用子查询。

子查询检查名为 FieldChanges 的表中是否存在记录（它有效地记录 CaseTasks 表中每个字段的更改）。

查询 FieldChanges 效率不高，因为它涉及对未索引的文本字段 TableName 进行过滤。而且我知道子查询本质上是低效的。

所以我的一般问题是，有没有办法重新设计查询以使其性能更好？

当存在多个关联的 FieldChanges 时（即保留 EXISTS 语义），我想不出一种方法将子查询表示为联接，同时仍然只返回一个 CaseTask 行。我尚未对 FieldChanges 表的 TableName 字段建立索引，因为我对对文本字段建立索引犹豫不决。

那我该怎么办呢？

原文

I have the following query (slightly amended for clarity):

CREATE PROCEDURE Kctc.CaseTasks_GetCaseTasks
@CaseNumber int
... other parameters
,@ChangedBefore datetime
,@ChangedAfter datetime
AS
SELECT Kctc.CaseTasks.CaseTaskId
  ...blah blah blah
  FROM Kctc.CaseTasks
  ... some joins here
  WHERE  
  ... some normal where clauses
  AND 
  (
    (@ChangedAfter IS NULL AND @ChangedBefore IS NULL)
    OR
    EXISTS (SELECT *
               FROM Kctc.FieldChanges
               WHERE Kctc.FieldChanges.RecordId = Kctc.CaseTasks.CaseTaskId AND 
                     Kctc.FieldChanges.TableName = 'CaseTasks' AND 
                     Kctc.FieldChanges.DateOfChange BETWEEN
                         ISNULL(@ChangedAfter, '2000/01/01') AND
                         ISNULL(@ChangedBefore, '2050/01/01'))
  )

This query times out whenever the user specifies values for @ChangedBefore or @ChangedAfter, therefore invoking the subquery.

The subquery checks for the existence of a record in the table called FieldChanges (which effectively records changes to every field in the CaseTasks table).

Querying FieldChanges isn't very efficient because it involves filtering on the text field TableName which isn't indexed. And I know that subqueries are inherently inefficient.

So my question in general is, is there a way to redesign the query so that it performs better?

I can't think of a way to express the subquery as a join while still returning just one CaseTask row when there are multiple associated FieldChanges (i.e. preserving the EXISTS semantic). I haven't yet indexed the TableName field of the FieldChanges table because I'm hesitant about indexing text fields.

So what should I do?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

枫以 2024-11-15 04:23:28

作为第一步，您可以尝试在表 Kctc.FieldChanges 的 RecordId、TableName 和 DateOfChange 字段上放置一个索引（一个包含所有三个字段的索引），看看是否有帮助。

分享并享受。

回复收藏 0 原文

幽蝶幻影 2024-11-15 04:23:28

我的第一直觉是限制结果集

SELECT *
FROM Kctc.FieldChanges
WHERE Kctc.FieldChanges.RecordId = Kctc.CaseTasks.CaseTaskId AND
    Kctc.FieldChanges.TableName = 'CaseTasks' AND 
    Kctc.FieldChanges.DateOfChange BETWEEN
        ISNULL(@ChangedAfter, '2000/01/01') AND
        ISNULL(@ChangedBefore, '2050/01/01'

更改，

SELECT TOP 1 Kctc.FieldChanges.RecordId 
FROM Kctc.FieldChanges
WHERE Kctc.FieldChanges.RecordId = Kctc.CaseTasks.CaseTaskId AND
     Kctc.FieldChanges.TableName = 'CaseTasks' AND 
     Kctc.FieldChanges.DateOfChange BETWEEN
         ISNULL(@ChangedAfter, '2000/01/01') AND
         ISNULL(@ChangedBefore, '2050/01/01'

然后查看 where 子句中字段的索引

编辑：关于 TOP 1 - 可能不会带来那么多好处，但不应该造成伤害，并且可能有助于避免表扫描。使用单个字段而不是 * 应该只返回该列（我假设它不是 NULL 值列）

其他想法：声明并设置本地值而不是多次处理的 ISNULL 事物：

DECLARE @checkmyafter datetime; -- assumption on my part here on the type
SET @checkmyafter = ISNULL(@ChangedAfter, '2000/01/01');

与之前执行相同的操作，然后使用

...
 SELECT TOP 1 Kctc.FieldChanges.RecordId 
    FROM Kctc.FieldChanges
    WHERE Kctc.FieldChanges.RecordId = Kctc.CaseTasks.CaseTaskId AND
         Kctc.FieldChanges.TableName = 'CaseTasks' AND 
         Kctc.FieldChanges.DateOfChange BETWEEN
             @checkmybefore  AND  @checkmyafter 
...

另一件事：检查 WHERE xxx AND 的序列 - 使用最有可能的候选者来隔离序列中的第一个，无论是哪个条件，这样它就可以更快地退出。如果是RecordId，则查找，如果TableName更好，则先使用它。如果一列也有索引，则认为其他条件都相同。

My first instinct is to limit the result set

SELECT *
FROM Kctc.FieldChanges
WHERE Kctc.FieldChanges.RecordId = Kctc.CaseTasks.CaseTaskId AND
    Kctc.FieldChanges.TableName = 'CaseTasks' AND 
    Kctc.FieldChanges.DateOfChange BETWEEN
        ISNULL(@ChangedAfter, '2000/01/01') AND
        ISNULL(@ChangedBefore, '2050/01/01'

changed to

SELECT TOP 1 Kctc.FieldChanges.RecordId 
FROM Kctc.FieldChanges
WHERE Kctc.FieldChanges.RecordId = Kctc.CaseTasks.CaseTaskId AND
     Kctc.FieldChanges.TableName = 'CaseTasks' AND 
     Kctc.FieldChanges.DateOfChange BETWEEN
         ISNULL(@ChangedAfter, '2000/01/01') AND
         ISNULL(@ChangedBefore, '2050/01/01'

then look at the index on the fields in the where clause

EDIT: Regarding the TOP 1 - likely not giving that much of a benefit, but should not hurt, and might help avoid a table scan. Using a single field instead of * should only return that column (I assume it is NOT a NULL value column here)

Additional thoughts: Declare and Set a local value instead of the ISNULL thing which gets processed multiple times:

DECLARE @checkmyafter datetime; -- assumption on my part here on the type
SET @checkmyafter = ISNULL(@ChangedAfter, '2000/01/01');

do same with before, then use

...
 SELECT TOP 1 Kctc.FieldChanges.RecordId 
    FROM Kctc.FieldChanges
    WHERE Kctc.FieldChanges.RecordId = Kctc.CaseTasks.CaseTaskId AND
         Kctc.FieldChanges.TableName = 'CaseTasks' AND 
         Kctc.FieldChanges.DateOfChange BETWEEN
             @checkmybefore  AND  @checkmyafter 
...

ONE MORE THING: Check the sequence of the WHERE xxx AND - use the MOST LIKELY candidate to isolate FIRST in the sequence, whichever condition that is, so it can get out faster. If that is the RecordId, then find, if the TableName is better, use that first instead. If one column also has an index already consider that one all else being equal.

回复收藏 0 原文

决绝 2024-11-15 04:23:28

这不会是一个“好的”解决方案，但它可能比现在发生的情况更好：

SELECT Kctc.CaseTasks.CaseTaskId
  ...blah blah blah
  FROM Kctc.CaseTasks
  ... some joins here
  LEFT JOIN (
     SELECT RecordID
     FROM Kctc.FieldChanges
     WHERE Kctc.FieldChanges.TableName = 'CaseTasks'
     AND Kctc.FieldChanges.DateOfChange BETWEEN
                         ISNULL(@ChangedAfter, '2000/01/01') AND
                         ISNULL(@ChangedBefore, '2050/01/01')
     GROUP BY RecordID
  ) AS MatchingChanges ON Kctc.CaseTasks.RecordId = MatchingChanges.RecordId
  WHERE  
  ... some normal where clauses
     AND (MatchingChanges.RecordID Is Not Null OR ((@ChangedAfter IS NULL AND @ChangedBefore IS NULL))

具体取决于查询计划是什么 - 如果它重复执行子查询，这个公式可能会帮助。

This is not going to be a "good" solution, but it might be better than what's happening right now:

SELECT Kctc.CaseTasks.CaseTaskId
  ...blah blah blah
  FROM Kctc.CaseTasks
  ... some joins here
  LEFT JOIN (
     SELECT RecordID
     FROM Kctc.FieldChanges
     WHERE Kctc.FieldChanges.TableName = 'CaseTasks'
     AND Kctc.FieldChanges.DateOfChange BETWEEN
                         ISNULL(@ChangedAfter, '2000/01/01') AND
                         ISNULL(@ChangedBefore, '2050/01/01')
     GROUP BY RecordID
  ) AS MatchingChanges ON Kctc.CaseTasks.RecordId = MatchingChanges.RecordId
  WHERE  
  ... some normal where clauses
     AND (MatchingChanges.RecordID Is Not Null OR ((@ChangedAfter IS NULL AND @ChangedBefore IS NULL))

Depends on exactly what the query plan is - if it was performing the subquery repeatedly, this formulation might help.

回复收藏 0 原文