SQL Server 查询调优：为什么 CPU 时间高于运行时间？它们与集合操作相关吗？

发布于 2024-11-16 06:47:32 字数 3252 浏览 7 评论 0原文

我有两个查询来过滤一些用户 ID，具体取决于问题及其答案。

场景

查询A是（原始版本）：

SELECT userid
FROM mem..ProfileResult
WHERE ( ( QuestionID = 4
          AND QuestionLabelID = 0
          AND AnswerGroupID = 4
          AND ResultValue = 1
        )
        OR ( QuestionID = 14
             AND QuestionLabelID = 0
             AND AnswerGroupID = 19
             AND ResultValue = 3
           )
        OR ( QuestionID = 23
             AND QuestionLabelID = 0
             AND AnswerGroupID = 28
             AND ( ResultValue & 16384 > 0 )
           )
        OR ( QuestionID = 17
             AND QuestionLabelID = 0
             AND AnswerGroupID = 22
             AND ( ResultValue = 6
                   OR ResultValue = 19
                   OR ResultValue = 21
                 )
           )
        OR ( QuestionID = 50
             AND QuestionLabelID = 0
             AND AnswerGroupID = 51
             AND ( ResultValue = 10
                   OR ResultValue = 41
                 )
           )
      )
GROUP BY userid
HAVING COUNT(*) = 5

我使用'set stats time on'和'set statistic io on'来检查cpu时间和io性能。

结果是：

CPU time = 47206 ms,  elapsed time = 20655 ms.

我通过使用设置操作重写了查询A，让我将其命名为查询B：

SELECT userid
FROM ( SELECT userid
        FROM mem..ProfileResult
        WHERE QuestionID = 4
            AND QuestionLabelID = 0
            AND AnswerGroupID = 4
            AND ResultValue = 1
       INTERSECT
       SELECT userid
        FROM mem..ProfileResult
        WHERE QuestionID = 14
            AND QuestionLabelID = 0
            AND AnswerGroupID = 19
            AND ResultValue = 3
       INTERSECT
       SELECT userid
        FROM mem..ProfileResult
        WHERE QuestionID = 23
            AND QuestionLabelID = 0
            AND AnswerGroupID = 28
            AND ( ResultValue & 16384 > 0 )
       INTERSECT
       SELECT userid
        FROM mem..ProfileResult
        WHERE QuestionID = 17
            AND QuestionLabelID = 0
            AND AnswerGroupID = 22
            AND ( ResultValue = 6
                  OR ResultValue = 19
                  OR ResultValue = 21
                )
       INTERSECT
       SELECT userid
        FROM mem..ProfileResult
        WHERE QuestionID = 50
            AND QuestionLabelID = 0
            AND AnswerGroupID = 51
            AND ( ResultValue = 10
                  OR ResultValue = 41
                )
     ) vv;

CPU时间和运行时间是：

CPU time = 8480 ms,  elapsed time = 18509 ms

我的简单分析

从上面的结果中可以看到，查询A的CPU时间超过了运行时间的2倍time

我搜索这个案例，大多数人说CPU时间应该小于Elapsed时间，因为CPU时间是CPU运行这个任务的时间。经过的时间包括 I/O 时间和其他类型的时间成本。但一种特殊情况是服务器具有多个核心 CPU 时。不过，我刚刚检查了开发数据库服务器，它有一个单核 CPU。

问题1

在单核CPU环境下，如何解释查询A中的CPU时间大于Elapsed时间？

问题2

使用集合运算后，性能真的提高了吗？

我有这个问题是因为查询 B 的逻辑读取是 280627，它高于查询 A 的 241885

Brad McGehee 在他的文章中说文章 '查询执行的逻辑读取越少，它的效率就越高，执行速度就越快，假设所有其他条件都相同。'

那么，它是否正确地说，即使查询 B 的逻辑读取量高于查询 A，但 CPU 时间明显少于查询 A，因此查询 B 应该具有更好的性能。

原文

I have two query to filter some userid depend on question and its answers.

Scenario

Query A is (the original version):

SELECT userid
FROM mem..ProfileResult
WHERE ( ( QuestionID = 4
          AND QuestionLabelID = 0
          AND AnswerGroupID = 4
          AND ResultValue = 1
        )
        OR ( QuestionID = 14
             AND QuestionLabelID = 0
             AND AnswerGroupID = 19
             AND ResultValue = 3
           )
        OR ( QuestionID = 23
             AND QuestionLabelID = 0
             AND AnswerGroupID = 28
             AND ( ResultValue & 16384 > 0 )
           )
        OR ( QuestionID = 17
             AND QuestionLabelID = 0
             AND AnswerGroupID = 22
             AND ( ResultValue = 6
                   OR ResultValue = 19
                   OR ResultValue = 21
                 )
           )
        OR ( QuestionID = 50
             AND QuestionLabelID = 0
             AND AnswerGroupID = 51
             AND ( ResultValue = 10
                   OR ResultValue = 41
                 )
           )
      )
GROUP BY userid
HAVING COUNT(*) = 5

I use 'set statistics time on' and 'set statistic io on' to check the cpu time and io performance.

the result is:

CPU time = 47206 ms,  elapsed time = 20655 ms.

I rewrote Query A via using Set Operation, let me name it Query B:

SELECT userid
FROM ( SELECT userid
        FROM mem..ProfileResult
        WHERE QuestionID = 4
            AND QuestionLabelID = 0
            AND AnswerGroupID = 4
            AND ResultValue = 1
       INTERSECT
       SELECT userid
        FROM mem..ProfileResult
        WHERE QuestionID = 14
            AND QuestionLabelID = 0
            AND AnswerGroupID = 19
            AND ResultValue = 3
       INTERSECT
       SELECT userid
        FROM mem..ProfileResult
        WHERE QuestionID = 23
            AND QuestionLabelID = 0
            AND AnswerGroupID = 28
            AND ( ResultValue & 16384 > 0 )
       INTERSECT
       SELECT userid
        FROM mem..ProfileResult
        WHERE QuestionID = 17
            AND QuestionLabelID = 0
            AND AnswerGroupID = 22
            AND ( ResultValue = 6
                  OR ResultValue = 19
                  OR ResultValue = 21
                )
       INTERSECT
       SELECT userid
        FROM mem..ProfileResult
        WHERE QuestionID = 50
            AND QuestionLabelID = 0
            AND AnswerGroupID = 51
            AND ( ResultValue = 10
                  OR ResultValue = 41
                )
     ) vv;

the CPU Time and Elapsed Time is:

CPU time = 8480 ms,  elapsed time = 18509 ms

My Simple Analysis

As you can see from up result, Query A have CPU Time more than 2 times of Elapsed time

I search for this case, mostly people say CPU time should less than Elapsed time, because CPU time is how long the CPU running this task. And the Elapsed time include I/O time and other sort of time cost. But one special case is when the Server has multiple Core CPU. However, I just checked the development db server and it has one single core CPU.

Question 1

How to explain that CPU time more than Elapsed time in Query A in a single core CPU environment?

Question 2

After, using set operation, Is the performance really improved?

I have this question because logical reads of Query B is 280627 which is higher than Query A's 241885

Brad McGehee said in his article that 'The fewer the logical reads performed by a query, the more efficient it is, and the faster it will perform, assuming all other things are held equal.'

Than, does it correctly say that even Query B have higher logical reads than Query A, but CPU time is significantly less than Query A, Query B should have a better performance.

分享到QQ

分享到微博