SQL Server query tuning: why is CPU time higher than elapsed time? Is this related to set operations?

Posted 2024-11-16 06:47:32

I have two queries that filter user IDs based on questions and their answers.

Scenario

Query A is (the original version):

SELECT userid
FROM mem..ProfileResult
WHERE ( ( QuestionID = 4
          AND QuestionLabelID = 0
          AND AnswerGroupID = 4
          AND ResultValue = 1
        )
        OR ( QuestionID = 14
             AND QuestionLabelID = 0
             AND AnswerGroupID = 19
             AND ResultValue = 3
           )
        OR ( QuestionID = 23
             AND QuestionLabelID = 0
             AND AnswerGroupID = 28
             AND ( ResultValue & 16384 > 0 )
           )
        OR ( QuestionID = 17
             AND QuestionLabelID = 0
             AND AnswerGroupID = 22
             AND ( ResultValue = 6
                   OR ResultValue = 19
                   OR ResultValue = 21
                 )
           )
        OR ( QuestionID = 50
             AND QuestionLabelID = 0
             AND AnswerGroupID = 51
             AND ( ResultValue = 10
                   OR ResultValue = 41
                 )
           )
      )
GROUP BY userid
HAVING COUNT(*) = 5

I use 'SET STATISTICS TIME ON' and 'SET STATISTICS IO ON' to check the CPU time and I/O performance.

The result is:

CPU time = 47206 ms,  elapsed time = 20655 ms.
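
For reference, a measurement like the one above can be reproduced with a session setup along these lines. This is only a sketch: the single-predicate query is a stand-in for Query A, and the timings and read counts appear on the Messages tab in Management Studio.

```sql
-- Enable per-statement timing and I/O statistics for the current session.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

-- Placeholder query; substitute Query A or Query B here.
SELECT userid
FROM mem..ProfileResult
WHERE QuestionID = 4
  AND QuestionLabelID = 0
  AND AnswerGroupID = 4
  AND ResultValue = 1;

-- Switch the statistics off again when done.
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```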

I rewrote Query A using set operations; let me call it Query B:

SELECT userid
FROM ( SELECT userid
        FROM mem..ProfileResult
        WHERE QuestionID = 4
            AND QuestionLabelID = 0
            AND AnswerGroupID = 4
            AND ResultValue = 1
       INTERSECT
       SELECT userid
        FROM mem..ProfileResult
        WHERE QuestionID = 14
            AND QuestionLabelID = 0
            AND AnswerGroupID = 19
            AND ResultValue = 3
       INTERSECT
       SELECT userid
        FROM mem..ProfileResult
        WHERE QuestionID = 23
            AND QuestionLabelID = 0
            AND AnswerGroupID = 28
            AND ( ResultValue & 16384 > 0 )
       INTERSECT
       SELECT userid
        FROM mem..ProfileResult
        WHERE QuestionID = 17
            AND QuestionLabelID = 0
            AND AnswerGroupID = 22
            AND ( ResultValue = 6
                  OR ResultValue = 19
                  OR ResultValue = 21
                )
       INTERSECT
       SELECT userid
        FROM mem..ProfileResult
        WHERE QuestionID = 50
            AND QuestionLabelID = 0
            AND AnswerGroupID = 51
            AND ( ResultValue = 10
                  OR ResultValue = 41
                )
     ) vv;

The CPU time and elapsed time are:

CPU time = 8480 ms,  elapsed time = 18509 ms

My Simple Analysis

As you can see from the results above, Query A's CPU time is more than twice its elapsed time.

I searched for this case, and most people say CPU time should be less than elapsed time, because CPU time is how long the CPU spends running the task, while elapsed time also includes I/O time and other kinds of waiting. One special case is when the server has a multi-core CPU. However, I just checked the development database server, and it has a single-core CPU.

Question 1

How can CPU time exceed elapsed time for Query A in a single-core CPU environment?

Question 2

After switching to set operations, has the performance really improved?

I ask because Query B performs 280627 logical reads, which is more than Query A's 241885.

Brad McGehee said in his article that 'The fewer the logical reads performed by a query, the more efficient it is, and the faster it will perform, assuming all other things are held equal.'

So is it correct to say that, even though Query B has more logical reads than Query A, its CPU time is significantly lower and it should therefore perform better?

Comments (2)

泡沫很甜 2024-11-23 06:47:32

If CPU time is greater than elapsed time, you do have a multi-core or hyper-threaded CPU.

The CPU time refers to the machine where the SQL Server engine is installed, not to a local Management Studio install.

As for logical I/O versus CPU, I'd go with lower CPU. If this query runs often and executions overlap, you'll run out of CPU resources first. I'd try a WHERE EXISTS (UNION ALL) construct and make sure I have good indexes.
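
As a rough illustration of the "good indexes" point, a covering index along these lines should let every branch of the filter be answered from the index alone, since the queries touch only these five columns. The index name is hypothetical, and the key order is an assumption rather than something taken from the actual schema. The bitwise test on ResultValue cannot seek on that column, but it can still be evaluated from the index.

```sql
USE mem;

-- Hypothetical covering index: the four filter columns form the key,
-- and userid is carried along so no lookup into the base table is needed.
CREATE NONCLUSTERED INDEX IX_ProfileResult_Question   -- hypothetical name
ON ProfileResult (QuestionID, QuestionLabelID, AnswerGroupID, ResultValue)
INCLUDE (userid);
```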

Edit, after comments:

  • There are parallelism operators in the plan, which means more than one logical processor is visible to the OS and SQL Server. So the CPU is either multi-core or hyper-threaded.

Try EXEC xp_msver to check.
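
A quick way to check what SQL Server itself sees, sketched here with the xp_msver call suggested above plus the sys.dm_os_sys_info view (available from SQL Server 2005 onward):

```sql
-- Number of processors reported to SQL Server.
EXEC master..xp_msver 'ProcessorCount';

-- cpu_count: logical CPUs on the system.
-- hyperthread_ratio: logical or physical cores exposed per physical processor package.
SELECT cpu_count, hyperthread_ratio
FROM sys.dm_os_sys_info;
```

A cpu_count greater than 1 would be consistent with the parallel plan and with CPU time exceeding elapsed time.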

冬天旳寂寞 2024-11-23 06:47:32

In my case, the SQL Server execution times were:
CPU time = 671 ms, elapsed time = 255 ms.

CPU time was nearly three times the elapsed time for the query. Because the query
was processed in parallel, the CPU burden was very high, and the CPU could become a bottleneck
in this scenario.
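
One way to see how much of the CPU time is due to parallelism is to rerun the statement with the MAXDOP query hint, which restricts it to a serial plan, and compare the timings. A minimal sketch, using one branch of the question's filter as a stand-in query:

```sql
SET STATISTICS TIME ON;

-- OPTION (MAXDOP 1) forces a serial plan for this statement only.
SELECT userid
FROM mem..ProfileResult
WHERE QuestionID = 4
  AND QuestionLabelID = 0
  AND AnswerGroupID = 4
  AND ResultValue = 1
GROUP BY userid
OPTION (MAXDOP 1);

SET STATISTICS TIME OFF;
```

With a serial plan, the reported CPU time should normally no longer exceed the elapsed time.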

SQL Server 2012 brings a solution to the CPU burden problem. It introduces iterators
that process batches of rows at a time, not just row by row.

For query optimization, you can create a columnstore index on your table:

CREATE COLUMNSTORE INDEX idx_cs_colname
ON dbo.Tablename(feild1,feild2);
