测量查询性能:“执行计划查询成本” 与“所花费的时间”相对

发布于 2024-07-14 09:12:22 字数 1276 浏览 6 评论 0原文

我正在尝试确定两个不同查询的相对性能,并且有两种可用的方法来衡量它:
1. 运行两个查询并计时每个查询
2.运行两者并从实际执行计划中获取“查询成本”

这是我运行的用于对查询进行计时的代码...

DBCC FREEPROCCACHE
GO
DBCC DROPCLEANBUFFERS
GO
DECLARE @start DATETIME SET @start = getDate()
EXEC test_1a
SELECT getDate() - @start AS Execution_Time
GO

DBCC FREEPROCCACHE
GO
DBCC DROPCLEANBUFFERS
GO
DECLARE @start DATETIME SET @start = getDate()
EXEC test_1b
SELECT getDate() - @start AS Execution_Time
GO

我得到的结果如下:

Stored_Proc     Execution_Time     Query Cost (Relative To Batch)

test_1a         1.673 seconds      17%
test_1b         1.033 seconds      83%

执行时间的结果与查询成本的结果直接矛盾,但我很难确定“查询成本”的实际含义。 我最好的猜测是它是读取/写入/CPU_Time/等的聚合,所以我想我有几个问题:

  1. 是否有明确的来源来解释该度量的含义?

  2. 人们还使用哪些其他“查询性能”指标?它们的相对优点是什么?


值得注意的是,这是一个中型 SQL Server,在 MS Server 2003 企业版上运行 MS SQL Server 2005,具有多个处理器和 100 多个并发用户。

编辑:

经过一番麻烦之后,我设法获得了 SQL Server 上的 Profiler 访问权限,并且可以提供额外的信息(这支持与系统资源相关的查询成本,而不是执行时间本身......)

Stored_Proc    CPU      Reads    Writes   Duration   

test_1a        1313     3975     93       1386
test_1b        2297     49839    93       1207

令人印象深刻的是,需要更多的 CPU 来进行更多的读取更短的时间 :)

I'm trying to determine the relative performance of two different queries and have two ways of measuring this available to me:
1. Run both and time each query
2. Run both and get "Query Cost" from the actual execution plan

Here is the code I run to time the queries...

DBCC FREEPROCCACHE
GO
DBCC DROPCLEANBUFFERS
GO
DECLARE @start DATETIME SET @start = getDate()
EXEC test_1a
SELECT getDate() - @start AS Execution_Time
GO

DBCC FREEPROCCACHE
GO
DBCC DROPCLEANBUFFERS
GO
DECLARE @start DATETIME SET @start = getDate()
EXEC test_1b
SELECT getDate() - @start AS Execution_Time
GO

What I get is the following:

Stored_Proc     Execution_Time     Query Cost (Relative To Batch)

test_1a         1.673 seconds      17%
test_1b         1.033 seconds      83%

The results of the execution time directly contradict the results of the Query Cost, but I'm having difficulty determining what "Query Cost" actually means. My best guess is that it is an aggregate of Reads/Writes/CPU_Time/etc, so I guess I have a couple of questions:

  1. Is there a definative source to explain what this measure means?

  2. What other "Query Performance" metrics do people use, and what are their relative merits?

It may be important to note that this is a medium sized SQL Server, running MS SQL Server 2005 on MS Server 2003 Enterprise Edition with multiple processors and 100+ concurrent users.

EDIT:

After some bother I managed to get Profiler access on that SQL Server, and can give extra info (Which supports Query Cost being related to system resources, not Execution Time itself...)

Stored_Proc    CPU      Reads    Writes   Duration   

test_1a        1313     3975     93       1386
test_1b        2297     49839    93       1207

Impressive that taking more CPU with MANY more Reads takes less time :)

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(6

-小熊_ 2024-07-21 09:12:22

分析器跟踪将其置于正确的位置。

  • 查询 A:1.3 秒 CPU,1.4 秒持续时间
  • 查询 B:2.3 秒 CPU,1.2 秒持续时间

查询 B 使用并行性:CPU > 1.4 秒 期间
例如,查询使用 2 个 CPU,每个

查询 A 平均需要 1.15 秒,但可能不是:CPU < 持续时间

这解释了相对于批处理的成本:对于更简单的非并行查询计划,成本为 17%。

优化器发现查询 B 的成本更高,并且会从并行性中受益,尽管这样做需要额外的努力。

但请记住,查询 B 使用 2 个 CPU 的 100%(因此 4 个 CPU 为 50%)持续一秒左右。 查询 A 使用 100% 的单个 CPU 时间为 1.5 秒。

查询 A 的峰值较低,但代价是持续时间增加。
对于一个用户,谁在乎呢? 有了100个,也许就会有所不同......

The profiler trace puts it into perspective.

  • Query A: 1.3 secs CPU, 1.4 secs duration
  • Query B: 2.3 secs CPU, 1.2 secs duration

Query B is using parallelism: CPU > duration
eg the query uses 2 CPUs, average 1.15 secs each

Query A is probably not: CPU < duration

This explains cost relative to batch: 17% of the for the simpler, non-parallel query plan.

The optimiser works out that query B is more expensive and will benefit from parallelism, even though it takes extra effort to do so.

Remember though, that query B uses 100% of 2 CPUS (so 50% for 4 CPUs) for one second or so. Query A uses 100% of a single CPU for 1.5 seconds.

The peak for query A is lower, at the expense of increased duration.
With one user, who cares? With 100, perhaps it makes a difference...

笔芯 2024-07-21 09:12:22
SET STATISTICS TIME ON

SELECT * 

FROM Production.ProductCostHistory
WHERE StandardCost < 500.00;

SET STATISTICS TIME OFF;

并查看消息选项卡,它看起来像这样:

SQL Server Execution Times:

   CPU time = 0 ms,  elapsed time = 10 ms.

(778 row(s) affected)

SQL Server parse and compile time: 

   CPU time = 0 ms, elapsed time = 0 ms.
SET STATISTICS TIME ON

SELECT * 

FROM Production.ProductCostHistory
WHERE StandardCost < 500.00;

SET STATISTICS TIME OFF;

And see the message tab it will look like this:

SQL Server Execution Times:

   CPU time = 0 ms,  elapsed time = 10 ms.

(778 row(s) affected)

SQL Server parse and compile time: 

   CPU time = 0 ms, elapsed time = 0 ms.
何以心动 2024-07-21 09:12:22

执行时间的结果与查询成本的结果直接矛盾,但我很难确定“查询成本”的实际含义。

查询成本是优化器认为查询将花费多长时间(相对于总批处理时间)。

优化器尝试通过查看查询和数据统计信息、尝试多个执行计划并选择其中成本最低的执行计划来选择最佳查询计划。

在这里您可以详细了解它如何尝试做到这一点。

正如您所看到的,这可能与您实际得到的有很大不同。

当然,唯一真正的查询性能指标是查询实际花费的时间。

The results of the execution time directly contradict the results of the Query Cost, but I'm having difficulty determining what "Query Cost" actually means.

Query cost is what optimizer thinks of how long your query will take (relative to total batch time).

The optimizer tries to choose the optimal query plan by looking at your query and statistics of your data, trying several execution plans and selecting the least costly of them.

Here you may read in more detail about how does it try to do this.

As you can see, this may differ significantly of what you actually get.

The only real query perfomance metric is, of course, how long does the query actually take.

初心 2024-07-21 09:12:22

使用SET STATISTICS TIME ON

在查询上方

。 在结果选项卡附近,您可以看到一个消息选项卡。 在那里你可以看到时间。

Use SET STATISTICS TIME ON

above your query.

Below near result tab you can see a message tab. There you can see the time.

毁梦 2024-07-21 09:12:22

查询执行时间:

DECLARE @EndTime datetime
DECLARE @StartTime datetime 
SELECT @StartTime=GETDATE() 


` -- Write Your Query`

SELECT @EndTime=GETDATE()
--This will return execution time of your query
SELECT DATEDIFF(MILLISECOND,@StartTime,@EndTime) AS [Duration in millisecs] 

查询输出将类似于:

在此处输入图像描述

至优化查询成本:

单击 SQL Management Studio

在此处输入图像描述

运行查询并单击执行计划在查询结果的“消息”选项卡旁边。 你会看到类似

在此处输入图像描述

Query Execution Time:

DECLARE @EndTime datetime
DECLARE @StartTime datetime 
SELECT @StartTime=GETDATE() 


` -- Write Your Query`

SELECT @EndTime=GETDATE()
--This will return execution time of your query
SELECT DATEDIFF(MILLISECOND,@StartTime,@EndTime) AS [Duration in millisecs] 

Query Out Put Will be Like:

enter image description here

To Optimize Query Cost :

Click on your SQL Management Studio

enter image description here

Run your query and click on Execution plan beside the Messages tab of your query result. you will see like

enter image description here

白云不回头 2024-07-21 09:12:22

我知道这是一个老问题 - 但是我想添加一个示例,其中成本相同但一个查询比另一个查询更好。

正如您在问题中观察到的那样,执行计划中显示的%并不是确定最佳查询的唯一标准。 在下面的示例中,我有两个查询执行相同的任务。 执行计划显示两者同样好(各 50%)。 现在,我使用 SET STATISTICS IO ON 执行查询,这显示出明显的差异。

在以下示例中,查询 1 对表 LWManifestOrderLineItems 使用 seek,而查询 2 使用 scan。 然而,当我们实际检查执行时间时,发现查询 2 效果更好。

另请阅读 什么时候“Seek”不是“Seek”? 作者:Paul White

QUERY

---Preparation---------------
-----------------------------
DBCC FREEPROCCACHE
GO
DBCC DROPCLEANBUFFERS
GO

SET STATISTICS IO ON  --IO
SET STATISTICS TIME ON

--------Queries---------------
------------------------------

SELECT LW.Manifest,LW.OrderID,COUNT(DISTINCT LineItemID)
FROM LWManifestOrderLineItems LW
INNER JOIN ManifestContainers MC
    ON MC.Manifest = LW.Manifest
GROUP BY LW.Manifest,LW.OrderID
ORDER BY COUNT(DISTINCT LineItemID) DESC  

SELECT LW.Manifest,LW.OrderID,COUNT( LineItemID) LineCount
FROM LWManifestOrderLineItems LW
WHERE LW.Manifest IN (SELECT Manifest FROM ManifestContainers)
GROUP BY LW.Manifest,LW.OrderID
ORDER BY COUNT( LineItemID) DESC  

Statistics IO

在此处输入图像描述

执行计划

< a href="https://i.sstatic.net/cesIO.jpg" rel="nofollow noreferrer">输入图像描述这里

I understand it’s an old question – however I would like to add an example where cost is same but one query is better than the other.

As you observed in the question, % shown in execution plan is not the only yardstick for determining best query. In the following example, I have two queries doing the same task. Execution Plan shows both are equally good (50% each). Now I executed the queries with SET STATISTICS IO ON which shows clear differences.

In the following example, the query 1 uses seek whereas Query 2 uses scan on the table LWManifestOrderLineItems. When we actually checks the execution time however it is find that Query 2 works better.

Also read When is a Seek not a Seek? by Paul White

QUERY

---Preparation---------------
-----------------------------
DBCC FREEPROCCACHE
GO
DBCC DROPCLEANBUFFERS
GO

SET STATISTICS IO ON  --IO
SET STATISTICS TIME ON

--------Queries---------------
------------------------------

SELECT LW.Manifest,LW.OrderID,COUNT(DISTINCT LineItemID)
FROM LWManifestOrderLineItems LW
INNER JOIN ManifestContainers MC
    ON MC.Manifest = LW.Manifest
GROUP BY LW.Manifest,LW.OrderID
ORDER BY COUNT(DISTINCT LineItemID) DESC  

SELECT LW.Manifest,LW.OrderID,COUNT( LineItemID) LineCount
FROM LWManifestOrderLineItems LW
WHERE LW.Manifest IN (SELECT Manifest FROM ManifestContainers)
GROUP BY LW.Manifest,LW.OrderID
ORDER BY COUNT( LineItemID) DESC  

Statistics IO

enter image description here

Execution Plan

enter image description here

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文