SQL query costing: is aggregating a view faster?

Posted 2024-09-09 19:37:47

I have a table, Sheet1$, that contains 616 records. I have another table, Rates$, that contains 47,880 records. Rates$ holds a response rate for a given record in the sheet for each of the 90 days following its mailing date. Across the 90 days of a record's rates, the total response is ALWAYS 1 (100%).

Example:

Sheet1$: Record 1, 1000 QTY, 5% Response, Mail 1/1/2009

Rates$: Record 1, Day 1, 2% Response
        Record 1, Day 2, 3% Response
        Record 1, Day 90, 1% Response
        Record N, Day N, N Response

Given that, I've written a view that right-joins the sheet to the rates to expand the data, so I can do some math to get a return per day for any given record.

SELECT s.[Mail Date] + r.Day AS Mail_Date,
       s.Quantity * s.[Expected Response Rate] * r.Response AS Pieces,
       s.[Bounce Back Card], s.Customer, s.[Point of Entry]
  FROM Sheet1$ AS s
 RIGHT OUTER JOIN Rates$ AS r
            ON s.[Appeal Code] = r.Appeal
 -- These filters on s reject the NULL-extended rows, so the join behaves like an INNER JOIN.
 WHERE s.[Mail Date] IS NOT NULL
   AND s.Quantity <> 0
   AND s.[Expected Response Rate] <> 0
   AND s.Quantity IS NOT NULL
   AND s.[Expected Response Rate] IS NOT NULL;

I saved this as a view called Test_Results. Running it in SQL Server Management Studio returns 211,140 records. Elapsed time was 4.121 seconds and the Est. Subtree Cost was 0.751.

Now I run a query against this view to aggregate a piece count on each day.

SELECT   Mail_Date, SUM(Pieces) AS Piececount
FROM     Test_Results
GROUP BY Mail_Date

That returns 773 rows and it only took 0.452 seconds to execute, with an Est. Subtree Cost of 1.458!

My question is: with a higher estimate, how did this execute SO much faster than the original view itself?! I assume part of it might be that the first query is returning all those rows to Management Studio. If that is the case, how would I go about viewing the true cost of this query without having to account for returning the results?

Comments (3)

倾`听者〃 2024-09-16 19:37:48

SELECT * FROM view1 will have a plan

SELECT * FROM view2 (where view2 is based on view1) will have its own complete plan

The optimizer is smart enough to make the plan for view2 combine/collapse the operations into the most efficient operation it can find. It only has to honor the semantics of view1's definition; it is not required to use the plan for SELECT * FROM view1 and then apply another plan for view2 on top. In general this will be a completely different plan, and it will do whatever it can to get the most efficient results.

Typically, it's going to push the aggregation down to improve the selectivity, and reduce the data requirements, and that's going to speed up the operation.
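
To make the collapsing concrete, here is a rough hand-written sketch of the single query the optimizer can effectively work with once Test_Results is expanded inline into the aggregate query from the question. It is an illustration only, not output from SQL Server, and the plan actually chosen may look different. It assumes the view definition exactly as posted in the question.

```sql
-- Hand-expanded equivalent of:
--   SELECT Mail_Date, SUM(Pieces) AS Piececount FROM Test_Results GROUP BY Mail_Date
-- The WHERE filters on s already discard NULL-extended rows,
-- so the RIGHT OUTER JOIN can be treated as a plain inner join.
-- The redundant IS NOT NULL checks on Quantity and the rate are dropped (<> 0 covers them).
SELECT   s.[Mail Date] + r.Day AS Mail_Date,
         SUM(s.Quantity * s.[Expected Response Rate] * r.Response) AS Piececount
FROM     Sheet1$ AS s
JOIN     Rates$  AS r
         ON s.[Appeal Code] = r.Appeal
WHERE    s.[Mail Date] IS NOT NULL
  AND    s.Quantity <> 0
  AND    s.[Expected Response Rate] <> 0
GROUP BY s.[Mail Date] + r.Day;
```

Note that [Bounce Back Card], Customer and [Point of Entry] never appear here: the outer query does not reference them, so they do not need to be carried through the plan, and only 773 aggregated rows go back to the client instead of 211,140.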

寒尘 2024-09-16 19:37:48

I think that Cade has covered the most important part - selecting from a view doesn't necessarily entail returning all of the view rows and then selecting against that. SQL Server will optimize the overall query.

To answer your question though, if you want to avoid the network and display costs then you can simply select each query result into a table. Just add "INTO Some_Table" after the column list in the SELECT clause.

You should also be able to separate things out by showing client statistics or by using Profiler, but the SELECT...INTO method is quick and easy.
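
A minimal sketch of the SELECT...INTO approach, using the view and query from the question. The temp table names (#Timing_Detail, #Timing_Agg) are made up for illustration. SET STATISTICS TIME is not mentioned above; it is added here as one way to see server-side CPU and elapsed time in the Messages tab.

```sql
SET STATISTICS TIME ON;

-- The 211,140 detail rows go to a temp table instead of the results grid.
SELECT Mail_Date, Pieces, [Bounce Back Card], Customer, [Point of Entry]
INTO   #Timing_Detail            -- hypothetical scratch table
FROM   Test_Results;

-- Same treatment for the aggregate query, so both are measured alike.
SELECT   Mail_Date, SUM(Pieces) AS Piececount
INTO     #Timing_Agg             -- hypothetical scratch table
FROM     Test_Results
GROUP BY Mail_Date;

SET STATISTICS TIME OFF;

-- Clean up the scratch tables.
DROP TABLE #Timing_Detail;
DROP TABLE #Timing_Agg;
```

Keep in mind that writing into a temp table has a cost of its own, so the timings are better for comparing the two queries against each other than as absolute numbers.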

你与清晨阳光 2024-09-16 19:37:48

Query costs are unitless, and are just used by the optimizer to choose what it thinks the most efficient execution path for a particular query is. They can't really be compared between queries. This, although old, is a good quick read. Then you'll probably want to look around for some books or articles on the MSSQL optimizer and about reading query plans if you're really interested.

(Also, make sure you're viewing the actual execution plan, and not the estimated "explain" plan ... they can be different)
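
For reference, a short sketch of how the two kinds of plan can be requested from a query window in Management Studio, using the aggregate query from the question. GO is the batch separator used by SSMS and sqlcmd; the SET SHOWPLAN_XML statement has to be alone in its batch.

```sql
-- Estimated plan: the query is compiled but NOT executed.
SET SHOWPLAN_XML ON;
GO
SELECT   Mail_Date, SUM(Pieces) AS Piececount
FROM     Test_Results
GROUP BY Mail_Date;
GO
SET SHOWPLAN_XML OFF;
GO

-- Actual plan: the query IS executed, and the plan comes back
-- alongside the results with runtime row counts filled in.
SET STATISTICS XML ON;
GO
SELECT   Mail_Date, SUM(Pieces) AS Piececount
FROM     Test_Results
GROUP BY Mail_Date;
GO
SET STATISTICS XML OFF;
GO
```

Comparing estimated against actual row counts in the second plan is usually a quicker way to spot a bad estimate than comparing subtree costs across queries.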
