读取表变量的查询可以在 SQL Server 2008 中生成并行执行计划吗?

发布于 2024-08-09 13:25:34 字数 805 浏览 11 评论 0原文

首先,来自 BOL

修改变量的查询不会生成并行查询执行计划。修改非常大的变量或复杂查询中的表变量时,性能可能会受到影响。在这些情况下,请考虑使用临时表。有关详细信息,请参阅创建表 (Transact-SQL)。读取变量而不修改它们的查询仍然可以并行化。

这似乎很清楚了。 读取表变量而不修改它们的查询仍然可以并行化。

但随后在 SQL Server 存储引擎,一个其他信誉良好的来源,Sunil Agarwal 在 2008 年 3 月 30 日关于 tempdb 的一篇文章中这样说道:

涉及表变量的查询不会生成并行计划。

Sunil 是否解释了 BOL re: INSERT,或者 FROM 子句中表变量的存在是否妨碍了并行性?如果是这样,为什么?

我特别考虑控制表用例,其中您有一个小的控制表连接到一个更大的表,以映射值,充当过滤器,或两者兼而有之。

谢谢!

First, from BOL:

Queries that modify table variables do not generate parallel query execution plans. Performance can be affected when very large table variables, or table variables in complex queries, are modified. In these situations, consider using temporary tables instead. For more information, see CREATE TABLE (Transact-SQL). Queries that read table variables without modifying them can still be parallelized.

That seems clear enough. Queries that read table variables, without modifying them, can still be parallelized.

But then over at SQL Server Storage Engine, an otherwise reputable source, Sunil Agarwal said this in an article on tempdb from March 30, 2008:

Queries involving table variables don't generate parallel plans.

Was Sunil paraphrasing BOL re: INSERT, or does the presence of table variables in the FROM clause prevent parallelism? If so, why?

I am thinking specifically of the control table use case, where you have a small control table being joined to a larger table, to map values, act as a filter, or both.

Thanks!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

删除→记忆 2024-08-16 13:25:34

好的,我有一个并行选择,但不是在表变量上,

我已将其匿名化并且:

  • BigParallelTable 为 900k 行且宽
  • 由于遗留原因,BigParallelTable 部分非规范化(稍后我会修复它,承诺)
  • BigParallelTable 经常生成并行计划,因为它不理想并且“昂贵”
  • SQL Server 2005 x64,SP3,构建 4035,16 核

查询 + 计划:

DECLARE @FilterList TABLE (bar varchar(100) NOT NULL)

INSERT @FilterList (bar)
SELECT 'val1' UNION ALL 'val2' UNION ALL 'val3'

--snipped

SELECT
     *
FROM
    dbo.BigParallelTable BPT
    JOIN
    @FilterList FL ON BPT.Thing = FL.Bar

StmtText
  |--Parallelism(Gather Streams)
       |--Hash Match(Inner Join, HASH:([FL].[bar])=([BPT].[Thing]), RESIDUAL:(@FilterList.[bar] as [FL].[bar]=[MyDB].[dbo].[BigParallelTable].[Thing] as [BPT].[Thing]))
            |--Parallelism(Distribute Streams, Broadcast Partitioning)
            |    |--Table Scan(OBJECT:(@FilterList AS [FL]))
            |--Clustered Index Scan(OBJECT:([MyDB].[dbo].[BigParallelTable].[PK_BigParallelTable] AS [BPT]))

现在,考虑一下,表变量几乎总是表扫描,没有stats 并假设一行“估计行数 = 1”,“实际.. = 3”。

我们可以声明表变量不并行使用,但包含计划可以在其他地方使用并行性吗?所以 BOL 是正确的,而 SQL Storage 文章是错误的

OK, I have a parallel select but not on the table variable

I've anonymised it and:

  • BigParallelTable is 900k rows and wide
  • For legacy reasons, BigParallelTable is partially denormalised (I'll fix it, later, promise)
  • BigParallelTable often generates parallel plans because it's not ideal and is "expensive"
  • SQL Server 2005 x64, SP3, build 4035, 16 cores

Query + plan:

DECLARE @FilterList TABLE (bar varchar(100) NOT NULL)

INSERT @FilterList (bar)
SELECT 'val1' UNION ALL 'val2' UNION ALL 'val3'

--snipped

SELECT
     *
FROM
    dbo.BigParallelTable BPT
    JOIN
    @FilterList FL ON BPT.Thing = FL.Bar

StmtText
  |--Parallelism(Gather Streams)
       |--Hash Match(Inner Join, HASH:([FL].[bar])=([BPT].[Thing]), RESIDUAL:(@FilterList.[bar] as [FL].[bar]=[MyDB].[dbo].[BigParallelTable].[Thing] as [BPT].[Thing]))
            |--Parallelism(Distribute Streams, Broadcast Partitioning)
            |    |--Table Scan(OBJECT:(@FilterList AS [FL]))
            |--Clustered Index Scan(OBJECT:([MyDB].[dbo].[BigParallelTable].[PK_BigParallelTable] AS [BPT]))

Now, thinking about it, a table variable is almost always a table scan, has no stats and is assumed one row "Estimated number of rows = 1", "Actual.. = 3".

Can we declare that table variables are not used in parallel, but the containing plan can use parallelism elsewhere? So BOL is correct and the SQL Storage article is wrong

腻橙味 2024-08-16 13:25:34

显示表变量本身的并行运算符的简单示例。

DECLARE @T TABLE
(
X INT
)
INSERT INTO @T
SELECT TOP 10000 ROW_NUMBER() OVER (ORDER BY (SELECT 0))
FROM master..spt_values v1,master..spt_values v2;

WITH E8(N)
     AS (SELECT 1
         FROM   @T a,
                @T b),
     Nums(N)
     AS (SELECT TOP (1000000) ROW_NUMBER() OVER (ORDER BY (SELECT 0))
         FROM   E8)
SELECT COUNT(N)
FROM   Nums
OPTION (RECOMPILE)  

计划

Simple Example showing a parallel operator on a table variable itself.

DECLARE @T TABLE
(
X INT
)
INSERT INTO @T
SELECT TOP 10000 ROW_NUMBER() OVER (ORDER BY (SELECT 0))
FROM master..spt_values v1,master..spt_values v2;

WITH E8(N)
     AS (SELECT 1
         FROM   @T a,
                @T b),
     Nums(N)
     AS (SELECT TOP (1000000) ROW_NUMBER() OVER (ORDER BY (SELECT 0))
         FROM   E8)
SELECT COUNT(N)
FROM   Nums
OPTION (RECOMPILE)  

Plan

毁梦 2024-08-16 13:25:34

[在这里回答我自己的问题,以便我可以适当地提出相关引用......]

Boris B,来自 线程

使用表变量的只读查询仍然可以并行化。涉及已修改的表变量的查询连续运行。 我们将更正在线图书中的声明。(已添加)

并且:

请注意,有两种并行支持:

A.操作符可以/不可以在并行线程中

B.查询可以/不能并行运行,因为该运算符存在于树中。

B 是 A 的超集。

据我所知,表变量不是 B,可能是 A。

另一个相关引用,回复:非内联 T-SQL TVF :

如果 TVF 输入是运行时常量(例如变量和参数),则考虑并行性非内联 T-SQL TVF。如果输入是一列(来自交叉应用),则整个语句禁用并行性。

[Answering my own question here, so I can present the relevant quotes appropriately....]

Boris B, from an thread at MSDN SQL Server forums:

Read-only queries that use table variables can still be parallelized. Queries that involve table variables that are modified run serially. We will correct the statement in Books Online. (emp. added)

and:

Note that there are two flavors of parallelism support:

A. The operator can/can not be in a parallel thread

B. The query can/can not be run in parallel because this operator exists in the tree.

B is a superset of A.

As best I can tell, table variables are not B and may be A.

Another relevant quote, re: non-inlined T-SQL TVFs:

Non-inlined T-SQL TVFs...is considered for parallelism if the TVF inputs are run-time constants, e.g. variables and parameters. If the input is a column (from a cross apply) then parallelism is disabled for the whole statement.

谁与争疯 2024-08-16 13:25:34

我的理解是,对于 UPDATE/DELETE/INSERT 操作,并行性在表变量上被阻止,但对于 SELECT 则不会。当然,证明这一点比仅仅假设要困难得多。 :-)

My understanding is that parallelism is blocked on table variables for UPDATE/DELETE/INSERT operations, but not for SELECTs. Proving that would be a lot more difficult than just hypothesizing, of course. :-)

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文