并行性如何影响结果数量？

发布于 2024-07-18 13:08:43 字数 1486 浏览 7 评论 0原文

我有一个相当复杂的查询，如下所示：

create table Items(SomeOtherTableID int,SomeField int)
create table SomeOtherTable(Id int,GroupID int)
with cte1 as
(
    select 
        SomeOtherTableID,COUNT(*) SubItemCount 
    from 
        Items t 
    where 
        t.SomeField is not null 
    group by 
        SomeOtherTableID

),cte2 as
(
    select
        tc.SomeOtherTableID,ROW_NUMBER() over (partition by a.GroupID order by tc.SubItemCount desc) SubItemRank
    from 
        Items t
        inner join SomeOtherTable a on a.Id=t.SomeOtherTableID 
        inner join cte1 tc on tc.SomeOtherTableID=t.SomeOtherTableID
    where 
        t.SomeField is not null 

),cte3 as
(
    select 
        SomeOtherTableID 
    from 
        cte2 
    where 
        SubItemRank=1
)
select 
    * 
from 
    cte3 t1 
    inner join cte3 t2 on t1.SomeOtherTableID<t2.SomeOtherTableID 
option (maxdop 1)

该查询的 cte3 填充有 6222 不同结果。在最后的选择中，我对cte3与其自身执行交叉连接（这样我就可以将表中的每个值与表中的每个其他值进行比较）稍后一点）。注意最后一行：

option (maxdop 1)

显然，这会关闭并行性。

因此，对于 cte3 中的 6222 结果行，我预计 (6222*6221)/2 或 19353531 结果会出现在后续的交叉连接选择中，并且随着最终的 maxdop 线就位，情况确实如此。

但是，当我删除 maxdop 行时，结果数会跳至 19380454。我的开发盒上有 4 个核心。

搞什么？谁能解释这是为什么吗？我是否需要重新考虑以前以这种方式交叉连接的查询？

原文

I have a fairly complex query that looks something like this:

create table Items(SomeOtherTableID int,SomeField int)
create table SomeOtherTable(Id int,GroupID int)
with cte1 as
(
    select 
        SomeOtherTableID,COUNT(*) SubItemCount 
    from 
        Items t 
    where 
        t.SomeField is not null 
    group by 
        SomeOtherTableID

),cte2 as
(
    select
        tc.SomeOtherTableID,ROW_NUMBER() over (partition by a.GroupID order by tc.SubItemCount desc) SubItemRank
    from 
        Items t
        inner join SomeOtherTable a on a.Id=t.SomeOtherTableID 
        inner join cte1 tc on tc.SomeOtherTableID=t.SomeOtherTableID
    where 
        t.SomeField is not null 

),cte3 as
(
    select 
        SomeOtherTableID 
    from 
        cte2 
    where 
        SubItemRank=1
)
select 
    * 
from 
    cte3 t1 
    inner join cte3 t2 on t1.SomeOtherTableID<t2.SomeOtherTableID 
option (maxdop 1)

The query is such that cte3 is filled with 6222 distinct results. In the final select, I am performing a cross join on cte3 with itself, (so that I can compare every value in the table with every other value in the table at a later point). Notice the final line :

option (maxdop 1)

Apparently, this switches off parallelism.

So, with 6222 results rows in cte3, I would expect (6222*6221)/2, or 19353531 results in the subsequent cross joining select, and with the final maxdop line in place, that is indeed the case.

However, when I remove the maxdop line, the number of results jumps to 19380454. I have 4 cores on my dev box.

WTF? Can anyone explain why this is? Do I need to reconsider previous queries that cross join in this way?

分享到QQ

分享到微博