并行性如何影响结果数量?

发布于 2024-07-18 13:08:43 字数 1486 浏览 7 评论 0原文

我有一个相当复杂的查询,如下所示:

create table Items(SomeOtherTableID int,SomeField int)
create table SomeOtherTable(Id int,GroupID int)
with cte1 as
(
    select 
        SomeOtherTableID,COUNT(*) SubItemCount 
    from 
        Items t 
    where 
        t.SomeField is not null 
    group by 
        SomeOtherTableID

),cte2 as
(
    select
        tc.SomeOtherTableID,ROW_NUMBER() over (partition by a.GroupID order by tc.SubItemCount desc) SubItemRank
    from 
        Items t
        inner join SomeOtherTable a on a.Id=t.SomeOtherTableID 
        inner join cte1 tc on tc.SomeOtherTableID=t.SomeOtherTableID
    where 
        t.SomeField is not null 

),cte3 as
(
    select 
        SomeOtherTableID 
    from 
        cte2 
    where 
        SubItemRank=1
)
select 
    * 
from 
    cte3 t1 
    inner join cte3 t2 on t1.SomeOtherTableID<t2.SomeOtherTableID 
option (maxdop 1)

该查询的 cte3 填充有 6222 不同 结果。 在最后的选择中,我对cte3与其自身执行交叉连接(这样我就可以将表中的每个值与表中的每个其他值进行比较)稍后一点)。 注意最后一行:

option (maxdop 1)

显然,这会关闭并行性。

因此,对于 cte3 中的 6222 结果行,我预计 (6222*6221)/2 或 19353531 结果会出现在后续的交叉连接选择中,并且随着最终的 ma​​xdop 线就位,情况确实如此。

但是,当我删除 ma​​xdop 行时,结果数会跳至 19380454。 我的开发盒上有 4 个核心。

搞什么? 谁能解释这是为什么吗? 我是否需要重新考虑以前以这种方式交叉连接的查询?

I have a fairly complex query that looks something like this:

create table Items(SomeOtherTableID int,SomeField int)
create table SomeOtherTable(Id int,GroupID int)
with cte1 as
(
    select 
        SomeOtherTableID,COUNT(*) SubItemCount 
    from 
        Items t 
    where 
        t.SomeField is not null 
    group by 
        SomeOtherTableID

),cte2 as
(
    select
        tc.SomeOtherTableID,ROW_NUMBER() over (partition by a.GroupID order by tc.SubItemCount desc) SubItemRank
    from 
        Items t
        inner join SomeOtherTable a on a.Id=t.SomeOtherTableID 
        inner join cte1 tc on tc.SomeOtherTableID=t.SomeOtherTableID
    where 
        t.SomeField is not null 

),cte3 as
(
    select 
        SomeOtherTableID 
    from 
        cte2 
    where 
        SubItemRank=1
)
select 
    * 
from 
    cte3 t1 
    inner join cte3 t2 on t1.SomeOtherTableID<t2.SomeOtherTableID 
option (maxdop 1)

The query is such that cte3 is filled with 6222 distinct results. In the final select, I am performing a cross join on cte3 with itself, (so that I can compare every value in the table with every other value in the table at a later point). Notice the final line :

option (maxdop 1)

Apparently, this switches off parallelism.

So, with 6222 results rows in cte3, I would expect (6222*6221)/2, or 19353531 results in the subsequent cross joining select, and with the final maxdop line in place, that is indeed the case.

However, when I remove the maxdop line, the number of results jumps to 19380454. I have 4 cores on my dev box.

WTF? Can anyone explain why this is? Do I need to reconsider previous queries that cross join in this way?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

秋意浓 2024-07-25 13:08:43

除了错误之外,并行性不应该影响结果。

Aside from a bug, parallelism should not affect the result.

凤舞天涯 2024-07-25 13:08:43

看起来 SCOPE_IDENTITY 也存在类似的并行错误

或者您是否使用快照隔离,这也是另一个错误? 还有一些博客演示了在某些情况下暂时关闭快照隔离的情况。

编辑:

回到快照隔离,这里的第 3 点: 您应该对并行性感到紧张的六个原因

Looks like there is a similar parallism bug with SCOPE_IDENTITY too

Or are you using snaphot isolation, another bug too? There are also some blogs that demonstrate snapshot isolation being turned off temporarily under certain conditions.

Edit:

Going back to Snapshot isolation, number 3 here: Six reasons you should be nervous about parallelism

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文