在子查询上使用 DISTINCT 来删除实体框架中的重复项

发布于 2024-11-08 02:35:47 字数 1230 浏览 0 评论 0原文

我对使用 Sql 2005 将 Distinct 与 Entity Framework 结合使用有疑问。在此示例中:

practitioners = from p in context.Practitioners
                join pn in context.ProviderNetworks on
                     p.ProviderId equals pn.ProviderId
                (notNetworkIds.Contains(pn.Network))
                select p;

practitioners = practitioners
                  .Distinct()
                  .OrderByDescending(p => p.UpdateDate); 

data = practitioners.Skip(PageSize * (pageOffset ?? 0)).Take(PageSize).ToList();

一切正常,但使用 distinct 效率非常低。较大的结果集会导致性能不可接受。 DISTINCT 简直要了我的命。之所以需要distinct,是因为可以查询多个网络,导致Providers记录重复。实际上,我需要询问数据库“即使提供商位于多个网络中,也只返回一次”。如果我可以将 DISTINCT 放在 ProviderNetworks 上,查询运行速度会快得多。

如何使 EF 仅将 DISTINCT 添加到子查询中,而不是添加到整个结果集中?

我不想要的结果简化的 sql 是:

select DISTINCT p.* from Providers 
inner join Networks pn on p.ProviderId = pn.ProviderId
where NetworkName in ('abc','def')

理想的 sql 是:

select p.* from Providers 
inner join (select DISTINCT ProviderId from Networks 
            where NetworkName in ('abc','def')) 
as pn on p.ProviderId = pn.ProviderId

谢谢 戴夫

I have question about use of Distinct with Entity Framework, using Sql 2005. In this example:

practitioners = from p in context.Practitioners
                join pn in context.ProviderNetworks on
                     p.ProviderId equals pn.ProviderId
                (notNetworkIds.Contains(pn.Network))
                select p;

practitioners = practitioners
                  .Distinct()
                  .OrderByDescending(p => p.UpdateDate); 

data = practitioners.Skip(PageSize * (pageOffset ?? 0)).Take(PageSize).ToList();

It all works fine, but the use of distinct is very inefficient. Larger result sets incur unacceptable performance. The DISTINCT is killing me. The distinct is only needed because multiple networks can be queried, causing Providers records to be duplicated. In effect I need to ask the DB "only return providers ONCE even if they're in multiple networks". If I could place the DISTINCT on the ProviderNetworks, the query runs much faster.

How can I cause EF to add the DISTINCT only the subquery, not to the entire resultset?

The resulting simplified sql I DON'T want is:

select DISTINCT p.* from Providers 
inner join Networks pn on p.ProviderId = pn.ProviderId
where NetworkName in ('abc','def')

IDEAL sql is:

select p.* from Providers 
inner join (select DISTINCT ProviderId from Networks 
            where NetworkName in ('abc','def')) 
as pn on p.ProviderId = pn.ProviderId

Thanks
Dave

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

夏末染殇 2024-11-15 02:35:47

我认为您不需要在这里使用 Distinct ,而是需要 Exists (或在 Linq 中称为 Any

试试这个:

    var q = (from p in context.Practitioners
            where context.ProviderNetworks.Any(pn => pn.ProviderId == p.ProviderId && notNetworkIds.Contains(pn.Network))
            orderby p.UpdateDate descending
            select p).Skip(PageSize * (pageOffset ?? 0)).Take(PageSize).ToList();

I dont think you need a Distinct here but a Exists (or Any as it is called in Linq)

Try this:

    var q = (from p in context.Practitioners
            where context.ProviderNetworks.Any(pn => pn.ProviderId == p.ProviderId && notNetworkIds.Contains(pn.Network))
            orderby p.UpdateDate descending
            select p).Skip(PageSize * (pageOffset ?? 0)).Take(PageSize).ToList();
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文