使用较低级别组内顶级组的聚合结果

发布于 2024-11-07 04:39:27 字数 676 浏览 11 评论 0原文

我有 2 个表 A{int id,int grp},B{int help,int cat}。

表B包含表A的记录所属的类别列表,因此B.aid是引用A.id的外键。

A.id 是表 A 的唯一主键。B.cat

包含从 1 到 5 的类别编号,A.grp 包含从 1 到 1000 的编号。

表 A 有 300 万条记录,表 B - 大约 500 万条。

对于每个组 A.grp,我需要计算 A 中包含 B.cat 的记录占 A.grp 组内记录数的百分比。

所以如果 A:[{1,1},{2,1},{3,2}], B:[{1,3},{1,4},{2,3},{3,4} ] 那么查询结果应该是下面的3列表: R{int grp,int cat,double%}:[{1,3,100},{1,4,50},{2,4,100}]

如何在 Linq 中使用单个查询来完成此操作?

希望 A 在该查询中只出现一次,因为我希望能够用 A.Where(e=>some复杂表达式) 替换 A ,而无需在该单个查询中多次重复它。

表 A 和 B 使用外键导入到 Linq to Entities 中,以便可以引用 from a in A from b in aB select b.catfrom b in B select bAgrp代码>

I have 2 tables A{int id,int grp}, B{int aid,int cat}.

Table B contains list of categories that record of table A belongs to, so B.aid is Foreign Key that references A.id.

A.id is unique primary key of table A.

B.cat contains category number from 1 to 5, A.grp contains numbers from 1 to 1000.

Table A has 3 million of records, table B - about 5 million.

For each group A.grp I need to calculate % of records in A that contain B.cat out of number of records within group A.grp.

So if A:[{1,1},{2,1},{3,2}], B:[{1,3},{1,4},{2,3},{3,4}] then result of the query should be the following 3 column table:
R{int grp,int cat,double percent}:[{1,3,100},{1,4,50},{2,4,100}]

How can I do it with one single query in Linq ?

It is desired that A to appear only once in that query because I want to be able to replace A with A.Where(e=>some complicated expression) without duplicating it many times in that single query.

Tables A and B are imported into Linq to Entities with foreign keys so that it's possible to reference from a in A from b in a.B select b.cat or from b in B select b.A.grp

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

太阳哥哥 2024-11-14 04:39:27

您可以像这样组合您的查询

var query = from g in 
              (from a in db.A
               group a by new
               {
                 grp = a.grp
               }
              )
            join c in  
              (from a in db.A
               from b in a.B
               group b by new
               {
                 a.grp,
                 b.cat
               }
              )            
            on g.Key.grp equals c.Key.grp
            select new
            {
              g.Key.grp,
              c.Key.cat,
              percent = c.Count() * 100 / g.Count()
            };

You can combine your queries like this

var query = from g in 
              (from a in db.A
               group a by new
               {
                 grp = a.grp
               }
              )
            join c in  
              (from a in db.A
               from b in a.B
               group b by new
               {
                 a.grp,
                 b.cat
               }
              )            
            on g.Key.grp equals c.Key.grp
            select new
            {
              g.Key.grp,
              c.Key.cat,
              percent = c.Count() * 100 / g.Count()
            };
南街女流氓 2024-11-14 04:39:27

以下是生成所需结果的 SQL 代码:


with grp as (从 a.grp 的组中选择 a.grp,cnt=count(*))
,cat as(选择a.grp,b.cat,cnt=count( * ) * 100/grp.cnt
来自
在 b.aid=a.id
上加入 b
加入 grp 上 grp.grp=a.grp
按 a.grp,b.cat,grp.cnt 分组)
从猫中选择*

这是生成所需结果的 Linq 代码:

var grp=
        from a in db.A
        group a by new{grp=a.grp}
        ;

var cat=
        from a in db.A
        from b in a.B
        group b by new{a.grp,b.cat}
        ;

var q=from g in grp
        join c in cat on g.Key.grp equals c.Key.grp
        select new{g.Key.grp,c.Key.cat,percent=c.Count()*100/g.Count()};

但如果有这样的东西就太好了:

from a in db.A
group a by new{grp=a.grp} into grp
from g in grp
from c in g.B
group c by new{gcnt=grp.Count(),c.cat} into cat
from c in cat
select new{c.A.grp,c.cat,cnt=cat.Count()*100/cat.Key.gcnt}

但它给了我以下运行时异常:
不支持嵌套查询。操作1='GroupBy' 操作2='MultiStreamNest'"

Here is SQL code that generates desired result:


with grp as (select a.grp,cnt=count(*) from a group by a.grp)
,cat as(select a.grp,b.cat,cnt=count( * ) * 100/grp.cnt
from a
join b on b.aid=a.id
join grp on grp.grp=a.grp
group by a.grp,b.cat,grp.cnt)
select * from cat

Here is Linq code that generates desired result:

var grp=
        from a in db.A
        group a by new{grp=a.grp}
        ;

var cat=
        from a in db.A
        from b in a.B
        group b by new{a.grp,b.cat}
        ;

var q=from g in grp
        join c in cat on g.Key.grp equals c.Key.grp
        select new{g.Key.grp,c.Key.cat,percent=c.Count()*100/g.Count()};

But it would be nice to have something like this:

from a in db.A
group a by new{grp=a.grp} into grp
from g in grp
from c in g.B
group c by new{gcnt=grp.Count(),c.cat} into cat
from c in cat
select new{c.A.grp,c.cat,cnt=cat.Count()*100/cat.Key.gcnt}

But it gives me the following runtime exception:
The nested query is not supported. Operation1='GroupBy' Operation2='MultiStreamNest'"

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文