Can I control the GROUP BY in Django 1.3's ORM?
I think this will best be explained with an example.
Here is what the data looks like:
|project            |
|id|name            |
|1 |some project    |
|2 |my other project|

|run                                  |
|id|project_id|start_time   |result   |
|1 |1         |1305732581845|something|
|2 |1         |1305732593721|nothing  |
|3 |2         |1305732343721|nothing  |
|4 |2         |1305732556821|something|
I would like to get the complete record for the latest run of each project. The SQL query would look something like this:
SELECT *, MAX("run"."start_time")
FROM "run"
LEFT OUTER JOIN "project" ON ("run"."project_id" = "project"."id")
GROUP BY "project"."id"
This returns all columns across both tables for the latest run of each project, which is exactly what I need.
However, I can't find a proper way to express this with the Django ORM in Django 1.3. If I do something like this:
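For anyone who wants to try the query against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite, where a bare column selected next to a single MAX() takes its value from the row holding the maximum; that is SQLite-specific behavior rather than standard SQL. The function name latest_runs_sqlite and the table definitions are made up for illustration, and the column list is spelled out instead of using * so the result is easier to read.

```python
import sqlite3


def latest_runs_sqlite():
    """Run the GROUP BY query from the question on the sample data (SQLite only)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema that mirrors the two tables shown in the question.
    conn.executescript("""
        CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE run (
            id INTEGER PRIMARY KEY,
            project_id INTEGER REFERENCES project(id),
            start_time INTEGER,
            result TEXT
        );
        INSERT INTO project VALUES (1, 'some project'), (2, 'my other project');
        INSERT INTO run VALUES
            (1, 1, 1305732581845, 'something'),
            (2, 1, 1305732593721, 'nothing'),
            (3, 2, 1305732343721, 'nothing'),
            (4, 2, 1305732556821, 'something');
    """)
    # "run"."id" and "run"."result" are not in the GROUP BY; SQLite fills them
    # from the row that supplies MAX("run"."start_time") in each group.
    rows = conn.execute("""
        SELECT "run"."id", "project"."name", "run"."result",
               MAX("run"."start_time")
        FROM "run"
        LEFT OUTER JOIN "project" ON ("run"."project_id" = "project"."id")
        GROUP BY "project"."id"
        ORDER BY "project"."id"
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in latest_runs_sqlite():
        print(row)
```

With the sample data this should pick run 2 for project 1 and run 4 for project 2. Other databases may reject the same statement or return arbitrary values for the ungrouped columns.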
Run.objects.annotate(Max('start_time'))
The generated SQL query will look something like:
SELECT
"run"."id", "run"."result", "run"."project_id", "project"."id", "project"."name",
MAX("run"."start_time")
FROM "run"
LEFT OUTER JOIN "project" ON ("run"."project_id" = "project"."id")
GROUP BY "run"."id", "run"."result", "run"."project_id", "project"."id", "project"."name"
This will not return the proper results, because the GROUP BY is not the one I want. I believe that in previous versions of Django the following would explicitly set the GROUP BY clause in the query, but it appears not to work in 1.3:
q = Run.objects.annotate(Max('start_time'))
q.query.group_by = [("project", "id")]
In 1.3 this generates exactly the same query as when the group_by property is left unmodified.
I also tried the approach suggested by the documented behavior of .values() before and after the annotate() call, but it did not work as expected. When I tried this:
q = Run.objects.values('project__id').annotate(Max('start_time')).values('id')
I ended up with a query like this:
SELECT
"run"."id", "run"."project_id",
MAX("run"."start_time")
FROM "run"
LEFT OUTER JOIN "project" ON ("run"."project_id" = "project"."id")
GROUP BY "run"."id", "run"."project_id"
Can anyone point me to the correct way to do what I am doing without any of the following:
- Using raw sql - what would be the point of using an orm when I constantly have to generate my own queries?
- Using .extra(select = {'latest': 'somequery'}) - why should I have to use subqueries when a perfectly valid query without subqueries can give me what I want?
- Using multiple queries to grab the same data - again, why should I have to make multiple queries to get results that are available in one?
Comments (2)
tl;dr: Django does allow you to control the GROUP BY clause, but it restricts it so that the query works across all flavors of SQL, so I can't do what I want.
It has been pointed out to me that the original query I am trying to generate with the Django ORM is not actually valid in all flavors of SQL. Here is a refresher of the query I was looking for:
SELECT *, MAX("run"."start_time")
FROM "run"
LEFT OUTER JOIN "project" ON ("run"."project_id" = "project"."id")
GROUP BY "project"."id"
If you attempt to select something that is not in the GROUP BY in MSSQL, you will actually get an error. So it seems to me that Django shouldn't let me generate a query like this, and I am essentially attempting to solve my problem incorrectly.
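To make the portability point concrete, here is a rough sketch of one form that selects only grouped or aggregated columns inside the GROUP BY, so it does not depend on lenient handling of ungrouped columns. It joins each run to a grouped subquery holding the latest start_time per project. The question explicitly ruled out subqueries, so this is only an illustration of what a stricter database would accept, not a recommended answer; the helper names make_sample_db and latest_runs_portable and the schema are assumptions, and it is run through Python's sqlite3 purely for convenience.

```python
import sqlite3


def make_sample_db():
    """Build an in-memory database holding the sample data from the question."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema that mirrors the two tables shown in the question.
    conn.executescript("""
        CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE run (
            id INTEGER PRIMARY KEY,
            project_id INTEGER REFERENCES project(id),
            start_time INTEGER,
            result TEXT
        );
        INSERT INTO project VALUES (1, 'some project'), (2, 'my other project');
        INSERT INTO run VALUES
            (1, 1, 1305732581845, 'something'),
            (2, 1, 1305732593721, 'nothing'),
            (3, 2, 1305732343721, 'nothing'),
            (4, 2, 1305732556821, 'something');
    """)
    return conn


def latest_runs_portable(conn):
    """Return the full latest run per project without selecting ungrouped columns."""
    return conn.execute("""
        SELECT r.id, r.project_id, r.start_time, r.result, p.name
        FROM run AS r
        JOIN (
            -- Only the grouped column and the aggregate are selected here.
            SELECT project_id, MAX(start_time) AS latest
            FROM run
            GROUP BY project_id
        ) AS m ON m.project_id = r.project_id AND m.latest = r.start_time
        LEFT OUTER JOIN project AS p ON p.id = r.project_id
        ORDER BY r.project_id
    """).fetchall()


if __name__ == "__main__":
    for row in latest_runs_portable(make_sample_db()):
        print(row)
```

Note that if two runs of the same project share the same maximum start_time, this form returns both of them.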
This is rather easy and is covered in detail in the annotation section of the docs, and in no previous version could you manually set the group by.