Django - Finding the extreme member of each group

Posted on 2024-07-27 17:01:45

I've been playing around with the new aggregation functionality in the Django ORM, and there's a class of problem I think should be possible, but I can't seem to get it to work. The type of query I'm trying to generate is described here.

So, let's say I have the following models -

import datetime

from django.db import models

class ContactGroup(models.Model):
    pass  # .... whatever ....

class Contact(models.Model):
    group = models.ForeignKey(ContactGroup)
    name = models.CharField(max_length=20)
    email = models.EmailField()
    # ...

class Record(models.Model):
    contact = models.ForeignKey(Contact)
    group = models.ForeignKey(ContactGroup)
    # Pass the callable (not its result) so the default is evaluated per row.
    record_date = models.DateTimeField(default=datetime.datetime.now)

    # ... name, email, and other fields that are in Contact ...

So, each time a Contact is created or modified, a new Record is created that saves the information as it appears in the contact at that time, along with a timestamp. Now, I want a query that, for example, returns the most recent Record instance for every Contact associated with a ContactGroup. In pseudo-code:

group = ContactGroup.objects.get(...)
records_i_want = group.record_set.most_recent_record_for_every_contact()

Once I get this figured out, I just want to be able to throw a filter(record_date__lt=some_date) on the queryset, and get the information as it existed at some_date.

Anybody have any ideas?

Edit: It seems I'm not really making myself clear. Using models like these, I want a way to do the following with the pure Django ORM (no extra()):

ContactGroup.record_set.extra(where=["history_date = (select max(history_date) from app_record r where r.id=app_record.id and r.history_date <= '2009-07-18')"])

Putting the subquery in the where clause is only one strategy for solving this problem; the others are pretty well covered by the first link I gave above. I know where-clause subselects are not possible without using extra(), but I thought perhaps one of the other ways was made possible by the new aggregation features.


Comments (2)

笑梦风尘 2024-08-03 17:01:45

It sounds like you want to keep records of changes to objects in Django.

Pro Django has a section in chapter 11 (Enhancing Applications) in which the author shows how to create a model that uses another model as a client and tracks its inserts, deletes, and updates. The model is generated dynamically from the client definition and relies on signals. The code shows a most_recent() function, but you could adapt this to obtain the object state on a particular date.

I assume it is the tracking in Django that is problematic, not the SQL to obtain this, right?

苍景流年 2024-08-03 17:01:45

First of all, I'll point out that:

ContactGroup.record_set.extra(where=["history_date = (select max(history_date) from app_record r where r.id=app_record.id and r.history_date <= '2009-07-18')"])

will not get you the same effect as:

records_i_want = group.record_set.most_recent_record_for_every_contact()

The first query returns every record associated with a particular group (or associated with any of the contacts of a particular group) whose record_date is on or before the date/time specified in the extra(). The subquery is correlated on the record's own id, so each row is only ever compared with itself. Run this in the shell and then do this to review the query Django created:

from django.db import connection
connection.queries[-1]['sql']  # queries are only recorded when DEBUG = True

which reveals:

'SELECT "contacts_record"."id", "contacts_record"."contact_id", "contacts_record"."group_id", "contacts_record"."record_date", "contacts_record"."name", "contacts_record"."email" FROM "contacts_record" WHERE "contacts_record"."group_id" = 1  AND record_date = (select max(record_date) from contacts_record r where r.id=contacts_record.id and r.record_date <= \'2009-07-18\')'

Not exactly what you want, right?
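To make the difference concrete, here is a minimal sketch that uses Python's built-in sqlite3 module instead of Django. The table layout and the four rows are made up for illustration, and correlating on contact_id is only an assumption about what the subquery was meant to do. With r.id the subquery matches every row on or before the cutoff; with r.contact_id it keeps only the latest such row per contact.

```python
import sqlite3

# Hypothetical stand-in for the contacts_record table, with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts_record ("
    "id INTEGER PRIMARY KEY, contact_id INTEGER, "
    "group_id INTEGER, record_date TEXT)"
)
rows = [
    (1, 1, 1, "2009-07-01"),
    (2, 1, 1, "2009-07-10"),
    (3, 2, 1, "2009-07-05"),
    (4, 2, 1, "2009-07-20"),  # later than the cutoff used below
]
conn.executemany("INSERT INTO contacts_record VALUES (?, ?, ?, ?)", rows)

# Same shape as the SQL above; only the correlation column varies.
QUERY = """
SELECT id FROM contacts_record
WHERE group_id = 1
  AND record_date = (
      SELECT MAX(r.record_date) FROM contacts_record r
      WHERE r.{column} = contacts_record.{column}
        AND r.record_date <= '2009-07-18')
ORDER BY id
"""


def matching_ids(column):
    """Return the ids selected when the subquery is correlated on `column`."""
    return [row[0] for row in conn.execute(QUERY.format(column=column))]


by_id = matching_ids("id")               # each row compared only with itself
by_contact = matching_ids("contact_id")  # rows compared within one contact

print(by_id)       # [1, 2, 3]
print(by_contact)  # [2, 3]
```

Correlating on id filters out only record 4, which is later than the cutoff, while correlating on contact_id leaves one record per contact.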

Now, the aggregation feature is used to retrieve aggregated data, not the objects associated with that data. So if you're trying to use aggregation to minimize the number of queries executed when obtaining group.record_set.most_recent_record_for_every_contact(), you won't succeed.

Without using aggregation, you can get the most recent record for all contacts associated with a group using:

[x.record_set.all().order_by('-record_date')[0] for x in group.contact_set.all()]

Using aggregation, the closest I could get to that was:

from django.db.models import Max
group.record_set.values('contact').annotate(latest_date=Max('record_date'))

The latter returns a list of dictionaries like:

[{'contact': 1, 'latest_date': somedate }, {'contact': 2, 'latest_date': somedate }]

So you get one entry for each contact in a given group, along with the latest record date associated with it.
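In SQL terms, that aggregation is roughly a GROUP BY on the contact column. The sketch below reproduces it with Python's built-in sqlite3 module rather than Django. The table, the rows, and the latest_dates helper are all hypothetical, and the cutoff parameter is meant to mirror adding filter(record_date__lt=some_date) before values().

```python
import sqlite3

# Hypothetical stand-in for the contacts_record table, with made-up rows.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be converted to a dict
conn.execute(
    "CREATE TABLE contacts_record ("
    "id INTEGER PRIMARY KEY, contact_id INTEGER, "
    "group_id INTEGER, record_date TEXT)"
)
conn.executemany(
    "INSERT INTO contacts_record VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, "2009-07-01"),
        (2, 1, 1, "2009-07-10"),
        (3, 2, 1, "2009-07-05"),
        (4, 2, 1, "2009-07-20"),
    ],
)


def latest_dates(group_id, before):
    """One dict per contact in the group: the latest record_date before `before`."""
    cursor = conn.execute(
        "SELECT contact_id AS contact, MAX(record_date) AS latest_date "
        "FROM contacts_record "
        "WHERE group_id = ? AND record_date < ? "
        "GROUP BY contact_id ORDER BY contact_id",
        (group_id, before),
    )
    return [dict(row) for row in cursor]


result = latest_dates(1, "2009-07-18")
print(result)
# [{'contact': 1, 'latest_date': '2009-07-10'},
#  {'contact': 2, 'latest_date': '2009-07-05'}]
```

As with the ORM version, the result carries only the contact id and the date, not the other columns of the matching record.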

Anyway, the minimum number of queries is probably 1 + the number of contacts in a group. If you are interested in obtaining the result using a single query, that is also possible, but you'll have to construct your models in a different way. But that's a totally different aspect of your problem.

I hope this helps you understand how to approach the problem using aggregation or the regular ORM functions.
