在 Django ORM 中旋转数据和复杂注释

发布于 2024-11-06 07:50:49 字数 1479 浏览 2 评论 0原文

Django 中的 ORM 让我们可以轻松地根据相关数据对查询集进行注释（添加字段），但我找不到一种方法来为相关数据的不同过滤子集获取多个注释。

这是与 django-helpdesk 有关的问题，这是一个由 Django 驱动的开源麻烦 -票务跟踪器。我需要像这样进行数据透视以用于图表和报告目的

考虑这些模型：

CHOICE_LIST = (
    ('open', 'Open'),
    ('closed', 'Closed'),
)

class Queue(models.model):
    name = models.CharField(max_length=40)

class Issue(models.Model):
    subject = models.CharField(max_length=40)
    queue = models.ForeignKey(Queue)
    status = models.CharField(max_length=10, choices=CHOICE_LIST)

这个数据集：

队列：

ID | Name
---+------------------------------
1  | Product Information Requests
2  | Service Requests

问题：

ID | Queue | Status
---+-------+---------
1  | 1     | open
2  | 1     | open
3  | 1     | closed
4  | 2     | open
5  | 2     | closed
6  | 2     | closed
7  | 2     | closed

我希望看到注释/聚合看起来像什么就像这样：

Queue ID | Name                          | open | closed
---------+-------------------------------+------+--------
1        | Product Information Requests  | 2    | 1
2        | Service Requests              | 1    | 3

用 Excel 的话说，这基本上是一个交叉表或数据透视表。我目前正在使用一些自定义 SQL 查询构建此输出，但是如果我可以转而使用 Django ORM，我可以更轻松地动态过滤数据，而无需在 SQL 中危险地插入 WHERE 子句。

对于“奖励积分”：如果数据透视字段（上例中的status）是日期，并且我们希望列是月/周/季度/天，那么如何做到这一点？

原文

The ORM in Django lets us easily annotate (add fields to) querysets based on related data, hwoever I can't find a way to get multiple annotations for different filtered subsets of related data.

This is being asked in relation to django-helpdesk, an open-source Django-powered trouble-ticket tracker. I need to have data pivoted like this for charting and reporting purposes

Consider these models:

CHOICE_LIST = (
    ('open', 'Open'),
    ('closed', 'Closed'),
)

class Queue(models.model):
    name = models.CharField(max_length=40)

class Issue(models.Model):
    subject = models.CharField(max_length=40)
    queue = models.ForeignKey(Queue)
    status = models.CharField(max_length=10, choices=CHOICE_LIST)

And this dataset:

Queues:

ID | Name
---+------------------------------
1  | Product Information Requests
2  | Service Requests

Issues:

ID | Queue | Status
---+-------+---------
1  | 1     | open
2  | 1     | open
3  | 1     | closed
4  | 2     | open
5  | 2     | closed
6  | 2     | closed
7  | 2     | closed

I would like to see an annotation/aggregate look something like this:

Queue ID | Name                          | open | closed
---------+-------------------------------+------+--------
1        | Product Information Requests  | 2    | 1
2        | Service Requests              | 1    | 3

This is basically a crosstab or pivot table, in Excel parlance. I am currently building this output using some custom SQL queries, however if I can move to using the Django ORM I can more easily filter the data dynamically without doing dodgy insertion of WHERE clauses in my SQL.

For "bonus points": How would one do this where the pivot field (status in the example above) was a date, and we wanted the columns to be months / weeks / quarters / days?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

抱着落日 2024-11-13 07:50:49

你有Python，就用它吧。

from collections import defaultdict
summary = defaultdict( int )
for issue in Issues.objects.all():
    summary[issue.queue, issue.status] += 1

现在，您的 summary 对象已将队列、状态作为二元组键。您可以使用各种模板技术直接显示它。

或者，如果更简单的话，您可以将其重新组合成类似表格的结构。

table = []
queues = list( q for q,_ in summary.keys() )
for q in sorted( queues ):
    table.append( q.id, q.name, summary.count(q,'open'), summary.count(q.'closed') )

您有很多很多用于制作数据透视表的 Python 技术。

如果您进行测量，您可能会发现像这样的主要使用 Python 的解决方案实际上比纯 SQL 解决方案更快。为什么？映射比需要将排序作为 GROUP-BY 一部分的 SQL 算法更快。

You have Python, use it.

from collections import defaultdict
summary = defaultdict( int )
for issue in Issues.objects.all():
    summary[issue.queue, issue.status] += 1

Now your summary object has queue, status as a two-tuple key. You can display it directly, using various template techniques.

Or, you can regroup it into a table-like structure, if that's simpler.

table = []
queues = list( q for q,_ in summary.keys() )
for q in sorted( queues ):
    table.append( q.id, q.name, summary.count(q,'open'), summary.count(q.'closed') )

You have lots and lots of Python techniques for doing pivot tables.

If you measure, you may find that a mostly-Python solution like this is actually faster than a pure SQL solution. Why? Mappings can be faster than SQL algorithms which require a sort as part of a GROUP-BY.

回复收藏 0 原文