如何在“Q”表达式中使用 Django 隐藏的“through”模型
TL;DR; 在 Q
表达式中包含 through
表(针对 m2m 相关表字段)的正确方法是什么?例如,我如何引用当您拥有 ManyToManyField 时自动创建的隐藏表,而不是引用 Q(compounds__name__iexact="citrate")
中的 m2m 字段“compounds”?我的猜测是 Q(through_peakgroup_compound__name__iexact="citrate"),但这会引发异常。
长版本...
我昨天了解到,您可以在 Q
表达式中的关键路径^ 中使用 through
表达式。我了解到,当使用 M:M 相关表中的字段进行过滤时,生成的查询集更像是真正的 SQL 连接。我的意思在下面的示例中得到了更好的解释,但要点是,如果我通过 m2m 关系在字段上使用搜索词,我只会返回与这些搜索词匹配的 m2m 相关表记录,而不是该表中的每个链接记录。
我在 shell 中进行了实验,发现可以使用 ModelA.m2mModelB.through.filter() 来完成此操作。
例如,我有 56 个 PeakGroup 记录,它们通过多对多关系链接到相同的 2 个化合物(柠檬酸盐和异柠檬酸盐)。
这是我在不使用 through
时能得到的最接近的结果:
仅查询柠檬酸盐:
In [93]: pgs = PeakGroup.objects.filter(Q(compounds__name__iexact="citrate")).distinct()
In [94]: pgs.count()
Out[94]: 56
In [95]: for i in range(0, 56):
...: for compound in pgs[i].compounds.all():
...: print(", ".join(map(lambda s: str(s), [pgs[i].id, pgs[i].name, compound.id, compound.name])))
...:
4, citrate/isocitrate, 12, citrate
4, citrate/isocitrate, 28, isocitrate
11, citrate/isocitrate, 12, citrate
11, citrate/isocitrate, 28, isocitrate
...
请注意,它给了我不需要的异柠檬酸盐记录,在 SQL 中,这些记录不会包含在左连接的行。 (我能够通过使用过滤器的重新根^^版本提供带有查询集的 Prefetch
来解决这个问题,但它在下一个案例中重新引入了问题,所以我赢了不谈这个。)
查询柠檬酸盐或异柠檬酸盐:
In [90]: pgs = PeakGroup.objects.filter(Q(compounds__name__iexact="citrate") | Q(compounds__name__iexact="isocitrate")).distinct()
In [91]: pgs.count()
Out[91]: 56
In [92]: for i in range(0, 56):
...: for compound in pgs[i].compounds.all():
...: print(", ".join(map(lambda s: str(s), [pgs[i].id, pgs[i].name, compound.id, compound.name])))
...:
4, citrate/isocitrate, 12, citrate
4, citrate/isocitrate, 28, isocitrate
11, citrate/isocitrate, 12, citrate
11, citrate/isocitrate, 28, isocitrate
...
上面的内容很好,我得到了我查询的所有内容,但如果这些显示在 112 个单独的行上(每个化合物都在模板中显示另一行,其中显示重复的 PeakGroup 数据。
这是我使用 through 时得到的结果(这就是我想要的):
仅查询柠檬酸盐:
In [83]: pgs = PeakGroup.compounds.through.objects.filter(Q(compound__name__iexact="citrate"))
In [84]: pgs.count()
Out[84]: 56
In [85]: for i in range(0, 56):
...: print(", ".join(map(lambda s: str(s), [pgs[i].peakgroup.id, pgs[i].peakgroup.name, pgs[i].compound.id, pgs[i].compound.name])))
...:
4, citrate/isocitrate, 12, citrate
11, citrate/isocitrate, 12, citrate
...
^^完美!
查询柠檬酸盐或异柠檬酸盐:
In [57]: pgs = PeakGroup.compounds.through.objects.filter(Q(compound__name__iexact="citrate") | Q(compound__name__iexact="isocitrate"))
In [58]: pgs.count()
Out[58]: 112
In [59]: for i in range(0, 112):
...: print(", ".join(map(lambda s: str(s), [pgs[i].peakgroup.id, pgs[i].peakgroup.name, pgs[i].compound.id, pgs[i].compound.name])))
...:
4, citrate/isocitrate, 12, citrate
4, citrate/isocitrate, 28, isocitrate
11, citrate/isocitrate, 12, citrate
11, citrate/isocitrate, 28, isocitrate
18, citrate/isocitrate, 12, citrate
18, citrate/isocitrate, 28, isocitrate
...
^^完美!
但是,我在这个堆栈答案下的评论中了解到,我应该能够在 Q
表达式仅提供给单个过滤器,而不引用 .compounds.through
。由于仅在 PeakGroup 的单个过滤器中完成所有这些工作将使我的重构工作大大减少,因此我想了解如何做我被告知可以做的事情。
我开始使用反复试验在 shell 中进行实验(因为我在文档中找不到描述此功能的任何内容)。我想象表达式可能看起来像这样:
PeakGroup.objects.filter(Q(through_peakgroup_compound__compound__name__exact="citrate") | Q(through_peakgroup_compound__compound__name__exact="isocitrate")).count()
但我一直无法弄清楚如何构造包含直通模型的“路径”...
每次尝试都会导致类似以下的错误:
FieldError: Cannot resolve keyword 'through_peakgroup_compound' into field.
也许在该链接堆栈的注释中答案是,存在沟通不畅,并且这个技巧实际上不可能通过直接应用于 PeakGroup.objects
的单个过滤器中的 Q 表达式来实现?那么,使用什么而不是“through_peakgroup_compound__compound__name__exact”的正确“路径”呢?
以下是模型关系:
class PeakGroup(Model):
compounds = models.ManyToManyField(
Compound,
related_name="peak_groups",
help_text="The compound(s) that this PeakGroup is presumed to represent.",
)
class Compound(Model):
name = models.CharField(
max_length=256,
unique=True,
)
动机
我希望具有复杂 Q 表达式的单个过滤器是可能的,因为我们有一个相当大且复杂的高级搜索界面,它使用 3 个复合视图,每个视图结合了大约十几个模型,其中包括一些米:米关系。用户可以使用任何模型中的术语使用 and
-groups 和 or
-groups 构建复杂的查询。目前,结果使用单行单元格中的分隔符组合来自这些 M:M 模型的记录,但下一版本的新要求是在这些 M:M 相关模型之一上拆分输出行,因此使用在上面的示例中,一行将在“化合物”列中显示“柠檬酸盐”,另一行将显示“异柠檬酸盐”,而不是当前在单行上显示“柠檬酸盐;异柠檬酸盐”。如果他们搜索“柠檬酸盐”,结果中不会出现包含“异柠檬酸盐”的行。
^ - 通过“关键路径”,我的意思是在路径中串在一起的外键,就像您提供给 .filter()
或 Q
表达式一样,例如modelBkey__modelCkey__modelCfieldname
部分: ModelA.objects.filter(modelBkey__modelCkey__modelCfieldname__exact="searchterm")
或包含相关的模板中的字段,例如 {{ queryset.modelBkey.modelCkey.modelCfieldname }}
。
^^ - 通过“过滤器的重新根版本”,我的意思是,我采用原始过滤器/Q 表达式,并更改关键路径以从 m2m 相关表开始。它工作得很好,但并不能解决我的全部问题。
TL;DR; What is the correct way to include a through
table (for a m2m related table field) in a Q
expression? E.g. instead of referencing the m2m field "compounds" in Q(compounds__name__iexact="citrate")
, how do I reference the hidden through table automatically created when you have a ManyToManyField? My guess was Q(through_peakgroup_compound__name__iexact="citrate")
, but that throws an exception.
Long version...
I learned yesterday that you can use a through
expression in a key path^ that's in a Q
expression. And I learned that when filtering using fields in M:M related tables, the resulting queryset is more like a true SQL join. What I mean by this is better explained in my examples below, but the gist is, if I use search terms on fields through m2m relations, I only get back m2m related table records that match those search terms instead of every linked record in that table.
I experimented in the shell and learned that I could accomplish this using ModelA.m2mModelB.through.filter()
.
For example, I have 56 PeakGroup records that link to the same 2 compounds (citrate and isocitrate) through a many-to-many relation.
This is the closest I could get to what I wanted when I wasn't using through
:
Querying for citrate only:
In [93]: pgs = PeakGroup.objects.filter(Q(compounds__name__iexact="citrate")).distinct()
In [94]: pgs.count()
Out[94]: 56
In [95]: for i in range(0, 56):
...: for compound in pgs[i].compounds.all():
...: print(", ".join(map(lambda s: str(s), [pgs[i].id, pgs[i].name, compound.id, compound.name])))
...:
4, citrate/isocitrate, 12, citrate
4, citrate/isocitrate, 28, isocitrate
11, citrate/isocitrate, 12, citrate
11, citrate/isocitrate, 28, isocitrate
...
Note that it gives me unwanted isocitrate records, which in SQL, would not be included in the left-joined rows. (I was able to get around this case by supplying a Prefetch
with a queryset using a re-rooted^^ version of the filter, but it re-introduced the problem with the next case, so I won't go into that.)
Querying for citrate or isocitrate:
In [90]: pgs = PeakGroup.objects.filter(Q(compounds__name__iexact="citrate") | Q(compounds__name__iexact="isocitrate")).distinct()
In [91]: pgs.count()
Out[91]: 56
In [92]: for i in range(0, 56):
...: for compound in pgs[i].compounds.all():
...: print(", ".join(map(lambda s: str(s), [pgs[i].id, pgs[i].name, compound.id, compound.name])))
...:
4, citrate/isocitrate, 12, citrate
4, citrate/isocitrate, 28, isocitrate
11, citrate/isocitrate, 12, citrate
11, citrate/isocitrate, 28, isocitrate
...
The above is good in that, I get everything I queried for, but I can't paginate this if these are displayed on 112 separate rows (each compound is on another row displayed with duplicate PeakGroup data) in the template.
Here's what I get when I do use through (this is what I want):
Querying for citrate only:
In [83]: pgs = PeakGroup.compounds.through.objects.filter(Q(compound__name__iexact="citrate"))
In [84]: pgs.count()
Out[84]: 56
In [85]: for i in range(0, 56):
...: print(", ".join(map(lambda s: str(s), [pgs[i].peakgroup.id, pgs[i].peakgroup.name, pgs[i].compound.id, pgs[i].compound.name])))
...:
4, citrate/isocitrate, 12, citrate
11, citrate/isocitrate, 12, citrate
...
^^perfect!
Querying for citrate or isocitrate:
In [57]: pgs = PeakGroup.compounds.through.objects.filter(Q(compound__name__iexact="citrate") | Q(compound__name__iexact="isocitrate"))
In [58]: pgs.count()
Out[58]: 112
In [59]: for i in range(0, 112):
...: print(", ".join(map(lambda s: str(s), [pgs[i].peakgroup.id, pgs[i].peakgroup.name, pgs[i].compound.id, pgs[i].compound.name])))
...:
4, citrate/isocitrate, 12, citrate
4, citrate/isocitrate, 28, isocitrate
11, citrate/isocitrate, 12, citrate
11, citrate/isocitrate, 28, isocitrate
18, citrate/isocitrate, 12, citrate
18, citrate/isocitrate, 28, isocitrate
...
^^perfect!
However, I learned in the comments under this stack answer, that I should be able to accomplish the same thing in the Q
expression supplied to a single filter only, without referencing .compounds.through
. Since accomplishing all this in a single filter off PeakGroup alone would make my refactor massively less involved, I'd like to learn how to do what I was told was possible.
I began to experiment in the shell using trial and error (since I could not find anything in the docs that describe this capability). I imagined the expression might look something like this:
PeakGroup.objects.filter(Q(through_peakgroup_compound__compound__name__exact="citrate") | Q(through_peakgroup_compound__compound__name__exact="isocitrate")).count()
but I have been unable to figure out how to construct the "path" that includes the through model...
Every attempt results in an error something like:
FieldError: Cannot resolve keyword 'through_peakgroup_compound' into field.
Perhaps in the comments of that linked stack answer, there was a miscommunication, and this trick is actually not possible via a Q-expression in a single filter applied directoly to PeakGroup.objects
? So what is the correct "path" to use instead of "through_peakgroup_compound__compound__name__exact"?
Here's the model relations:
class PeakGroup(Model):
compounds = models.ManyToManyField(
Compound,
related_name="peak_groups",
help_text="The compound(s) that this PeakGroup is presumed to represent.",
)
class Compound(Model):
name = models.CharField(
max_length=256,
unique=True,
)
Motivation
The reason I'd hoped a single filter with a complex Q expression was possible is because we have a rather large and sophisticated advanced search interface that uses 3 composite views that combine roughly a dozen models each, including a few M:M relations. Users can construct complex queries with and
-groups and or
-groups with terms from any of the models. Currently, the results combine records from those M:M models using a delimiter in a cell on a single row, but a new requirement for the next release is to split the output rows on one of those M:M related models, so using the example above, one row would display "citrate" and another row would display "isocitrate" in the "Compound" column, instead of currently "citrate; isocitrate" on a single row. And if they search for "citrate", no row containing "isocitrate" would be among the results.
^ - By "key path", I mean the foreign keys strung together in a path, like what you would supply to a .filter()
or a Q
expression, like the modelBkey__modelCkey__modelCfieldname
part of: ModelA.objects.filter(modelBkey__modelCkey__modelCfieldname__exact="searchterm")
or to include a related field in a template like {{ queryset.modelBkey.modelCkey.modelCfieldname }}
.
^^ - By "re-rooted version of the filter", I mean, I take the original filter/Q expressions and I change the key paths to start from the m2m related table. It works quite well, but it doesn't solve my whole problem.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
除非有人能够证明,否则似乎不可能在任一表中包含隐藏
through
表(我将其称为SQL“链接表”)Q()
表达式的.filter()
。只有显式定义通过
表似乎能够以这种方式查询,例如:然后您可以将该表包含在过滤器和Q表达式中,例如:
>但是,在这种情况下,您仍然仅将
PeakGroup
记录发送到模板,并且必须在嵌套for
中重新检索关联的测量/化合物记录循环,所以这不允许我做我希望它能做的事。请注意,在演示我想要的内容时,我有一个我没有意识到的缺陷。我没有意识到在
PeakGroup.compounds.through
情况下发送到模板的记录是链接表记录(即Measurement
记录),而不是PeakGroup
记录了我所忽略的内容。这就是为什么这是一个问题...如果您在视图中有多个链接表(就像我一样),您将在下一个表中遇到相同的问题(除非您发送多个查询集(我们现有的代码库确实这样做)不支持)),因此以这种方式使用直通/链接表不是一个可扩展的解决方案。它可以适用于一种 M:M 关系,但不能适用于更多关系。
我确实找到了一个解决方法,它确实解决了总体问题(想要显示和分页一组查询结果,这些结果通过多个链接表,就好像它是真正的 SQL 左连接一样) )。
使我的解决方法有效的基本前提是,在将查询集发送到模板之前,查询集可以访问完整/真实的连接数据。它将每个重复的 PeakGroup 记录与单个化合物记录链接起来,这正是我在模板中需要的。只有当您将结果查询集发送到模板时,您才会失去对这些关联的访问权限,这就是为什么每个人总是简单地在 Django 模板中使用嵌套的
for
循环来显示 M:M 相关记录。请注意,Django 会返回重复的 PeakGroup 记录,每个记录都连接到一个
Compound
记录(正如您在左连接中所期望的那样),并且您可以在将查询集发送到模板,使用F
表达式。解决方法
请注意,您无需显式定义
through
表即可实现此功能。< /em>利用对完整左连接数据的访问和将 M:M 关联嵌入到查询集中,以便可以在模板中重建它们,您需要做两件事:
.distinct(join_key_list)
,提供使连接不同的主键.annotate(**mm_key_values)
使用 M:M 相关表主键值创建注释(通过F
表达式)使用我的前 2 个示例,它看起来像这样:
仅查询柠檬酸盐:
查询柠檬酸盐或异柠檬酸盐:
令人烦恼的是,您必须循环过滤未使用的 M:M 相关记录,但这很有效,因为它允许我:
注意,我仍然使用向其提供
的
对象包含提供给.prefetch_lated()
(此处未显示) Prefetch()queryset
选项的原始过滤器的副本,只有过滤器中提供的路径才会“重新定位”^^。这提供了显着的性能提升。我还编写了一个名为
get_manytomany_rec
的模板标记 (simple_tag
),它采用 M:M 表记录 (pg.compounds.all
) 和注释字段值(例如pg.compound
),它检索该行相应的复合记录。我这样做的方式是,如果我将 M:M 表设置为不分割结果行,它只会返回提供的pgs.compounds.all
记录。模板标记的用法如下所示:(注意,我可以翻转一个设置,使 M:M 相关表在 1 行上显示为
;
分隔值,或者使它将 1 行分成多行。)我希望有一个比此解决方案更好/更简单的解决方案,但在有人提供更好的答案之前,这是我最好的解决方案可以做。
Unless someone is able to demonstrate otherwise, it does not appear to be possible to include a hidden
through
table (what I would call an SQL "linking table") in either a.filter()
ofQ()
expression. Only explicitly definedthrough
tables appear to be able to be queried in such a way, e.g.:Then you can include that table in filter and Q expressions, such as:
However, in this case, you still only are sending
PeakGroup
records to the template and the associated Measurement/Compound records must be retrieved anew in a nestedfor
loop, so this doesn't allow me to do what I'd hoped it would do.Note that in the demonstration of what I wanted, I had a flaw that I didn't realize. I didn't realize that the records that would be sent to the template in the
PeakGroup.compounds.through
case are the linking table records (i.e. theMeasurement
records), notPeakGroup
records as I had overlooked. Here's why that's a problem...If you have multiple linking tables in the view (as I do), you will run into the same problem at the next table, (unless you're sending multiple querysets (which our existing code base does not support)), so using a through/linking table in this manner is not a scalable solution. It could work for one M:M relation, but no more.
I did figure out a work-around, which does solve the overall problem (wanting to display and paginate a single set of query results that go through multiple linking tables as if it was a true SQL left-join).
The basic premise which makes my work-around work is that before you send a queryset to the template, a queryset has access to the full/true joined data. It links each duplicate PeakGroup record with a single Compound record, which is what I need in the template. You only lose access to those associations when you send the resulting queryset to the template, which is why everyone always simple uses nested
for
loops in templates in Django to display M:M related records.Note, Django will give you back duplicate PeakGroup records, each connected to a single
Compound
record (as you would expect in a left join) and you can access it, before sending the queryset to the template, using anF
expression.Work-around
Note, you don't need to explicitly define a
through
table for this to work.To take advantage of access to the full-left-join data and embed the M:M associations in the queryset so that they can be reconstructed in the template, you need 2 things:
.distinct(join_key_list)
, supplying the primary keys that make the join distinct.annotate(**mm_key_values)
that creates an annotation with the M:M related table primary key value (by way of theF
expression)Using my first 2 examples, it would look like this:
Querying only for citrate:
Querying for either citrate or isocitrate:
It's annoying that you have to loop to filter the unused M:M related records, but this works well, as it allows me to:
Note, I do still employ a
.prefetch_related()
(not shown here) to which I supplyPrefetch()
objects that contain a copy of the original filter supplied to thequeryset
option, only the paths supplied in the filter are "re-rooted"^^. This provides a significant performance boost.I also wrote a template tag (
simple_tag
) namedget_manytomany_rec
that takes the M:M table records (pg.compounds.all
) and the annotation field value (e.g.pg.compound
) and it retrieves the corresponding Compound record for that row. And I did it in a way that if I set that M:M table to not split the result rows, it just returns the suppliedpgs.compounds.all
records. The usage of the template tag looks like this:(Note, I have a setting I can flip to either make the M:M related table display as a
;
delimited value on 1 row or make it split the 1 row into multiple rows.)I would love to have a better/simpler solution than this work-around, but until someone provides a better answer, this was the best I could do.