I hope the title is not misleading.
Anyway, I have two models, both have m2m relationships with a third model.
class Model1(models.Model):
    keywords = models.ManyToManyField(Keyword)
class Model2(models.Model):
    keywords = models.ManyToManyField(Keyword)
Given the keywords for a Model2 instance like this:
keywords2 = model2_instance.keywords.all()
I need to retrieve the Model1 instances that have at least one keyword in keywords2, something like:
Model1.objects.filter(keywords__in=keywords2)
and sort them by the number of keywords that match (I don't think that's possible via the 'in' field lookup). The question is, how do I do this?
I'm thinking of just manually iterating through each Model1 instance and adding it to a dictionary of results keyed by match count, but I need this to scale to, say, tens of thousands of records. Here is how I imagined it would look:
result = {}
keywords2_ids = model2_instance.keywords.values_list('id', flat=True)
for model1 in Model1.objects.all():
    # One COUNT query per Model1 row, which is why this does not scale.
    keywords_matched = model1.keywords.filter(id__in=keywords2_ids).count()
    # list.append() returns None, so append in place instead of assigning its result.
    result.setdefault(str(keywords_matched), []).append(model1)
There must be a faster way to do this. Any ideas?
Comments (1)
You can just switch to raw SQL. Write a custom manager for Model1 that returns the ids of Model1 objects sorted by keyword match count. The SQL is simple: join the two many-to-many tables (Django automatically creates a table to represent each many-to-many relationship) on keyword id, then group by Model1 id and apply the COUNT SQL function. An ORDER BY clause on those counts then produces the sorted Model1 id list you need. In the MySQL query, model2_object_id is the id of model2_instance. Because the database does the counting in a single query, this should be much faster and more scalable than looping in Python.
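The join, GROUP BY, COUNT and ORDER BY steps described above can be sketched as follows. This is only an illustration under stated assumptions, not the query from the answer: it uses Python's built-in sqlite3 module so it runs on its own, the table names app_model1_keywords and app_model2_keywords assume a hypothetical app label "app" and Django's default naming for auto-created many-to-many tables, and ranked_model1_ids is a made-up helper name. With Django's connection.cursor() on MySQL the placeholder would be %s instead of ?.

```python
import sqlite3

# Assumed table layout: Django's auto-created M2M tables hold one row per
# (model, keyword) pair. The "app_" prefix is a hypothetical app label.
SCHEMA = """
CREATE TABLE app_model1_keywords (
    id INTEGER PRIMARY KEY, model1_id INTEGER, keyword_id INTEGER);
CREATE TABLE app_model2_keywords (
    id INTEGER PRIMARY KEY, model2_id INTEGER, keyword_id INTEGER);
"""

# Join the two M2M tables on keyword id, keep only the rows of one Model2
# instance, count matches per Model1 id, and sort by that count.
QUERY = """
SELECT m1.model1_id, COUNT(*) AS matches
FROM app_model1_keywords AS m1
JOIN app_model2_keywords AS m2 ON m1.keyword_id = m2.keyword_id
WHERE m2.model2_id = ?
GROUP BY m1.model1_id
ORDER BY matches DESC, m1.model1_id
"""


def ranked_model1_ids(conn, model2_object_id):
    """Return (model1_id, match_count) pairs, best match first."""
    return conn.execute(QUERY, (model2_object_id,)).fetchall()


# Small in-memory data set to exercise the query.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO app_model1_keywords (model1_id, keyword_id) VALUES (?, ?)",
    [(1, 10), (1, 11), (2, 10), (3, 12), (3, 10), (3, 11), (4, 99)],
)
conn.executemany(
    "INSERT INTO app_model2_keywords (model2_id, keyword_id) VALUES (?, ?)",
    [(7, 10), (7, 11), (7, 12), (8, 99)],
)

ranked = ranked_model1_ids(conn, 7)
print(ranked)
```

Model1 rows with no keyword in common drop out of the join, so they never appear in the result. The ids can then be loaded with something like Model1.objects.filter(id__in=...) and re-sorted in Python, since an id__in filter does not preserve the order of the list.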