用于检索 mptt 查询集祖先查询集的高效函数
有人有一个有效的算法来检索 mptt 查询集的所有祖先吗?到目前为止我能想到的最好的方法是这样的:
def qs_ancestors(queryset):
if isinstance(queryset, EmptyQuerySet):
return queryset
queryset_aggs = queryset.values_list('tree_id', 'level').annotate(max_lft=Max('lft'), min_rght=Min('rght'))
new_queryset = queryset.none()
for tree_id, level, max_lft, min_rght in queryset_aggs:
ancestors = MyModel.objects.filter(
tree_id=tree_id,
level__lt=level,
lft__lte=max_lft,
rght__gte=min_rght,
)
new_queryset = ancestors | new_queryset
return new_queryset
这种方法有两个问题:
- 如果存在彼此不相邻的分支(即它实际上不起作用),它
- 就会失败它的效率非常低,因为它最终查询中会有 number_of_trees*number_of_levels 子句,这些子句可能会很快变得非常大,
我愿意在其他地方缓存祖先,但我想不出一种有效的方法。我考虑添加一个带有逗号分隔的祖先 id 列表的字段,然后在 extra 中执行 GROUP_CONCAT (我在 MySQL 中),但我认为这可能会变得巨大/缓慢。
Does anybody have an efficient algorithm to retrieve all ancestors of an mptt queryset? The best I could think of so far is something like this:
def qs_ancestors(queryset):
if isinstance(queryset, EmptyQuerySet):
return queryset
queryset_aggs = queryset.values_list('tree_id', 'level').annotate(max_lft=Max('lft'), min_rght=Min('rght'))
new_queryset = queryset.none()
for tree_id, level, max_lft, min_rght in queryset_aggs:
ancestors = MyModel.objects.filter(
tree_id=tree_id,
level__lt=level,
lft__lte=max_lft,
rght__gte=min_rght,
)
new_queryset = ancestors | new_queryset
return new_queryset
There are two problems with this approach:
- It fails if there are branches that aren't next to each other (ie it doesn't really work)
- It is highly inefficient because it ends up have
number_of_trees*number_of_levels
clauses in the final query, which can get very large very fast
I am open to caching the ancestors somewhere else, but I cannot think of a way to do efficiently. I considered adding a field with a comma separated list of ancestor's ids and then doing a GROUP_CONCAT
(I am in MySQL) inside an extra, but I think that could get huge/slow.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
我曾经不得不编写一个类似的算法。我有一个显示 MPTT 树的视图,它是一棵非常大的树,因此我无法在 HTML 模板中加载它的所有数据。所以我在初始加载时只显示根节点,并使用Ajax加载其他节点。
它运行良好,直到我的老板要求我实现“搜索”选项。搜索必须查找所有节点,并在找到匹配项时分解树。我花了一段时间才弄清楚这一点,但我终于明白了。这是 a 提出的解决方案:
它只需要运行两个查询,作为参数传递的初始 qs 和函数返回的最终查询集。
tree_list
是一个字典,用于存储已添加到查询集中的节点,这是一种优化,对于算法的工作来说并不是必需的。但由于我正在处理一棵相对较大的树,所以我必须将其包括在内。我想你可以将此方法变成一个管理器并使其更加通用,即使其适用于任何 MPTT 模型,而不仅仅是
YourModel
I had to write a similar algorithm once. I had a view displaying a MPTT tree, it's was a very large tree so I couldn't load all of it's data in the HTML template. So I displayed only the root nodes in the initial load and used Ajax to load the other nodes.
It was working good until my boss asked me to implement a 'search' option. The search had to look in all nodes and explode the tree were it found a match. It took me a while to figure this out, but I finally got it. Here is the solution a came up with:
It needs only two queries to run, the initial
qs
passed as an argument and the final queryset returned by the function. Thetree_list
is a dictionary that stores nodes that were already added to the queryset, it's an optimization and it's not necessary for the algorithm to work. But since I was working with a relative big tree I had to include it.I guess you could turn this method into a manager and make it more generic, i.e. make it work for any MPTT model and not only
YourModel
怎么样:
仍然是 len(queryset) 子句。您可以通过执行预处理过程来减少子句的数量,该预处理过程会删除查询集中其他对象的祖先对象,例如:
虽然上面的代码片段只是一个示例,但如果满足以下条件,您可能需要使用 BST:您的查询集包含大量对象。
不过,您必须测试与较大的数据库查询相比,这是否会提高速度。
How about:
It's still len(queryset) clauses. You could potentially reduce the number of clauses a bit by doing a preprocess pass that removes objects that are ancestors of other objects in the queryset, something like:
Although the above snippet is only an example, you'll probably want to use a BST if your querset contains a significant amount of objects.
You'll have to test if this yeilds an increase in speed vs. the larger DB query, though.