用于检索 mptt 查询集祖先查询集的高效函数

发布于 2024-11-17 01:55:10 字数 895 浏览 12 评论 0原文

有人有一个有效的算法来检索 mptt 查询集的所有祖先吗？到目前为止我能想到的最好的方法是这样的：

def qs_ancestors(queryset):
    if isinstance(queryset, EmptyQuerySet):
        return queryset
    queryset_aggs = queryset.values_list('tree_id', 'level').annotate(max_lft=Max('lft'), min_rght=Min('rght'))
    new_queryset = queryset.none()
    for tree_id, level, max_lft, min_rght in queryset_aggs:
        ancestors = MyModel.objects.filter(
           tree_id=tree_id,
           level__lt=level, 
           lft__lte=max_lft,
           rght__gte=min_rght,
        )
        new_queryset = ancestors | new_queryset
    return new_queryset

这种方法有两个问题：

如果存在彼此不相邻的分支（即它实际上不起作用），它
就会失败它的效率非常低，因为它最终查询中会有 number_of_trees*number_of_levels 子句，这些子句可能会很快变得非常大，

我愿意在其他地方缓存祖先，但我想不出一种有效的方法。我考虑添加一个带有逗号分隔的祖先 id 列表的字段，然后在 extra 中执行 GROUP_CONCAT （我在 MySQL 中），但我认为这可能会变得巨大/缓慢。

原文

Does anybody have an efficient algorithm to retrieve all ancestors of an mptt queryset? The best I could think of so far is something like this:

def qs_ancestors(queryset):
    if isinstance(queryset, EmptyQuerySet):
        return queryset
    queryset_aggs = queryset.values_list('tree_id', 'level').annotate(max_lft=Max('lft'), min_rght=Min('rght'))
    new_queryset = queryset.none()
    for tree_id, level, max_lft, min_rght in queryset_aggs:
        ancestors = MyModel.objects.filter(
           tree_id=tree_id,
           level__lt=level, 
           lft__lte=max_lft,
           rght__gte=min_rght,
        )
        new_queryset = ancestors | new_queryset
    return new_queryset

There are two problems with this approach:

It fails if there are branches that aren't next to each other (ie it doesn't really work)
It is highly inefficient because it ends up have number_of_trees*number_of_levels clauses in the final query, which can get very large very fast

I am open to caching the ancestors somewhere else, but I cannot think of a way to do efficiently. I considered adding a field with a comma separated list of ancestor's ids and then doing a GROUP_CONCAT (I am in MySQL) inside an extra, but I think that could get huge/slow.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

清引 2024-11-24 01:55:10

我曾经不得不编写一个类似的算法。我有一个显示 MPTT 树的视图，它是一棵非常大的树，因此我无法在 HTML 模板中加载它的所有数据。所以我在初始加载时只显示根节点，并使用Ajax加载其他节点。

它运行良好，直到我的老板要求我实现“搜索”选项。搜索必须查找所有节点，并在找到匹配项时分解树。我花了一段时间才弄清楚这一点，但我终于明白了。这是 a 提出的解决方案：

from django.db.models import Q

def get_parents(self, qs):
    tree_list = {}
    query = Q()
    for node in qs:
        if node.tree_id not in tree_list:
            tree_list[node.tree_id] = []

        parent =  node.parent.pk if node.parent is not None else None,

        if parent not in tree_list[node.tree_id]:
            tree_list[node.tree_id].append(parent)

            query |= Q(lft__lt=node.lft, rght__gt=node.rght, tree_id=node.tree_id)

    return YourModel.objects.filter(query)

它只需要运行两个查询，作为参数传递的初始 qs 和函数返回的最终查询集。 tree_list 是一个字典，用于存储已添加到查询集中的节点，这是一种优化，对于算法的工作来说并不是必需的。但由于我正在处理一棵相对较大的树，所以我必须将其包括在内。

我想你可以将此方法变成一个管理器并使其更加通用，即使其适用于任何 MPTT 模型，而不仅仅是 YourModel

I had to write a similar algorithm once. I had a view displaying a MPTT tree, it's was a very large tree so I couldn't load all of it's data in the HTML template. So I displayed only the root nodes in the initial load and used Ajax to load the other nodes.

It was working good until my boss asked me to implement a 'search' option. The search had to look in all nodes and explode the tree were it found a match. It took me a while to figure this out, but I finally got it. Here is the solution a came up with:

from django.db.models import Q

def get_parents(self, qs):
    tree_list = {}
    query = Q()
    for node in qs:
        if node.tree_id not in tree_list:
            tree_list[node.tree_id] = []

        parent =  node.parent.pk if node.parent is not None else None,

        if parent not in tree_list[node.tree_id]:
            tree_list[node.tree_id].append(parent)

            query |= Q(lft__lt=node.lft, rght__gt=node.rght, tree_id=node.tree_id)

    return YourModel.objects.filter(query)

It needs only two queries to run, the initial qs passed as an argument and the final queryset returned by the function. The tree_list is a dictionary that stores nodes that were already added to the queryset, it's an optimization and it's not necessary for the algorithm to work. But since I was working with a relative big tree I had to include it.

I guess you could turn this method into a manager and make it more generic, i.e. make it work for any MPTT model and not only YourModel

回复收藏 0 原文

梦幻的心爱 2024-11-24 01:55:10

怎么样：

def qs_ancestors(queryset):
    if isinstance(queryset, EmptyQuerySet):
        return queryset
    new_queryset = queryset.none()
    for obj in queryset:
        new_queryset = new_queryset | obj.get_ancestors()
return new_queryset

仍然是 len(queryset) 子句。您可以通过执行预处理过程来减少子句的数量，该预处理过程会删除查询集中其他对象的祖先对象，例如：

min_obj_set = []
for obj in queryset.order_by('tree_id', '-level'):
    for obj2 in min_obj_set:
        if obj.is_ancestor_of(obj2):
            break
    else:
        min_obj_set.append(obj)

虽然上面的代码片段只是一个示例，但如果满足以下条件，您可能需要使用 BST：您的查询集包含大量对象。

不过，您必须测试与较大的数据库查询相比，这是否会提高速度。

How about:

def qs_ancestors(queryset):
    if isinstance(queryset, EmptyQuerySet):
        return queryset
    new_queryset = queryset.none()
    for obj in queryset:
        new_queryset = new_queryset | obj.get_ancestors()
return new_queryset

It's still len(queryset) clauses. You could potentially reduce the number of clauses a bit by doing a preprocess pass that removes objects that are ancestors of other objects in the queryset, something like:

min_obj_set = []
for obj in queryset.order_by('tree_id', '-level'):
    for obj2 in min_obj_set:
        if obj.is_ancestor_of(obj2):
            break
    else:
        min_obj_set.append(obj)

Although the above snippet is only an example, you'll probably want to use a BST if your querset contains a significant amount of objects.

You'll have to test if this yeilds an increase in speed vs. the larger DB query, though.

回复收藏 0 原文

~没有更多了~