用于检索 mptt 查询集祖先查询集的高效函数

发布于 2024-11-17 01:55:10 字数 895 浏览 7 评论 0原文

有人有一个有效的算法来检索 mptt 查询集的所有祖先吗?到目前为止我能想到的最好的方法是这样的:

def qs_ancestors(queryset):
    if isinstance(queryset, EmptyQuerySet):
        return queryset
    queryset_aggs = queryset.values_list('tree_id', 'level').annotate(max_lft=Max('lft'), min_rght=Min('rght'))
    new_queryset = queryset.none()
    for tree_id, level, max_lft, min_rght in queryset_aggs:
        ancestors = MyModel.objects.filter(
           tree_id=tree_id,
           level__lt=level, 
           lft__lte=max_lft,
           rght__gte=min_rght,
        )
        new_queryset = ancestors | new_queryset
    return new_queryset

这种方法有两个问题:

  1. 如果存在彼此不相邻的分支(即它实际上不起作用),它
  2. 就会失败它的效率非常低,因为它最终查询中会有 number_of_trees*number_of_levels 子句,这些子句可能会很快变得非常大,

我愿意在其他地方缓存祖先,但我想不出一种有效的方法。我考虑添加一个带有逗号分隔的祖先 id 列表的字段,然后在 extra 中执行 GROUP_CONCAT (我在 MySQL 中),但我认为这可能会变得巨大/缓慢。

Does anybody have an efficient algorithm to retrieve all ancestors of an mptt queryset? The best I could think of so far is something like this:

def qs_ancestors(queryset):
    if isinstance(queryset, EmptyQuerySet):
        return queryset
    queryset_aggs = queryset.values_list('tree_id', 'level').annotate(max_lft=Max('lft'), min_rght=Min('rght'))
    new_queryset = queryset.none()
    for tree_id, level, max_lft, min_rght in queryset_aggs:
        ancestors = MyModel.objects.filter(
           tree_id=tree_id,
           level__lt=level, 
           lft__lte=max_lft,
           rght__gte=min_rght,
        )
        new_queryset = ancestors | new_queryset
    return new_queryset

There are two problems with this approach:

  1. It fails if there are branches that aren't next to each other (ie it doesn't really work)
  2. It is highly inefficient because it ends up have number_of_trees*number_of_levels clauses in the final query, which can get very large very fast

I am open to caching the ancestors somewhere else, but I cannot think of a way to do efficiently. I considered adding a field with a comma separated list of ancestor's ids and then doing a GROUP_CONCAT (I am in MySQL) inside an extra, but I think that could get huge/slow.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

清引 2024-11-24 01:55:10

我曾经不得不编写一个类似的算法。我有一个显示 MPTT 树的视图,它是一棵非常大的树,因此我无法在 HTML 模板中加载它的所有数据。所以我在初始加载时只显示根节点,并使用Ajax加载其他节点。

它运行良好,直到我的老板要求我实现“搜索”选项。搜索必须查找所有节点,并在找到匹配项时分解树。我花了一段时间才弄清楚这一点,但我终于明白了。这是 a 提出的解决方案:

from django.db.models import Q

def get_parents(self, qs):
    tree_list = {}
    query = Q()
    for node in qs:
        if node.tree_id not in tree_list:
            tree_list[node.tree_id] = []

        parent =  node.parent.pk if node.parent is not None else None,

        if parent not in tree_list[node.tree_id]:
            tree_list[node.tree_id].append(parent)

            query |= Q(lft__lt=node.lft, rght__gt=node.rght, tree_id=node.tree_id)

    return YourModel.objects.filter(query)

它只需要运行两个查询,作为参数传递的初始 qs 和函数返回的最终查询集。 tree_list 是一个字典,用于存储已添加到查询集中的节点,这是一种优化,对于算法的工作来说并不是必需的。但由于我正在处理一棵相对较大的树,所以我必须将其包括在内。

我想你可以将此方法变成一个管理器并使其更加通用,即使其适用于任何 MPTT 模型,而不仅仅是 YourModel

I had to write a similar algorithm once. I had a view displaying a MPTT tree, it's was a very large tree so I couldn't load all of it's data in the HTML template. So I displayed only the root nodes in the initial load and used Ajax to load the other nodes.

It was working good until my boss asked me to implement a 'search' option. The search had to look in all nodes and explode the tree were it found a match. It took me a while to figure this out, but I finally got it. Here is the solution a came up with:

from django.db.models import Q

def get_parents(self, qs):
    tree_list = {}
    query = Q()
    for node in qs:
        if node.tree_id not in tree_list:
            tree_list[node.tree_id] = []

        parent =  node.parent.pk if node.parent is not None else None,

        if parent not in tree_list[node.tree_id]:
            tree_list[node.tree_id].append(parent)

            query |= Q(lft__lt=node.lft, rght__gt=node.rght, tree_id=node.tree_id)

    return YourModel.objects.filter(query)

It needs only two queries to run, the initial qs passed as an argument and the final queryset returned by the function. The tree_list is a dictionary that stores nodes that were already added to the queryset, it's an optimization and it's not necessary for the algorithm to work. But since I was working with a relative big tree I had to include it.

I guess you could turn this method into a manager and make it more generic, i.e. make it work for any MPTT model and not only YourModel

梦幻的心爱 2024-11-24 01:55:10

怎么样:

def qs_ancestors(queryset):
    if isinstance(queryset, EmptyQuerySet):
        return queryset
    new_queryset = queryset.none()
    for obj in queryset:
        new_queryset = new_queryset | obj.get_ancestors()
return new_queryset

仍然是 len(queryset) 子句。您可以通过执行预处理过程来减少子句的数量,该预处理过程会删除查询集中其他对象的祖先对象,例如:

min_obj_set = []
for obj in queryset.order_by('tree_id', '-level'):
    for obj2 in min_obj_set:
        if obj.is_ancestor_of(obj2):
            break
    else:
        min_obj_set.append(obj)

虽然上面的代码片段只是一个示例,但如果满足以下条件,您可能需要使用 BST:您的查询集包含大量对象。

不过,您必须测试与较大的数据库查询相比,这是否会提高速度。

How about:

def qs_ancestors(queryset):
    if isinstance(queryset, EmptyQuerySet):
        return queryset
    new_queryset = queryset.none()
    for obj in queryset:
        new_queryset = new_queryset | obj.get_ancestors()
return new_queryset

It's still len(queryset) clauses. You could potentially reduce the number of clauses a bit by doing a preprocess pass that removes objects that are ancestors of other objects in the queryset, something like:

min_obj_set = []
for obj in queryset.order_by('tree_id', '-level'):
    for obj2 in min_obj_set:
        if obj.is_ancestor_of(obj2):
            break
    else:
        min_obj_set.append(obj)

Although the above snippet is only an example, you'll probably want to use a BST if your querset contains a significant amount of objects.

You'll have to test if this yeilds an increase in speed vs. the larger DB query, though.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文