Caching and re-evaluation of querysets
I'm going to post some incomplete code to make the example simple. I'm running a recursive function to compute some metrics on a hierarchical structure.
class Category(models.Model):
    parent = models.ForeignKey('self', null=True, blank=True, related_name='children', default=1)

    def compute_metrics(self, shop_object, metric_queryset=None, rating_queryset=None):
        # Build the querysets only on the top-level call; recursive calls reuse them.
        if metric_queryset is None:
            metric_queryset = Metric.objects.all()
        if rating_queryset is None:
            rating_queryset = Rating.objects.filter(shop_object=shop_object)
        for child in self.children.all():
            # do stuff
            child_score = child.compute_metrics(shop_object, metric_queryset, rating_queryset)
        metrics_in_cat = metric_queryset.filter(category=self)
        for metric in metrics_in_cat:
            # do stuff
            pass
I hope that's enough code to see what's going on. What I'm after here is a recursive function that is only going to run those queries once each, then pass the results down. That doesn't seem to be happening right now and it's killing performance. Were this PHP/MySQL (as much as I dislike them after working with Django!) I could just run the queries once and pass them down.
From what I understand of Django's querysets, they aren't going to be evaluated in my "if queryset == None then queryset=stuff" part. How can I force this? Will it be re-evaluated when I do things like metric_queryset.filter(category=self)?
I don't care about data freshness. I just want to read from the DB once each for metrics and ratings, then filter on them later without hitting the DB again. It's a frustrating problem that feels like it should have a very simple answer. Pickling looks like it could work, but it's not very well explained in the Django documentation.
Comments (1)
I think the problem here is that you are not evaluating the queryset until after your recursive call. Querysets are lazy, and calling .filter() on one builds a new queryset that runs its own query when iterated, so passing the unevaluated queryset down does not save any database hits. If you use list() to force the evaluation of the queryset, then it should only hit the database once. Note that you will have to change the metrics_in_cat line to a Python-level filter rather than using queryset filters.
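As a rough illustration of the Python-level filter suggested above, here is a minimal sketch that runs without Django. The Metric and Category dataclasses are hypothetical stand-ins for the real models, the weight field and the summing logic are invented placeholders for "do stuff", and group_by_category is a helper name made up for this example. In a real project the metrics list would presumably come from a single evaluated query such as list(Metric.objects.all()).

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Metric:
    # Hypothetical stand-in for a Metric model row; category_id mirrors
    # the raw foreign-key column Django exposes on model instances.
    name: str
    category_id: int
    weight: float  # invented field, used only to have something to compute


@dataclass
class Category:
    # Hypothetical stand-in for the Category model.
    id: int
    children: list = field(default_factory=list)


def group_by_category(metrics):
    """Index already-fetched rows by category id in a single pass."""
    grouped = defaultdict(list)
    for metric in metrics:
        grouped[metric.category_id].append(metric)
    return grouped


def compute_metrics(category, metrics_by_category):
    """Sum metric weights for a category and all of its descendants."""
    # Dictionary lookup replaces metric_queryset.filter(category=self),
    # so no new query is issued per category.
    total = sum(m.weight for m in metrics_by_category.get(category.id, []))
    for child in category.children:
        total += compute_metrics(child, metrics_by_category)
    return total


# Assumption: in Django this list would be list(Metric.objects.all()).
metrics = [
    Metric("price", 1, 2.0),
    Metric("service", 2, 3.0),
    Metric("speed", 2, 0.5),
]
root = Category(1, children=[Category(2), Category(3)])
total = compute_metrics(root, group_by_category(metrics))
print(total)
```

Grouping once into a dictionary avoids rescanning the full list at every level of the recursion. Note that in the original code self.children.all() would still issue one query per category, so the children may need the same treatment if that cost matters.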