Django注释＆＃x2B;过滤

发布于 2025-02-04 18:08:46 字数 2537 浏览 1 评论 0原文

我已经使用pgloader从sqlite到psql加载了一些数据，它或多或少都可以工作，但某些类型有点笨拙，所以这就是为什么我的日期在字符串中，但我认为这不是问题（可能在这里错了）

考虑以下模型：

class User(models.Model):
    hash_id = models.TextField(primary_key=True)
    first_name = models.TextField(blank=True, null=True)
    last_name = models.TextField(blank=True, null=True)

class Education(models.Model):
    hash_code = models.ForeignKey('Users', models.CASCADE)
    startdate = models.TextField(blank=True, null=True)
    enddate = models.TextField(blank=True, null=True)

class Job(models.Model):
    hash_code = models.ForeignKey('Users', models.CASCADE)
    jobends = models.TextField(blank=True, null=True)
    jobstarts = models.TextField(blank=True, null=True)

我正在尝试从用户的第一次教育中获得X年后的所有工作。到目前为止

# This should return a date eg: '2000-01-01'
mba_start_subq = Educations.objects.filter(hash_code=OuterRef('hash_code'))
        .order_by('startdate')
        .values('startdate')[:1]
    )

jobs = (Job.objects.all()
        # .distinct('hash_code')
        .order_by('hash_code')
        .annotate(mba_start = Subquery(mba_start_subq))
       )

获得响应（基本上是无限的循环）

       # Filtering with date strings works
        jobs.filter(
            Q(jobstarts__lt = '2000-01-01'),
            Q(jobends__gt = '2002-01-01')
        )

       # this is the desired functionality, that doesn't work
        jobs.filter(
            Q(jobstarts__lt = Subquery(mba_start_subq)),
            Q(jobends__gt = Subquery(mba_start_subq))
        )

我也尝试在.filter（）注释到QuerySet之后，在.filter（）中使用F（Annotated_value）服务器的任何响应都需要年龄。

我想念什么？我获得教育开始日期的方式是错误的？有更有效的方法吗？

更新：这是我想在Django中实现的工作SQL查询

 SELECT 
    Organizations.id, 
    Organizations.hash_code, 
    educations.EducationDegree,
    Organizations.Role,
    Organizations.Industry, 
    Organizations.JObStarts, 
    Educations.StartDate, 
    Organizations.JobEnds 
 FROM educations
 INNER JOIN Organizations ON 
 Organizations.hash_code=Educations.hash_code
 WHERE (date(Educations.StartDate,'+1 year') 
 BETWEEN Organizations.JObStarts AND Organizations.JobEnds ) 
 GROUP BY Organizations.hash_code

原文

I've some data that I've loaded from SQLite to PSQL using pgloader, it works more or less but some types are a bit clunky, so that's why my dates are in string, but I don't think that's the issue ( might be wrong here)

Consider the following models:

class User(models.Model):
    hash_id = models.TextField(primary_key=True)
    first_name = models.TextField(blank=True, null=True)
    last_name = models.TextField(blank=True, null=True)

class Education(models.Model):
    hash_code = models.ForeignKey('Users', models.CASCADE)
    startdate = models.TextField(blank=True, null=True)
    enddate = models.TextField(blank=True, null=True)

class Job(models.Model):
    hash_code = models.ForeignKey('Users', models.CASCADE)
    jobends = models.TextField(blank=True, null=True)
    jobstarts = models.TextField(blank=True, null=True)

I'm trying to get all the jobs after X years from the user's first Education.
So far I've got a Subquery, to get the first education's start date for each user:

# This should return a date eg: '2000-01-01'
mba_start_subq = Educations.objects.filter(hash_code=OuterRef('hash_code'))
        .order_by('startdate')
        .values('startdate')[:1]
    )

Then I append this to each Job via:

jobs = (Job.objects.all()
        # .distinct('hash_code')
        .order_by('hash_code')
        .annotate(mba_start = Subquery(mba_start_subq))
       )

So far so good, the issue is when I try to add a .filter() afterwards it takes ages to get the response (basically an infinite loop kind of thing)

       # Filtering with date strings works
        jobs.filter(
            Q(jobstarts__lt = '2000-01-01'),
            Q(jobends__gt = '2002-01-01')
        )

       # this is the desired functionality, that doesn't work
        jobs.filter(
            Q(jobstarts__lt = Subquery(mba_start_subq)),
            Q(jobends__gt = Subquery(mba_start_subq))
        )

I've also tried to use F(annotated_value) in the .filter() after I've annotated it to the queryset, but no luck, I don't get any response from the server at all, takes ages.

What did I miss? the way I'm getting the education start date is wrong? Is there a more efficient way?

UPDATE: Here is a working SQL query that I would like to achieve in Django

 SELECT 
    Organizations.id, 
    Organizations.hash_code, 
    educations.EducationDegree,
    Organizations.Role,
    Organizations.Industry, 
    Organizations.JObStarts, 
    Educations.StartDate, 
    Organizations.JobEnds 
 FROM educations
 INNER JOIN Organizations ON 
 Organizations.hash_code=Educations.hash_code
 WHERE (date(Educations.StartDate,'+1 year') 
 BETWEEN Organizations.JObStarts AND Organizations.JobEnds ) 
 GROUP BY Organizations.hash_code

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

冷…雨湿花 2025-02-11 18:08:46

首先，我稍微修改了您的教育和工作模型，以使用Django用户，而不是要添加的额外模型。新的模型定义是：

class Education(models.Model):
    hash_code = models.ForeignKey(settings.AUTH_USER_MODEL, models.CASCADE)
    startdate = models.TextField(blank=True, null=True)
    enddate = models.TextField(blank=True, null=True)

class Job(models.Model):
    hash_code = models.ForeignKey(settings.AUTH_USER_MODEL, models.CASCADE)
    jobends = models.TextField(blank=True, null=True)
    jobstarts = models.TextField(blank=True, null=True)

为了匹配您拥有的SQL查询，我将使用以下内容：

subq = Education.objects.filter(hash_code__id=OuterRef('hash_code__id'))
                        .order_by('startdate')

它使用用户对象的ID，而不是整个对象本身进行过滤。在之后，也可以添加值作为子问题的一部分。
然后，最终作业查询可以按以下方式运行：

jobs = Job.objects.filter(jobstarts__lt=Subquery(subq.values('startdate')[:1]),
                          jobends__gt=Subquery(subq.values('startdate'[:1]))
                  .order_by('hash_code__id')

这消除了订购和注释所有作业对象的需求，并应用了一个简单，更有效的过滤器。我希望这会有所帮助。让我知道这是否不是您所追求的。

Firstly, I have modified your Education and Job models slightly to use the Django User rather than the extra model you are adding. The new model definitions are:

class Education(models.Model):
    hash_code = models.ForeignKey(settings.AUTH_USER_MODEL, models.CASCADE)
    startdate = models.TextField(blank=True, null=True)
    enddate = models.TextField(blank=True, null=True)

class Job(models.Model):
    hash_code = models.ForeignKey(settings.AUTH_USER_MODEL, models.CASCADE)
    jobends = models.TextField(blank=True, null=True)
    jobstarts = models.TextField(blank=True, null=True)

Then, in order to match the SQL query you have, I would use the following:

subq = Education.objects.filter(hash_code__id=OuterRef('hash_code__id'))
                        .order_by('startdate')

This uses the ID of the User object rather than the entire object itself for filtering. The values can also be added as part of the sub-query after.
The final jobs query can then be run as follows:

jobs = Job.objects.filter(jobstarts__lt=Subquery(subq.values('startdate')[:1]),
                          jobends__gt=Subquery(subq.values('startdate'[:1]))
                  .order_by('hash_code__id')

This removes the need to order and annotate all Jobs objects and applies a simple, more efficient filter instead. I hope this helps. Let me know if it isn't quite what you are after.

回复收藏 0 原文

~没有更多了~