Django注释+过滤

发布于 2025-02-04 18:08:46 字数 2537 浏览 1 评论 0原文

我已经使用pgloader从sqlite到psql加载了一些数据,它或多或少都可以工作,但某些类型有点笨拙,所以这就是为什么我的日期在字符串中,但我认为这不是问题(可能在这里错了)

考虑以下模型:

class User(models.Model):
    hash_id = models.TextField(primary_key=True)
    first_name = models.TextField(blank=True, null=True)
    last_name = models.TextField(blank=True, null=True)
class Education(models.Model):
    hash_code = models.ForeignKey('Users', models.CASCADE)
    startdate = models.TextField(blank=True, null=True)
    enddate = models.TextField(blank=True, null=True)
class Job(models.Model):
    hash_code = models.ForeignKey('Users', models.CASCADE)
    jobends = models.TextField(blank=True, null=True)
    jobstarts = models.TextField(blank=True, null=True)

我正在尝试从用户的第一次教育中获得X年后的所有工作。 到目前为止

# This should return a date eg: '2000-01-01'
mba_start_subq = Educations.objects.filter(hash_code=OuterRef('hash_code'))
        .order_by('startdate')
        .values('startdate')[:1]
    )

jobs = (Job.objects.all()
        # .distinct('hash_code')
        .order_by('hash_code')
        .annotate(mba_start = Subquery(mba_start_subq))
       )

​ 获得响应(基本上是无限的循环)

       # Filtering with date strings works
        jobs.filter(
            Q(jobstarts__lt = '2000-01-01'),
            Q(jobends__gt = '2002-01-01')
        )

       # this is the desired functionality, that doesn't work
        jobs.filter(
            Q(jobstarts__lt = Subquery(mba_start_subq)),
            Q(jobends__gt = Subquery(mba_start_subq))
        )

我也尝试在.filter()注释到QuerySet之后,在.filter()中使用F(Annotated_value) 服务器的任何响应都需要年龄。

我想念什么?我获得教育开始日期的方式是错误的?有更有效的方法吗?

更新:这是我想在Django中实现的工作SQL查询

 SELECT 
    Organizations.id, 
    Organizations.hash_code, 
    educations.EducationDegree,
    Organizations.Role,
    Organizations.Industry, 
    Organizations.JObStarts, 
    Educations.StartDate, 
    Organizations.JobEnds 
 FROM educations
 INNER JOIN Organizations ON 
 Organizations.hash_code=Educations.hash_code
 WHERE (date(Educations.StartDate,'+1 year') 
 BETWEEN Organizations.JObStarts AND Organizations.JobEnds ) 
 GROUP BY Organizations.hash_code

I've some data that I've loaded from SQLite to PSQL using pgloader, it works more or less but some types are a bit clunky, so that's why my dates are in string, but I don't think that's the issue ( might be wrong here)

Consider the following models:

class User(models.Model):
    hash_id = models.TextField(primary_key=True)
    first_name = models.TextField(blank=True, null=True)
    last_name = models.TextField(blank=True, null=True)
class Education(models.Model):
    hash_code = models.ForeignKey('Users', models.CASCADE)
    startdate = models.TextField(blank=True, null=True)
    enddate = models.TextField(blank=True, null=True)
class Job(models.Model):
    hash_code = models.ForeignKey('Users', models.CASCADE)
    jobends = models.TextField(blank=True, null=True)
    jobstarts = models.TextField(blank=True, null=True)

I'm trying to get all the jobs after X years from the user's first Education.
So far I've got a Subquery, to get the first education's start date for each user:

# This should return a date eg: '2000-01-01'
mba_start_subq = Educations.objects.filter(hash_code=OuterRef('hash_code'))
        .order_by('startdate')
        .values('startdate')[:1]
    )

Then I append this to each Job via:

jobs = (Job.objects.all()
        # .distinct('hash_code')
        .order_by('hash_code')
        .annotate(mba_start = Subquery(mba_start_subq))
       )

So far so good, the issue is when I try to add a .filter() afterwards it takes ages to get the response (basically an infinite loop kind of thing)

       # Filtering with date strings works
        jobs.filter(
            Q(jobstarts__lt = '2000-01-01'),
            Q(jobends__gt = '2002-01-01')
        )

       # this is the desired functionality, that doesn't work
        jobs.filter(
            Q(jobstarts__lt = Subquery(mba_start_subq)),
            Q(jobends__gt = Subquery(mba_start_subq))
        )

I've also tried to use F(annotated_value) in the .filter() after I've annotated it to the queryset, but no luck, I don't get any response from the server at all, takes ages.

What did I miss? the way I'm getting the education start date is wrong? Is there a more efficient way?

UPDATE: Here is a working SQL query that I would like to achieve in Django

 SELECT 
    Organizations.id, 
    Organizations.hash_code, 
    educations.EducationDegree,
    Organizations.Role,
    Organizations.Industry, 
    Organizations.JObStarts, 
    Educations.StartDate, 
    Organizations.JobEnds 
 FROM educations
 INNER JOIN Organizations ON 
 Organizations.hash_code=Educations.hash_code
 WHERE (date(Educations.StartDate,'+1 year') 
 BETWEEN Organizations.JObStarts AND Organizations.JobEnds ) 
 GROUP BY Organizations.hash_code

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

冷…雨湿花 2025-02-11 18:08:46

首先,我稍微修改了您的教育和工作模型,以使用Django用户,而不是要添加的额外模型。新的模型定义是:

class Education(models.Model):
    hash_code = models.ForeignKey(settings.AUTH_USER_MODEL, models.CASCADE)
    startdate = models.TextField(blank=True, null=True)
    enddate = models.TextField(blank=True, null=True)

class Job(models.Model):
    hash_code = models.ForeignKey(settings.AUTH_USER_MODEL, models.CASCADE)
    jobends = models.TextField(blank=True, null=True)
    jobstarts = models.TextField(blank=True, null=True)

为了匹配您拥有的SQL查询,我将使用以下内容:

subq = Education.objects.filter(hash_code__id=OuterRef('hash_code__id'))
                        .order_by('startdate')

它使用用户对象的ID,而不是整个对象本身进行过滤。在之后,也可以添加值作为子问题的一部分。
然后,最终作业查询可以按以下方式运行:

jobs = Job.objects.filter(jobstarts__lt=Subquery(subq.values('startdate')[:1]),
                          jobends__gt=Subquery(subq.values('startdate'[:1]))
                  .order_by('hash_code__id')

这消除了订购和注释所有作业对象的需求,并应用了一个简单,更有效的过滤器。我希望这会有所帮助。让我知道这是否不是您所追求的。

Firstly, I have modified your Education and Job models slightly to use the Django User rather than the extra model you are adding. The new model definitions are:

class Education(models.Model):
    hash_code = models.ForeignKey(settings.AUTH_USER_MODEL, models.CASCADE)
    startdate = models.TextField(blank=True, null=True)
    enddate = models.TextField(blank=True, null=True)

class Job(models.Model):
    hash_code = models.ForeignKey(settings.AUTH_USER_MODEL, models.CASCADE)
    jobends = models.TextField(blank=True, null=True)
    jobstarts = models.TextField(blank=True, null=True)

Then, in order to match the SQL query you have, I would use the following:

subq = Education.objects.filter(hash_code__id=OuterRef('hash_code__id'))
                        .order_by('startdate')

This uses the ID of the User object rather than the entire object itself for filtering. The values can also be added as part of the sub-query after.
The final jobs query can then be run as follows:

jobs = Job.objects.filter(jobstarts__lt=Subquery(subq.values('startdate')[:1]),
                          jobends__gt=Subquery(subq.values('startdate'[:1]))
                  .order_by('hash_code__id')

This removes the need to order and annotate all Jobs objects and applies a simple, more efficient filter instead. I hope this helps. Let me know if it isn't quite what you are after.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文