Django注释+过滤
我已经使用pgloader从sqlite到psql加载了一些数据,它或多或少都可以工作,但某些类型有点笨拙,所以这就是为什么我的日期在字符串中,但我认为这不是问题(可能在这里错了)
考虑以下模型:
class User(models.Model):
hash_id = models.TextField(primary_key=True)
first_name = models.TextField(blank=True, null=True)
last_name = models.TextField(blank=True, null=True)
class Education(models.Model):
hash_code = models.ForeignKey('Users', models.CASCADE)
startdate = models.TextField(blank=True, null=True)
enddate = models.TextField(blank=True, null=True)
class Job(models.Model):
hash_code = models.ForeignKey('Users', models.CASCADE)
jobends = models.TextField(blank=True, null=True)
jobstarts = models.TextField(blank=True, null=True)
我正在尝试从用户的第一次教育中获得X年后的所有工作。 到目前为止
# This should return a date eg: '2000-01-01'
mba_start_subq = Educations.objects.filter(hash_code=OuterRef('hash_code'))
.order_by('startdate')
.values('startdate')[:1]
)
jobs = (Job.objects.all()
# .distinct('hash_code')
.order_by('hash_code')
.annotate(mba_start = Subquery(mba_start_subq))
)
获得响应(基本上是无限的循环)
# Filtering with date strings works
jobs.filter(
Q(jobstarts__lt = '2000-01-01'),
Q(jobends__gt = '2002-01-01')
)
# this is the desired functionality, that doesn't work
jobs.filter(
Q(jobstarts__lt = Subquery(mba_start_subq)),
Q(jobends__gt = Subquery(mba_start_subq))
)
我也尝试在.filter()注释到QuerySet之后,在.filter()中使用F(Annotated_value) 服务器的任何响应都需要年龄。
我想念什么?我获得教育开始日期的方式是错误的?有更有效的方法吗?
更新:这是我想在Django中实现的工作SQL查询
SELECT
Organizations.id,
Organizations.hash_code,
educations.EducationDegree,
Organizations.Role,
Organizations.Industry,
Organizations.JObStarts,
Educations.StartDate,
Organizations.JobEnds
FROM educations
INNER JOIN Organizations ON
Organizations.hash_code=Educations.hash_code
WHERE (date(Educations.StartDate,'+1 year')
BETWEEN Organizations.JObStarts AND Organizations.JobEnds )
GROUP BY Organizations.hash_code
I've some data that I've loaded from SQLite to PSQL using pgloader, it works more or less but some types are a bit clunky, so that's why my dates are in string, but I don't think that's the issue ( might be wrong here)
Consider the following models:
class User(models.Model):
hash_id = models.TextField(primary_key=True)
first_name = models.TextField(blank=True, null=True)
last_name = models.TextField(blank=True, null=True)
class Education(models.Model):
hash_code = models.ForeignKey('Users', models.CASCADE)
startdate = models.TextField(blank=True, null=True)
enddate = models.TextField(blank=True, null=True)
class Job(models.Model):
hash_code = models.ForeignKey('Users', models.CASCADE)
jobends = models.TextField(blank=True, null=True)
jobstarts = models.TextField(blank=True, null=True)
I'm trying to get all the jobs after X years from the user's first Education.
So far I've got a Subquery, to get the first education's start date for each user:
# This should return a date eg: '2000-01-01'
mba_start_subq = Educations.objects.filter(hash_code=OuterRef('hash_code'))
.order_by('startdate')
.values('startdate')[:1]
)
Then I append this to each Job via:
jobs = (Job.objects.all()
# .distinct('hash_code')
.order_by('hash_code')
.annotate(mba_start = Subquery(mba_start_subq))
)
So far so good, the issue is when I try to add a .filter() afterwards it takes ages to get the response (basically an infinite loop kind of thing)
# Filtering with date strings works
jobs.filter(
Q(jobstarts__lt = '2000-01-01'),
Q(jobends__gt = '2002-01-01')
)
# this is the desired functionality, that doesn't work
jobs.filter(
Q(jobstarts__lt = Subquery(mba_start_subq)),
Q(jobends__gt = Subquery(mba_start_subq))
)
I've also tried to use F(annotated_value) in the .filter() after I've annotated it to the queryset, but no luck, I don't get any response from the server at all, takes ages.
What did I miss? the way I'm getting the education start date is wrong? Is there a more efficient way?
UPDATE: Here is a working SQL query that I would like to achieve in Django
SELECT
Organizations.id,
Organizations.hash_code,
educations.EducationDegree,
Organizations.Role,
Organizations.Industry,
Organizations.JObStarts,
Educations.StartDate,
Organizations.JobEnds
FROM educations
INNER JOIN Organizations ON
Organizations.hash_code=Educations.hash_code
WHERE (date(Educations.StartDate,'+1 year')
BETWEEN Organizations.JObStarts AND Organizations.JobEnds )
GROUP BY Organizations.hash_code
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
首先,我稍微修改了您的教育和工作模型,以使用Django用户,而不是要添加的额外模型。新的模型定义是:
为了匹配您拥有的SQL查询,我将使用以下内容:
它使用用户对象的ID,而不是整个对象本身进行过滤。在之后,也可以添加值作为子问题的一部分。
然后,最终作业查询可以按以下方式运行:
这消除了订购和注释所有作业对象的需求,并应用了一个简单,更有效的过滤器。我希望这会有所帮助。让我知道这是否不是您所追求的。
Firstly, I have modified your Education and Job models slightly to use the Django User rather than the extra model you are adding. The new model definitions are:
Then, in order to match the SQL query you have, I would use the following:
This uses the ID of the User object rather than the entire object itself for filtering. The values can also be added as part of the sub-query after.
The final jobs query can then be run as follows:
This removes the need to order and annotate all Jobs objects and applies a simple, more efficient filter instead. I hope this helps. Let me know if it isn't quite what you are after.