SQLAlchemy - Limit the joined results
Models:
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class Team(Base):
    __tablename__ = "team"  # table names assumed from the ForeignKey targets
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    players = relationship("Player", backref="team")

class Player(Base):
    __tablename__ = "player"
    id = Column(Integer, primary_key=True)
    name = Column(String(255), unique=True)
    team_id = Column(Integer, ForeignKey("team.id"))
    positions = relationship("Position", backref="player")

class Position(Base):
    __tablename__ = "position"
    id = Column(Integer(), primary_key=True)
    name = Column(String(255), unique=True)
    player_id = Column(Integer, ForeignKey("player.id"))
    goals = relationship("Goal", backref="position")

class Goal(Base):
    __tablename__ = "goal"
    id = Column(Integer(), primary_key=True)
    distance = Column(Integer)
    position_id = Column(Integer, ForeignKey("position.id"))
# Query to get all goals of all players of a team
query = (
    select(Team)
    .select_from(Player, Position, Goal)
    .options(joinedload(Team.players))
    .options(
        joinedload(
            Team.players,
            Player.positions,
        )
    )
    .options(
        joinedload(
            Team.players,
            Player.positions,
            Position.goals,
        )
    )
)
result = await db.execute(query)
response = result.scalar()
Sample JSON output from the above query:
{
  "id": 3,
  "players": [
    {
      "id": 3,
      "positions": []
    },
    {
      "id": 5,
      "positions": [
        {
          "id": 7,
          "goals": [
            {"id": 13}
          ]
        }
      ]
    },
    {
      "id": 1,
      "positions": [
        {
          "id": 1,
          "goals": [
            {"id": 16},
            {"id": 15},
            {"id": 14}
          ]
        },
        {
          "id": 2,
          "goals": [
            {"id": 4}
          ]
        }
      ]
    }
  ]
}
From the sample JSON, we can clearly see that multiple goals are returned for player id=1.
Now I need to limit the query to the last goal of each player, instead of all goals of that player.
So I tried:
subquery = (
    select(Goal)
    .order_by(Goal.id.desc())
    .limit(1)
    .subquery()
    .lateral()
)
query = (
    select(Team)
    .select_from(Player, Position, Goal)
    .options(joinedload(Team.players))
    .options(
        joinedload(
            Team.players,
            Player.positions,
        )
    )
    .outerjoin(subquery)
    .options(
        contains_eager(
            Team.players,
            Player.positions,
            Position.goals,
            alias=subquery,
        )
    )
)
result = await db.execute(query)
response = result.scalar()
Sample JSON output from the above query:
{
  "id": 3,
  "players": [
    {
      "id": 3,
      "positions": []
    },
    {
      "id": 5,
      "positions": [
        {
          "id": 7,
          "goals": [
            {"id": 16}
          ]
        }
      ]
    },
    {
      "id": 1,
      "positions": [
        {
          "id": 1,
          "goals": [
            {"id": 16}
          ]
        },
        {
          "id": 2,
          "goals": [
            {"id": 16}
          ]
        }
      ]
    }
  ]
}
This fetches the last goal of any player, not the last goal of the respective player.
Filters like Goal.position_id == Position.id in outerjoin or subquery either do not work or result in an error.
Edit:
It looks like I need populate_existing(), but it's not available in the new select method.
Edit 2:
To simplify these queries, I am also thinking of creating a last_goal_id column in the position table and updating the position table to store the id of the last inserted goal. Is it normal for two tables to have foreign keys to each other? goal would have position_id and position would have last_goal_id.
Comments (3)
Preamble
First, I believe that the line .select_from(Player, Position, Goal) should not be part of the query, as it will create a cartesian product. Look for SQLAlchemy warnings when executing the query.
Second, you can simplify your original query somewhat. Below produces a query equivalent to the one in your question:
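The simplification could look like the sketch below, which chains the loader options instead of repeating the path in separate .options() calls. This is not the answerer's original snippet; the in-memory SQLite engine, the table names, and the seed data (modelled on the sample JSON in the question) are assumptions made only so the example runs on its own.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, select
from sqlalchemy.orm import Session, declarative_base, joinedload, relationship

Base = declarative_base()


class Team(Base):
    __tablename__ = "team"  # table names are assumptions
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    players = relationship("Player", backref="team")


class Player(Base):
    __tablename__ = "player"
    id = Column(Integer, primary_key=True)
    team_id = Column(Integer, ForeignKey("team.id"))
    positions = relationship("Position", backref="player")


class Position(Base):
    __tablename__ = "position"
    id = Column(Integer, primary_key=True)
    player_id = Column(Integer, ForeignKey("player.id"))
    goals = relationship("Goal", backref="position")


class Goal(Base):
    __tablename__ = "goal"
    id = Column(Integer, primary_key=True)
    position_id = Column(Integer, ForeignKey("position.id"))


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

# Seed data mirroring the sample JSON from the question.
with Session(engine) as session:
    session.add(Team(id=3, name="demo", players=[
        Player(id=3),
        Player(id=5, positions=[Position(id=7, goals=[Goal(id=13)])]),
        Player(id=1, positions=[
            Position(id=1, goals=[Goal(id=14), Goal(id=15), Goal(id=16)]),
            Position(id=2, goals=[Goal(id=4)]),
        ]),
    ]))
    session.commit()

# One chained loader option and no select_from(): Team -> players -> positions -> goals.
query = select(Team).options(
    joinedload(Team.players)
    .joinedload(Player.positions)
    .joinedload(Position.goals)
)

with Session(engine) as session:
    # unique() collapses the duplicate Team rows produced by the joined collections.
    team = session.execute(query).unique().scalars().one()
    player_ids = sorted(player.id for player in team.players)
    goals_by_position = {
        position.id: sorted(goal.id for goal in position.goals)
        for player in team.players
        for position in player.positions
    }

print(player_ids, goals_by_position)
```

This still loads every goal of every position; it only removes the redundant options and the extra FROM entries.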
contains_eager as an alternative to joinedload
The above query can also be implemented differently by a) joining the related tables explicitly, and b) hinting to SQLAlchemy that the query already contains the desired relationships:
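A minimal sketch of points a) and b), assuming the same models as in the question: the joins are written out with outerjoin(), and contains_eager() tells SQLAlchemy to populate the collections from those joins. The engine, table names, and seed data are illustrative assumptions, not part of the original answer.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, select
from sqlalchemy.orm import Session, contains_eager, declarative_base, relationship

Base = declarative_base()


class Team(Base):
    __tablename__ = "team"  # table names are assumptions
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    players = relationship("Player", backref="team")


class Player(Base):
    __tablename__ = "player"
    id = Column(Integer, primary_key=True)
    team_id = Column(Integer, ForeignKey("team.id"))
    positions = relationship("Position", backref="player")


class Position(Base):
    __tablename__ = "position"
    id = Column(Integer, primary_key=True)
    player_id = Column(Integer, ForeignKey("player.id"))
    goals = relationship("Goal", backref="position")


class Goal(Base):
    __tablename__ = "goal"
    id = Column(Integer, primary_key=True)
    position_id = Column(Integer, ForeignKey("position.id"))


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Team(id=3, name="demo", players=[
        Player(id=3),
        Player(id=5, positions=[Position(id=7, goals=[Goal(id=13)])]),
        Player(id=1, positions=[
            Position(id=1, goals=[Goal(id=14), Goal(id=15), Goal(id=16)]),
            Position(id=2, goals=[Goal(id=4)]),
        ]),
    ]))
    session.commit()

query = (
    select(Team)
    # a) join the related tables explicitly (outer joins keep players without positions)
    .outerjoin(Team.players)
    .outerjoin(Player.positions)
    .outerjoin(Position.goals)
    # b) tell SQLAlchemy the joined rows already contain the relationship data
    .options(
        contains_eager(Team.players)
        .contains_eager(Player.positions)
        .contains_eager(Position.goals)
    )
)

with Session(engine) as session:
    team = session.execute(query).unique().scalars().one()
    player_ids = sorted(player.id for player in team.players)
    goals_by_position = {
        position.id: sorted(goal.id for goal in position.goals)
        for player in team.players
        for position in player.positions
    }

print(player_ids, goals_by_position)
```

The benefit of this form is that the join conditions are now under the query author's control, which is what makes it possible to restrict the goals that get loaded.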
SOLUTION:
Given that we can now be more explicit on the relationship join conditions, one way of implementing the query would be below:
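One way to express such a join condition is sketched below: the outer join to Goal carries an extra criterion that keeps only the row whose id equals the maximum goal id for that position. This is a reconstruction under assumptions (the correlated MAX subquery, the aliased name last_goal, the in-memory engine, and the seed data are mine), not necessarily the exact query from the answer.

```python
from sqlalchemy import (
    Column, ForeignKey, Integer, String, and_, create_engine, func, select,
)
from sqlalchemy.orm import (
    Session, aliased, contains_eager, declarative_base, relationship,
)

Base = declarative_base()


class Team(Base):
    __tablename__ = "team"  # table names are assumptions
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    players = relationship("Player", backref="team")


class Player(Base):
    __tablename__ = "player"
    id = Column(Integer, primary_key=True)
    team_id = Column(Integer, ForeignKey("team.id"))
    positions = relationship("Position", backref="player")


class Position(Base):
    __tablename__ = "position"
    id = Column(Integer, primary_key=True)
    player_id = Column(Integer, ForeignKey("player.id"))
    goals = relationship("Goal", backref="position")


class Goal(Base):
    __tablename__ = "goal"
    id = Column(Integer, primary_key=True)
    position_id = Column(Integer, ForeignKey("position.id"))


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Team(id=3, name="demo", players=[
        Player(id=3),
        Player(id=5, positions=[Position(id=7, goals=[Goal(id=13)])]),
        Player(id=1, positions=[
            Position(id=1, goals=[Goal(id=14), Goal(id=15), Goal(id=16)]),
            Position(id=2, goals=[Goal(id=4)]),
        ]),
    ]))
    session.commit()

# Correlated scalar subquery: the highest goal id for the position of the outer row.
last_goal = aliased(Goal)  # hypothetical alias name
last_goal_id = (
    select(func.max(last_goal.id))
    .where(last_goal.position_id == Position.id)
    .correlate(Position)
    .scalar_subquery()
)

query = (
    select(Team)
    .outerjoin(Team.players)
    .outerjoin(Player.positions)
    # Join only the goal row that is the last one for its position.
    .outerjoin(Goal, and_(Goal.position_id == Position.id, Goal.id == last_goal_id))
    .options(
        contains_eager(Team.players)
        .contains_eager(Player.positions)
        .contains_eager(Position.goals)
    )
)

with Session(engine) as session:
    team = session.execute(query).unique().scalars().one()
    last_goals = {
        position.id: [goal.id for goal in position.goals]
        for player in team.players
        for position in player.positions
    }

print(last_goals)
```

Note that Position.goals now holds a deliberately partial collection, so it should be treated as read-only for the lifetime of these objects.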
which produces the following SQL (sqlite):

Alternative solutions:
What you could also do is create a hybrid_property to have a computed column which points to the last Goal.id per Position, and use it to define a relationship which will only contain the last Goal in the list:
In this case you could use the following query, but you should not navigate the Position.goals relationship, as it will load the whole list. Also, the name of the JSON key will be different.
NOTE: I personally like this the most as it is clean and explicit.
You could even mix the techniques to get both sides: use the .last_goals relationship, but trick SA into thinking it is the fully loaded .goals:
I think what you want can be achieved by using a DISTINCT ON clause to remove the duplicate rows retrieved from the Goal object:
When using the sample data below this results in:
I think the issue here is that Player 3 is missing, as he has no goal.
You can also turn the query with the DISTINCT ON clause around, which results in:
As we are now starting from Team and going down, the Player without any Goal is also included, resulting in the following data:
Sample data
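As a rough illustration of the DISTINCT ON idea in this answer (not the answerer's original code), the sketch below builds a statement that keeps one Goal row per position and compiles it against the PostgreSQL dialect without connecting to a database. DISTINCT ON is PostgreSQL-specific; using Goal.id as the ordering key for "last" is an assumption, as are the table names.

```python
from sqlalchemy import Column, ForeignKey, Integer, select
from sqlalchemy.dialects import postgresql
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Position(Base):
    __tablename__ = "position"  # table names are assumptions
    id = Column(Integer, primary_key=True)


class Goal(Base):
    __tablename__ = "goal"
    id = Column(Integer, primary_key=True)
    position_id = Column(Integer, ForeignKey("position.id"))


# Passing expressions to distinct() renders DISTINCT ON (...) on PostgreSQL.
# The ORDER BY must start with the DISTINCT ON expression; the descending id
# then decides which row of each position group survives.
stmt = (
    select(Goal)
    .distinct(Goal.position_id)
    .order_by(Goal.position_id, Goal.id.desc())
)

# Compile only, so no PostgreSQL server or driver is needed to inspect the SQL.
sql = str(stmt.compile(dialect=postgresql.dialect()))
print(sql)
```

The resulting statement can be wrapped with .subquery() and joined from Team or Player; starting from Team with outer joins is what keeps players without goals in the result.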
Have a look into using RANK; it may do what you need, though it would require a few queries/subqueries instead of one big joinedload.
I'd have a subquery to rank the goal dates and partition them by the positions or players, and filter it to where rank equals 1. That will get you the latest goal for each position, which you can create a dict for. With your main query, you can then use the position ID to find the latest goal using that dict.
Something like this:
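A minimal sketch of the ranking idea, with some assumptions: the Goal model in the question has no date column, so Goal.id stands in for "latest"; the label names, the in-memory SQLite engine, and the seed data are illustrative. Window functions need SQLite 3.25 or newer.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Position(Base):
    __tablename__ = "position"  # table names are assumptions
    id = Column(Integer, primary_key=True)


class Goal(Base):
    __tablename__ = "goal"
    id = Column(Integer, primary_key=True)
    position_id = Column(Integer, ForeignKey("position.id"))


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([Position(id=1), Position(id=2), Position(id=7)])
    session.add_all([
        Goal(id=14, position_id=1),
        Goal(id=15, position_id=1),
        Goal(id=16, position_id=1),
        Goal(id=4, position_id=2),
        Goal(id=13, position_id=7),
    ])
    session.commit()

# Rank goals inside each position, newest (highest id) first.
ranked = select(
    Goal.id.label("goal_id"),
    Goal.position_id.label("position_id"),
    func.rank()
    .over(partition_by=Goal.position_id, order_by=Goal.id.desc())
    .label("rnk"),
).subquery()

# Keep only the top-ranked goal of every position.
latest = select(ranked.c.position_id, ranked.c.goal_id).where(ranked.c.rnk == 1)

with Session(engine) as session:
    latest_goal_by_position = {
        position_id: goal_id for position_id, goal_id in session.execute(latest)
    }

print(latest_goal_by_position)
```

The dict can then be consulted by position id while serialising the result of the main Team query, which no longer needs to load the goals at all.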
As a heads up btw - I used to attempt joinedload on everything until the author of SQLAlchemy told me that selectinload should be used when possible, because it fetches only the data you need, whereas joins may have a ton of duplicate data (e.g. if your team has 20 players with 5 positions each and 20 goals each, then I think joining it all will result in each team name being sent 2000 times, and each player name being sent 100 times).
Edit:
column_property just came to mind as an alternative solution. Unfortunately I've never been able to figure out how to map the actual Goal model, so this isn't perfect, but here's an example of how you could add the ID of the latest goal directly to the Player model.
From the point of view of a query, it's just treated as another column, so you can select and filter by it.
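The column_property idea might be sketched as follows. The attribute name latest_goal_id, the use of MAX(Goal.id) as "latest", the in-memory SQLite engine, and the seed data are assumptions for illustration; the answer's original snippet is not reproduced here.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, func, select
from sqlalchemy.orm import Session, column_property, declarative_base

Base = declarative_base()


class Player(Base):
    __tablename__ = "player"  # table names are assumptions
    id = Column(Integer, primary_key=True)


class Position(Base):
    __tablename__ = "position"
    id = Column(Integer, primary_key=True)
    player_id = Column(Integer, ForeignKey("player.id"))


class Goal(Base):
    __tablename__ = "goal"
    id = Column(Integer, primary_key=True)
    position_id = Column(Integer, ForeignKey("position.id"))


# Attached after all classes exist because the subquery refers to Goal and Position.
# correlate_except() keeps goal and position inside the subquery and correlates
# everything else (here: player) to the enclosing query.
Player.latest_goal_id = column_property(
    select(func.max(Goal.id))
    .where(Goal.position_id == Position.id, Position.player_id == Player.id)
    .correlate_except(Goal, Position)
    .scalar_subquery()
)

engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([Player(id=1), Player(id=3), Player(id=5)])
    session.add_all([
        Position(id=1, player_id=1),
        Position(id=2, player_id=1),
        Position(id=7, player_id=5),
    ])
    session.add_all([
        Goal(id=14, position_id=1),
        Goal(id=15, position_id=1),
        Goal(id=16, position_id=1),
        Goal(id=4, position_id=2),
        Goal(id=13, position_id=7),
    ])
    session.commit()

with Session(engine) as session:
    # The property can be selected like any other column...
    rows = session.execute(select(Player.id, Player.latest_goal_id)).all()
    latest_by_player = {player_id: goal_id for player_id, goal_id in rows}
    # ...and used in filters.
    scorers = session.execute(
        select(Player.id).where(Player.latest_goal_id == 16)
    ).scalars().all()

print(latest_by_player, scorers)
```

A player without any goal gets None for the property, since MAX over an empty set is NULL.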