在 Postgres 中选择每个用户的每 N 行
我正在使用这个 SQL 语句:
SELECT "dateId", "userId", "Salary"
FROM (
SELECT *,
(row_number() OVER (ORDER BY "userId", "dateId"))%2 AS rn
FROM user_table
) sa
WHERE sa.rn=1
AND "userId" = 789
AND "Salary" > 0;
但是每次表获取新行时,查询的结果都是不同的。
我错过了什么吗?
I was using this SQL statement:
SELECT "dateId", "userId", "Salary"
FROM (
SELECT *,
(row_number() OVER (ORDER BY "userId", "dateId"))%2 AS rn
FROM user_table
) sa
WHERE sa.rn=1
AND "userId" = 789
AND "Salary" > 0;
But every time the table gets new rows the result of the query is different.
Am I missing something?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
假设
("dateId", "userId")
是唯一的,并且新行始终具有更大的(稍后)dateId
。经过一些评论:
我认为你需要什么:
注意
分区
。这样,您就可以跳过每个userId
的每一个dateId
,并且其他(后面的)行到目前为止不会更改选择。此外,只要您选择单个
userId
行(WHERE "userId" = 789
),将谓词拉入子查询中,达到相同的效果(单个用户的稳定选择)。你不需要两者。子查询中的 WHERE 子句仅适用于单个用户,PARTITION BY 则适用于一个查询中任意数量的用户。
<子>是这样吗?是吗?
他们应该为此给我“侦探”徽章。
说真的。
Assuming that
("dateId", "userId")
is unique and new rows always have a bigger (later)dateId
.After some comments:
What I think you need:
Notice the
PARTITION BY
. This way you skip every seconddateId
for eachuserId
, and additional (later) rows don't change the selection so far.Also, as long as you are selecting rows for a single
userId
(WHERE "userId" = 789
), pull the predicate into the subquery, achieving the same effect (stable selection for a single user). You don't need both.The
WHERE
clause in the subquery only works for a single user,PARTITION BY
works for any number of users in one query.Is that it? Is it?
They should give me "detective" badge for this.
Seriously.
不,这似乎没问题。
您有新行,这些行会更改旧行以在排序后显示在不同位置。
No that seems to be OK.
You have new rows, those rows change the old rows to appear on different position after sorting.
如果有人插入一个 userId 低于 789 的新行,顺序将会改变。
例如,如果您有:
并且插入了 userId = 2 的行,则 rn 将更改:
为了选择每第 N 行,您需要一个带有序列或时间戳的列。
If someone insert a new row with a userId below 789 the order will change.
For example, if you have:
and you insert a row with userId = 2, the rn will change:
In order to select every Nth row you need a column with a sequence or a timestamp.