How do I check whether a column has three consecutive values within a group?

Posted on 2025-01-12 17:42:35

I have a table games with values such as:

+----------+------+
|  game    | year |
+----------+------+
| Football | 1999 |
| Football | 2000 |
| Football | 2001 |
| Football | 2002 |
| Cricket  | 1996 |
| Tennis   | 2001 |
| Tennis   | 2002 |
| Tennis   | 2003 |
| Tennis   | 2009 |
| Golf     | 1994 |
| Golf     | 1996 |
| Golf     | 1997 |
+----------+------+

I am trying to find the games that have entries for at least three consecutive years in the table. My expected output is:

+----------+
|  game    |
+----------+
| Football |
| Tennis   |
+----------+

Because:

  • Football has four entries, all in consecutive years => 1999, 2000, 2001, 2002
  • Tennis has four entries, three of which are in consecutive years => 2001, 2002, 2003

To find the games with at least three consecutive entries, I first partitioned the table on game and then computed the difference between each row's year and the previous row's year (1 for the first row of each game), as below:

select game, year, case
    when (year - lag(year) over (partition by game order by year)) is null then 1
    else year - lag(year) over (partition by game order by year)
end as diff
from games

Output of the above query:

+----------+------+------+
|  game    | year | diff |
+----------+------+------+
| Football | 1999 | 1    |
| Football | 2000 | 1    |
| Football | 2001 | 1    |
| Football | 2002 | 1    |
| Cricket  | 1996 | 1    |
| Tennis   | 2001 | 1    |
| Tennis   | 2002 | 1    |
| Tennis   | 2003 | 1    |
| Tennis   | 2009 | 6    |
| Golf     | 1994 | 1    |
| Golf     | 1996 | 2    |
| Golf     | 1997 | 1    |
+----------+------+------+

From here I don't know how to filter each game's rows on diff to get the expected output.

Could anyone tell me whether I am on the right track? If not, how should I write the query to get the expected output?

Comments (4)

金橙橙 2025-01-19 17:42:36

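Using a CTE (common table expression) and the ROW_NUMBER window function, this can be solved. Within a run of consecutive years, year minus the row number stays constant, so grouping on that difference and counting rows finds runs of three or more (this assumes there are no duplicate game/year rows):

WITH CTE AS (
    SELECT game,
           year - ROW_NUMBER() OVER (PARTITION BY game ORDER BY year) AS grp
    FROM games
)
SELECT DISTINCT game
FROM CTE
GROUP BY game, grp
HAVING COUNT(*) >= 3;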
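
A row number by itself only counts the rows per game (Golf has three entries but no three-year run), so the sketch below contrasts that with grouping on year minus row number. It is a minimal sketch, assuming SQLite 3.25 or later through Python's sqlite3 module, since the question does not name a database engine; the helper name `run_query` and the constants are hypothetical.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("Football", 1999), ("Football", 2000), ("Football", 2001), ("Football", 2002),
    ("Cricket", 1996),
    ("Tennis", 2001), ("Tennis", 2002), ("Tennis", 2003), ("Tennis", 2009),
    ("Golf", 1994), ("Golf", 1996), ("Golf", 1997),
]

# Counting rows only: the row number reaches 3 for any game with three entries.
ROW_COUNT_ONLY = """
SELECT DISTINCT game
FROM (
    SELECT game,
           ROW_NUMBER() OVER (PARTITION BY game ORDER BY year) AS rn
    FROM games
) AS t
WHERE rn >= 3
ORDER BY game
"""

# Gaps and islands: year - row number is constant within a consecutive run.
ISLANDS = """
WITH numbered AS (
    SELECT game,
           year - ROW_NUMBER() OVER (PARTITION BY game ORDER BY year) AS grp
    FROM games
)
SELECT DISTINCT game
FROM numbered
GROUP BY game, grp
HAVING COUNT(*) >= 3
ORDER BY game
"""


def run_query(query, rows=ROWS):
    """Load rows into an in-memory games table and return the first column."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE games (game TEXT, year INTEGER)")
        conn.executemany("INSERT INTO games VALUES (?, ?)", rows)
        return [row[0] for row in conn.execute(query)]
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_query(ROW_COUNT_ONLY))  # ['Football', 'Golf', 'Tennis']
    print(run_query(ISLANDS))         # ['Football', 'Tennis']
```

Only the second query matches the expected output in the question.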

べ繥欢鉨o。 2025-01-19 17:42:35

You could use a self join approach here:

SELECT DISTINCT g1.Game
FROM games g1
INNER JOIN games g2
    ON g2.Game = g1.Game AND g2.Year = g1.Year + 1
INNER JOIN games g3
    ON g3.Game = g2.Game AND g3.Year = g2.Year + 1;

The query returns a game only if it has at least one record whose following year, and the year after that, also appear for the same game.
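
To check the self join against the sample data, here is a minimal sketch, assuming SQLite through Python's sqlite3 module (the question does not say which engine is in use). The function name `games_with_three_year_run` is hypothetical, and the `ORDER BY` is added only to make the output order predictable.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("Football", 1999), ("Football", 2000), ("Football", 2001), ("Football", 2002),
    ("Cricket", 1996),
    ("Tennis", 2001), ("Tennis", 2002), ("Tennis", 2003), ("Tennis", 2009),
    ("Golf", 1994), ("Golf", 1996), ("Golf", 1997),
]

# g1 is the first year of a run, g2 the year after it, g3 the year after that.
SELF_JOIN = """
SELECT DISTINCT g1.game
FROM games AS g1
INNER JOIN games AS g2
    ON g2.game = g1.game AND g2.year = g1.year + 1
INNER JOIN games AS g3
    ON g3.game = g2.game AND g3.year = g2.year + 1
ORDER BY g1.game
"""


def games_with_three_year_run(rows):
    """Return the games that have entries for three consecutive years."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE games (game TEXT, year INTEGER)")
        conn.executemany("INSERT INTO games VALUES (?, ?)", rows)
        return [row[0] for row in conn.execute(SELF_JOIN)]
    finally:
        conn.close()


if __name__ == "__main__":
    print(games_with_three_year_run(ROWS))  # ['Football', 'Tennis']
```

Because the joins match on exact year values, a duplicated row only produces extra join rows, which `DISTINCT` collapses.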

缘字诀 2025-01-19 17:42:35

You can use lag() and lead() and compare them to the current Year:

with u as (
    select *,
           case
               when lag(Year) over (partition by Game order by Year) = Year - 1
                and lead(Year) over (partition by Game order by Year) = Year + 1
               then 1 else 0
           end as consec
    from games
)
select distinct Game
from u
where consec = 1;
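
The flag is set on the middle year of every three-year run, which is easier to see by listing the flagged rows. The following is a minimal sketch, assuming SQLite 3.25 or later through Python's sqlite3 module (an assumption, since the engine is not stated); the function name `middle_years` is hypothetical.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("Football", 1999), ("Football", 2000), ("Football", 2001), ("Football", 2002),
    ("Cricket", 1996),
    ("Tennis", 2001), ("Tennis", 2002), ("Tennis", 2003), ("Tennis", 2009),
    ("Golf", 1994), ("Golf", 1996), ("Golf", 1997),
]

# A row is flagged when the previous row is year - 1 and the next is year + 1.
# At partition edges lag/lead return NULL, so the CASE falls through to 0.
FLAGGED = """
WITH u AS (
    SELECT game, year,
           CASE
               WHEN LAG(year) OVER (PARTITION BY game ORDER BY year) = year - 1
                AND LEAD(year) OVER (PARTITION BY game ORDER BY year) = year + 1
               THEN 1 ELSE 0
           END AS consec
    FROM games
)
SELECT game, year
FROM u
WHERE consec = 1
ORDER BY game, year
"""


def middle_years(rows):
    """Return (game, year) pairs that sit in the middle of a three-year run."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE games (game TEXT, year INTEGER)")
        conn.executemany("INSERT INTO games VALUES (?, ?)", rows)
        return list(conn.execute(FLAGGED))
    finally:
        conn.close()


if __name__ == "__main__":
    flagged = middle_years(ROWS)
    print(flagged)  # [('Football', 2000), ('Football', 2001), ('Tennis', 2002)]
    print(sorted({game for game, _ in flagged}))  # ['Football', 'Tennis']
```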

纵山崖 2025-01-19 17:42:35

Yes, your initial approach is correct. You were actually really close to fully figuring it out yourself.

What I would do is alter LAG a bit:

year - LAG(year, 2) OVER (
    PARTITION BY game
    ORDER BY year
)

For each row, this computes the difference between the current row's year and the year two rows earlier in the same game.

For the third row of a consecutive run the result is 2, which you can filter on in an outer query (a window function cannot appear directly in a WHERE clause).

If your data contains duplicate (game, year) rows, deduplicate first, for example by grouping by game, year.
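
Put together, the offset-2 comparison and the deduplication step might look like the following. This is a minimal sketch, assuming SQLite 3.25 or later through Python's sqlite3 module (the engine is not stated in the question); the CTE names `deduped` and `spans` and the function name `games_with_span_two` are hypothetical.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("Football", 1999), ("Football", 2000), ("Football", 2001), ("Football", 2002),
    ("Cricket", 1996),
    ("Tennis", 2001), ("Tennis", 2002), ("Tennis", 2003), ("Tennis", 2009),
    ("Golf", 1994), ("Golf", 1996), ("Golf", 1997),
]

# Deduplicate first, then compare each year with the year two rows earlier.
# With distinct ascending years, a difference of 2 means three consecutive years.
SPAN_QUERY = """
WITH deduped AS (
    SELECT DISTINCT game, year FROM games
),
spans AS (
    SELECT game,
           year - LAG(year, 2) OVER (PARTITION BY game ORDER BY year) AS span
    FROM deduped
)
SELECT DISTINCT game
FROM spans
WHERE span = 2
ORDER BY game
"""


def games_with_span_two(rows):
    """Return the games where some year is exactly 2 more than the year two rows back."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE games (game TEXT, year INTEGER)")
        conn.executemany("INSERT INTO games VALUES (?, ?)", rows)
        return [row[0] for row in conn.execute(SPAN_QUERY)]
    finally:
        conn.close()


if __name__ == "__main__":
    print(games_with_span_two(ROWS))                     # ['Football', 'Tennis']
    # A duplicated Golf 1996 row does not create a false match after deduplication.
    print(games_with_span_two(ROWS + [("Golf", 1996)]))  # ['Football', 'Tennis']
```

Without the `deduped` step, the duplicated Golf row would give 1996 - 1994 = 2 and wrongly report Golf.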
