SQL query to select only the first occurrence of rows that have the same data in the first column

Posted on 2024-09-07 01:55:27

Is there a neat SQL query that returns only the first occurrence of each row among rows that have the same data in the first column? That is, if I have rows like

blah something
blah somethingelse
foo blah
bar blah
foo hello

the query should give me the first, third and fourth rows (because the first row is the first occurrence of "blah" in the first column, the third row is the first occurrence of "foo" in the first column, and the fourth row is the first occurrence of "bar" in the first column).

I'm using the H2 database engine, if that matters.

Update: sorry about the unclear table definition; here is a clearer version. The "blah", "foo" etc. denote the value of the first column in each row.

blah [rest of columns of first row]
blah [rest of columns of second row]
foo  [-""- third row]
bar  [-""- fourth row]
foo  [-""- fifth row]

Comments (4)

懵少女 2024-09-14 01:55:27

If by "first" you mean alphabetically first in column 2, this SQL (SQL Server syntax) returns those rows:

-- temporary table holding the sample data
create table #tmp (
    c1 char(20),
    c2 char(20)
)
insert #tmp values ('blah','something')
insert #tmp values ('blah','somethingelse')
insert #tmp values ('foo','ahhhh')
insert #tmp values ('foo','blah')
insert #tmp values ('bar','blah')
insert #tmp values ('foo','hello')

-- min(c2) picks the alphabetically smallest c2 for each c1,
-- which is not necessarily the row that was inserted first
select c1, min(c2) c2 from #tmp
group by c1

月野兔 2024-09-14 01:55:27

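An analytic (window) function could do the trick.

-- row_number() is used instead of rank(): rank() takes no argument and
-- gives tied rows the same rank, so it could return several rows per c1.
-- The order by column decides which row counts as "first"; c2 is a placeholder.
Select *
from (
    Select row_number() over (partition by c1 order by c2) as myRank, t.*
    from myTable t ) ranked
where myRank = 1

But in H2 this is only a priority 2 item on the roadmap for version 1.3.x: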

http://www.h2database.com/html/roadmap.html?highlight=RANK&search=rank#firstFound
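
To make the window-function idea concrete, here is a minimal sketch that runs such a query through Python's built-in sqlite3 module. SQLite is used only as a stand-in engine (an assumption; whether your H2 version supports window functions has to be checked separately). The table name my_table and the id column, which records insertion order, are hypothetical and not from the question. ROW_NUMBER() is used rather than RANK() so that exactly one row per c1 value survives.

```python
import sqlite3

# Sample rows from the question: (first column, rest of the row).
SAMPLE = [
    ("blah", "something"),
    ("blah", "somethingelse"),
    ("foo", "blah"),
    ("bar", "blah"),
    ("foo", "hello"),
]


def first_rows_by_window(rows):
    """Return the first-inserted row for each distinct c1 value."""
    conn = sqlite3.connect(":memory:")
    # 'id' is a hypothetical column that records insertion order.
    conn.execute(
        "CREATE TABLE my_table (id INTEGER PRIMARY KEY, c1 TEXT, c2 TEXT)"
    )
    conn.executemany("INSERT INTO my_table (c1, c2) VALUES (?, ?)", rows)
    query = """
        SELECT c1, c2
        FROM (
            SELECT t.*,
                   ROW_NUMBER() OVER (PARTITION BY c1 ORDER BY id) AS rn
            FROM my_table t
        ) ranked
        WHERE rn = 1
        ORDER BY id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in first_rows_by_window(SAMPLE):
        print(row)
```

For the sample rows from the question this keeps the first, third and fourth rows. Without an explicit ordering column such as id, "first" has no defined meaning in SQL, so some column of that kind is needed whichever engine is used.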

红衣飘飘貌似仙 2024-09-14 01:55:27

I think this does what you want, but I'm not 100% sure. (This is also based on MS SQL Server.)

create table #t
(
PKCol int identity(1,1),
Col1 varchar(200)
)

Insert Into #t
Values ('blah something')
Insert Into #t
Values ('blah something else')
Insert Into #t
Values ('foo blah')
Insert Into #t
Values ('bar blah')
Insert Into #t
Values ('foo hello')


Select t.*
From #t t
Join (
     Select min(PKCol) as 'IDToSelect'
     From #t
     Group By Left(Col1, CharIndex(space(1), col1))
)q on t.PKCol = q.IDToSelect

drop table #t
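
The same min-ID join can also be sketched with the first column stored as its own column, which avoids parsing the string. This is only an illustration: it assumes a hypothetical id column that records insertion order, and it uses SQLite through Python's sqlite3 module purely as a stand-in for the real database. The table and column names are illustrative.

```python
import sqlite3

# Sample rows from the question: (first column, rest of the row).
SAMPLE_ROWS = [
    ("blah", "something"),
    ("blah", "somethingelse"),
    ("foo", "blah"),
    ("bar", "blah"),
    ("foo", "hello"),
]


def first_rows_by_min_id(rows):
    """Keep, for each c1 value, the row with the smallest id."""
    conn = sqlite3.connect(":memory:")
    # 'id' is a hypothetical auto-assigned key that reflects insertion order.
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, c1 TEXT, c2 TEXT)")
    conn.executemany("INSERT INTO t (c1, c2) VALUES (?, ?)", rows)
    query = """
        SELECT t.c1, t.c2
        FROM t
        JOIN (
            SELECT MIN(id) AS first_id
            FROM t
            GROUP BY c1
        ) q ON t.id = q.first_id
        ORDER BY t.id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in first_rows_by_min_id(SAMPLE_ROWS):
        print(row)
```

This form uses only GROUP BY and a join, so it does not depend on window-function support.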

半山落雨半山空 2024-09-14 01:55:27

If you are interested in the fastest possible query: It's relatively important to have an index on the first column of the table. That way the query processor can scan the values from that index. Then, the fastest solution is probably to use an 'outer' query to get the distinct c1 values, plus an 'inner' or nested query to get one of the possible values of the second column:

drop table if exists test;
create table test(c1 char(20), c2 char(20));
create index idx_c1 on test(c1);

-- insert some data (H2 specific)
insert into test select 'bl' || (x/1000), x from system_range(1, 100000); 

-- the fastest query (64 ms)
select c1, (select i.c2 from test i where i.c1=o.c1 limit 1) from test o group by c1;

-- the shortest query (385 ms)
select c1, min(c2) c2 from test group by c1;
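
One caveat with the nested query: LIMIT 1 without ORDER BY returns whichever matching row the engine finds first, which is not necessarily the row that was inserted first. Below is a hedged sketch of a variant that makes the choice explicit. It assumes a hypothetical id column that records insertion order, and it uses SQLite via Python's sqlite3 as a stand-in for H2. No timings are claimed for this variant.

```python
import sqlite3

# Sample rows from the question: (first column, rest of the row).
QUESTION_ROWS = [
    ("blah", "something"),
    ("blah", "somethingelse"),
    ("foo", "blah"),
    ("bar", "blah"),
    ("foo", "hello"),
]


def first_rows_by_subquery(rows):
    """For each c1, fetch c2 from the row with the lowest id."""
    conn = sqlite3.connect(":memory:")
    # 'id' is a hypothetical column that records insertion order.
    conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, c1 TEXT, c2 TEXT)")
    # Index on the first column, as the answer above recommends.
    conn.execute("CREATE INDEX idx_c1 ON test (c1)")
    conn.executemany("INSERT INTO test (c1, c2) VALUES (?, ?)", rows)
    query = """
        SELECT o.c1,
               (SELECT i.c2
                FROM test i
                WHERE i.c1 = o.c1
                ORDER BY i.id
                LIMIT 1) AS c2
        FROM test o
        GROUP BY o.c1
        ORDER BY MIN(o.id)
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in first_rows_by_subquery(QUESTION_ROWS):
        print(row)
```

The inner ORDER BY adds some work per group, so measure it against the plain LIMIT 1 version on your own data before choosing.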