选择 ColA=ColB 的行,并添加一个新列,告知 ColA 中的原始数据是否重复
我试图弄清楚如何查询一个表(该表实际上是一个结果集,因此它将是一个子查询),按 ColA=ColB
对其进行分组(见下文),并创建一个计算的一步完成所有字段。
因此,如果我的测试数据看起来像
ColA ColB ColC
1 1 aaa 1 2 bbbb 1 3 cccc 2 2 dddd 3 3 eeee 3 4 ffff 3 5 gggg 3 6 hhhh 4 4 iiii 5 5 jjjj 6 6 kkkk 6 7 llll 6 8 mmmm
我只想检索 ColA=ColB
的行,并添加一个新列来告诉我 ColA
中的原始数据是否重复。见下文。
ColA ColB ColC multiples
1 1 aaaa yes 2 2 dddd no 3 3 eeee yes 4 4 iiii no 5 5 jjjj no 6 6 kkkk yes
有人可以帮我解决语法问题吗?我一直在使用 Group By 和 SubSelects 但无济于事。我是否需要对倍数字段使用 case 语句?
I am trying to figure out how to query a table (the table is actually a result set, so it will be a subquery), group it by ColA=ColB
(see below), and create a calculated field all in one step.
So, if my test data looks like
ColA ColB ColC
1 1 aaa 1 2 bbbb 1 3 cccc 2 2 dddd 3 3 eeee 3 4 ffff 3 5 gggg 3 6 hhhh 4 4 iiii 5 5 jjjj 6 6 kkkk 6 7 llll 6 8 mmmm
I would like to retrieve only rows where ColA=ColB
and also add a new column that tells me whether the original data in ColA
was repeated. See below.
ColA ColB ColC multiples
1 1 aaaa yes 2 2 dddd no 3 3 eeee yes 4 4 iiii no 5 5 jjjj no 6 6 kkkk yes
Can someone help me out with the syntax? I have been playing with Group By's and SubSelects to no avail. Do I need to use a case statement to for the multiples field?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
发布 create table 和 insert 语句而不是 Desc table 和 select * from table_name 更有帮助; http://tkyte.blogspot.com/2005/06/ how-to-ask-questions.html
您可以使用Oracle分析函数导致查看结果集以查看下一行的 colA 是否相同(在排序后..),例如..
从中选择*
(选择 colA、colb、
(当 colA = (lead(cola) 超过
(按 colA 分区,按 cola、colb 排序))
然后“是”
否则'不'
结束)倍数,
科尔克
来自测试重复)
其中 colA = colb
/
<前><代码> COLA COLB MUL COLC
<小时>
<前><代码> 1 1 是 aaa
2 2 无 dddd
3 3 是的 eeee
4 4 否 iii
5 5 没有 jjjj
6 6 是 kkkk
或者您可以获取 COLA 每个值的计数并比较它以查看是否有重复...
选择a.colA、a.colb、a.colc、
(当 (select count(*) from test_repeat t where t.cola = a.colA) > 1 时的情况
然后“是”
否则'不'
结束)重复
从 test_repeat 开始
其中 colA = colB
/
<前><代码> COLA COLB COLC REP
<小时>
<前><代码> 1 1 aaa 是
2 2 dddd 否
3 3 eeee 是
4 4 iiii 否
5 5 jjjj 否
6 6 kkkk 有
它们都同样简单,但是我建议使用分析函数方法,因为我发现它对于我过去使用过的所有查询通常都更快。
It's more helpful to post create table and insert statements instead of Desc table and select * from table_name; http://tkyte.blogspot.com/2005/06/how-to-ask-questions.html
You can use the Oracle analytic function Lead to look through your result set to see if colA is the same for the next row (after ordering it..) like..
select * from
(select colA, colb,
(case when colA = (lead(cola) over
(partition by colA order by cola, colb))
then 'Yes'
else 'No'
end) multiples,
colc
from test_repeat)
where colA = colb
/
Or you can get the count for each value of COLA and compare it to see if there are duplicates...
select a.colA, a.colb, a.colc,
(case when (select count(*) from test_repeat t where t.cola = a.colA) > 1
then 'Yes'
else 'No'
end) Repeat
from test_repeat a
where colA = colB
/
They are both equally simple, but I would suggest the analytic function approach as I have found it to be generally faster for all Queries I have worked with in the past.
在性能方面,此 SQL 大约等于 Rajesh 的第一个查询。所以你可以选择你更舒服的那一个。
Performancewise, this SQL is approximately equal to Rajesh' first query. So you can choose the one you are more comfortable with.
我认为您需要对
multiples
列进行子选择。不需要分组。写一些类似:I think you need a subselect for the
multiples
column. No grouping is needed. Write something like:我不知道引导功能,非常好。 Rob 和 Rajesh,感谢到目前为止的回答,我想我能够使用两个子查询和一个外连接部分地回答我自己的问题。
我猜我的查询不如其他人发布的查询有效,但我想知道人们是否可以对此查询发表评论。它总是有效吗?可以对其进行调整以提高效率吗? 对于倍数,它仍然需要输入“是”或“否”,现在它会输入倍数计数,或者对于 0,输入 null。
I was not aware of the lead function, very nice. Rob and Rajesh, thanks for the answers so far, I think I was able to partly answer my own question using two subqueries and an outer join.
I am guessing my query is not as efficient as the others posted, but I wonder if folks could comment on this query. Will it always work? Can it be tweaked to make it more efficient? It still needs to put in 'yes' or 'no' for multiples, right now it puts the multiple count or a null for 0.