Using SQL DENSE_RANK to identify duplicates
Here is a sample of the data. I am trying to find all the orders with the same quantity, ignoring the OrderID column:
Product Location Customer OrderID Quantity
Eggs Chicago XYZ 2011 10
Eggs Chicago XYZ 2012 10
Eggs Chicago XYZ 2013 15
So I used the DENSE_RANK function in SQL:
SELECT Product, Location, Customer, OrderID, Quantity,
       Ranking = DENSE_RANK() OVER (PARTITION BY Product, Location, Customer, Quantity
                                    ORDER BY OrderID ASC)
FROM MyTable
to get the data below:
Product Location Customer OrderID Quantity Ranking
Eggs Chicago XYZ 2011 10 1
Eggs Chicago XYZ 2012 10 2
Eggs Chicago XYZ 2013 15 1
Based on the ranking, I was able to filter out the records that have the same quantity across different OrderIDs and treat them as one.
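A minimal sketch of that filtering step, for illustration only. It assumes an in-memory SQLite database driven from Python so the query can be run anywhere; the helper name `first_order_per_quantity` and the `AS Ranking` alias (in place of the T-SQL `Ranking = ...` form) are assumptions, not part of the original query.

```python
import sqlite3

# Same ranking as in the question, wrapped in a subquery so it can be filtered.
# "AS Ranking" replaces the T-SQL "Ranking = ..." alias form so SQLite accepts it.
DEDUP_SQL = """
SELECT Product, Location, Customer, OrderID, Quantity
FROM (
    SELECT Product, Location, Customer, OrderID, Quantity,
           DENSE_RANK() OVER (PARTITION BY Product, Location, Customer, Quantity
                              ORDER BY OrderID ASC) AS Ranking
    FROM MyTable
) AS ranked
WHERE Ranking = 1
ORDER BY Product, Location, Customer, OrderID
"""


def first_order_per_quantity(rows):
    """Load rows into an in-memory MyTable and keep only the Ranking = 1 rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyTable (Product TEXT, Location TEXT, Customer TEXT, "
        "OrderID INTEGER, Quantity INTEGER)"
    )
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(DEDUP_SQL).fetchall()
    conn.close()
    return result


orders = [
    ("Eggs", "Chicago", "XYZ", 2011, 10),
    ("Eggs", "Chicago", "XYZ", 2012, 10),
    ("Eggs", "Chicago", "XYZ", 2013, 15),
]
for row in first_order_per_quantity(orders):
    print(row)
```

With the three sample rows, this keeps order 2011 (quantity 10) and order 2013 (quantity 15), so each distinct quantity is represented by its earliest order.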
So far so good. However, another requirement is that this form of aggregation should be applied only until the first change in quantity. For example, suppose the data looks like this:
Product Location Customer OrderID Quantity
Eggs Chicago XYZ 2011 10
Eggs Chicago XYZ 2012 10
Eggs Chicago XYZ 2013 15
Eggs Chicago XYZ 2014 15
Eggs Chicago XYZ 2015 15
The same SQL would produce this result:
Product Location Customer OrderID Quantity Ranking
Eggs Chicago XYZ 2011 10 1
Eggs Chicago XYZ 2012 10 2
Eggs Chicago XYZ 2013 15 1
Eggs Chicago XYZ 2014 15 2
Eggs Chicago XYZ 2015 15 3
But I need the result to be:
Product Location Customer OrderID Quantity Ranking
Eggs Chicago XYZ 2011 10 1
Eggs Chicago XYZ 2012 10 2
Eggs Chicago XYZ 2013 15 1
Eggs Chicago XYZ 2014 15 1
Eggs Chicago XYZ 2015 15 1
Please note that the ranking remains 1 for all the records after the first change in quantity.
Is it possible to tweak my SQL to get the above behavior?
Thanks for any suggestions.
Comments (2)
If I understand you correctly, you want to use DENSE_RANK() to eliminate duplicate rows in your data. It seems you've already solved your problem. If you want to eliminate the duplicates, use the same SQL code you have above and delete any rows with Ranking > 1. This will leave you with one copy of each row with the same key (e.g. Product, Location, Customer, Quantity).
This feels a bit dirty but I think it's correct:
See fiddle
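The fiddle's query is not reproduced here, so the following is only a sketch of one possible way to get the requested ranking and may differ from what the fiddle does. It assumes that "after the first change" means every row from the first order whose quantity differs from the earliest order's quantity, within each Product, Location, Customer group. The column names `FirstQty` and `HasChanged` and the helper `rank_until_first_change` are made up for illustration. The SQL runs here on an in-memory SQLite database from Python; on SQL Server the same window functions should be available from 2012 onward, but that has not been tested.

```python
import sqlite3

RANK_SQL = """
WITH flagged AS (
    -- Quantity of the earliest order in each Product/Location/Customer group.
    SELECT *,
           FIRST_VALUE(Quantity) OVER (PARTITION BY Product, Location, Customer
                                       ORDER BY OrderID) AS FirstQty
    FROM MyTable
),
changed AS (
    -- Running flag: becomes 1 at the first row whose quantity differs from
    -- FirstQty and stays 1 for every later row in the group.
    SELECT *,
           MAX(CASE WHEN Quantity <> FirstQty THEN 1 ELSE 0 END)
               OVER (PARTITION BY Product, Location, Customer
                     ORDER BY OrderID
                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS HasChanged
    FROM flagged
)
SELECT Product, Location, Customer, OrderID, Quantity,
       CASE WHEN HasChanged = 1 THEN 1
            ELSE DENSE_RANK() OVER (PARTITION BY Product, Location, Customer, Quantity
                                    ORDER BY OrderID ASC)
       END AS Ranking
FROM changed
ORDER BY Product, Location, Customer, OrderID
"""


def rank_until_first_change(rows):
    """Load rows into an in-memory MyTable and return them with the Ranking column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyTable (Product TEXT, Location TEXT, Customer TEXT, "
        "OrderID INTEGER, Quantity INTEGER)"
    )
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(RANK_SQL).fetchall()
    conn.close()
    return result


sample_orders = [
    ("Eggs", "Chicago", "XYZ", 2011, 10),
    ("Eggs", "Chicago", "XYZ", 2012, 10),
    ("Eggs", "Chicago", "XYZ", 2013, 15),
    ("Eggs", "Chicago", "XYZ", 2014, 15),
    ("Eggs", "Chicago", "XYZ", 2015, 15),
]
for row in rank_until_first_change(sample_orders):
    print(row)
```

On the five sample rows the Ranking column comes out as 1, 2, 1, 1, 1, which matches the result asked for in the question. If the quantity later returns to its original value, this sketch still assigns 1 to those rows; whether that is the intended behavior is not stated in the question.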