Calculate top products by sales for each year
I have sales data by year and by product, for example:
Year <- c(2010,2010,2010,2010,2010,2011,2011,2011,2011,2011,2012,2012,2012,2012,2012)
Model <- c("a","b","c","d","e","a","b","c","d","e","a","b","c","d","e")
Sale <- c("30","45","23","33","24","11","56","19","45","56","33","32","89","33","12")
df <- data.frame(Year, Model, Sale)
I want code that identifies the top 2 products for each year and summarises all the remaining products under the category "other".
Similar to akrun's solution, with a slightly different strategy:
We could `arrange` by 'Year' and 'Sale' in `desc`ending order and then change the values of 'Model' based on the `row_number` after grouping by 'Year'.

Or another option is to use `slice_max` (`with_ties = TRUE` by default).
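The code for this answer did not survive on this page, so the following is only a minimal sketch of the two ideas it describes, not the answerer's original code. It assumes `dplyr` is installed and that `Sale` is converted to numeric first (the question stores it as strings); the result names `top2_other` and `top_with_ties` are made up for illustration.

```r
library(dplyr)

# Data from the question; Sale is stored as strings there, so convert it.
Year  <- c(2010,2010,2010,2010,2010,2011,2011,2011,2011,2011,2012,2012,2012,2012,2012)
Model <- c("a","b","c","d","e","a","b","c","d","e","a","b","c","d","e")
Sale  <- c("30","45","23","33","24","11","56","19","45","56","33","32","89","33","12")
df <- data.frame(Year, Model, Sale, stringsAsFactors = FALSE)
df$Sale <- as.numeric(df$Sale)

# Idea 1: sort by Year and descending Sale, relabel every row after the
# second one within each Year as "other", then sum Sale per Year and Model.
top2_other <- df %>%
  arrange(Year, desc(Sale)) %>%
  group_by(Year) %>%
  mutate(Model = if_else(row_number() <= 2, Model, "other")) %>%
  group_by(Year, Model) %>%
  summarise(Sale = sum(Sale), .groups = "drop")

# Idea 2: slice_max() keeps the top n rows per group. With the default
# with_ties = TRUE, rows tied with the n-th value are kept as well, so a
# Year can return more than 2 rows.
top_with_ties <- df %>%
  group_by(Year) %>%
  slice_max(Sale, n = 2) %>%
  ungroup()

print(top2_other)
print(top_with_ties)
```

The two ideas treat ties differently: `row_number()` always keeps exactly two models per year, while `slice_max()` keeps every model tied for a top-2 value, which matters for 2012, where models a and d both sold 33.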
Another possible solution (although I am not sure whether the OP wants output like the one in my answer or like the one in the other answers):
You can use `fct_lump_n()` from `forcats` to collapse levels, but first you need to `uncount` your data.

Output:

(Not sure why 2012 keeps 4 groups rather than 3.)
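The answer's code and output are missing from this page, so here is a hedged sketch of the approach it describes, assuming `dplyr`, `tidyr` and `forcats` are installed. The helper `lump_top2()` is a hypothetical name, and `Sale` is written as integers here rather than the strings used in the question. The extra group in 2012 is most likely a tie effect: models a and d both sold 33, and `fct_lump_n()` defaults to `ties.method = "min"`, which keeps every level tied for the n-th rank.

```r
library(dplyr)
library(tidyr)
library(forcats)

# Same values as in the question, with Sale as integers so that
# uncount() can use it as a row weight.
df <- data.frame(
  Year  = rep(2010:2012, each = 5),
  Model = rep(c("a", "b", "c", "d", "e"), times = 3),
  Sale  = c(30L, 45L, 23L, 33L, 24L, 11L, 56L, 19L, 45L, 56L, 33L, 32L, 89L, 33L, 12L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: expand each row into Sale copies, lump all but the
# 2 most frequent models per Year into "other", then count rows again to
# recover the sales totals.
lump_top2 <- function(data, ties.method = "min") {
  data %>%
    uncount(Sale) %>%
    group_by(Year) %>%
    mutate(Model = as.character(
      fct_lump_n(Model, n = 2, other_level = "other", ties.method = ties.method)
    )) %>%
    ungroup() %>%
    count(Year, Model, name = "Sale")
}

res_default <- lump_top2(df)                          # ties kept
res_first   <- lump_top2(df, ties.method = "first")   # ties broken by level order

print(res_default)
print(res_first)
```

With `ties.method = "first"`, ties are broken by level order, so each year ends up with exactly two named models plus "other". Which behaviour is right depends on whether tied models should both count as top products.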