Subset a data frame to the top n rows of each group, ordered by a variable

Posted on 2024-11-08 21:44:13

I would like to subset a data frame to the top n rows of each group, where the groups are defined by one variable and the rows are sorted in descending order by another variable. An example makes this clear:

    d1 <- data.frame(Gender = c("M", "M", "F", "F", "M", "M", "F", "F"),
                     Age = c(15, 38, 17, 35, 26, 24, 20, 26))

For each Gender, I would like to get the 2 rows with the highest Age, sorted in descending order of Age. The desired output is:

    Gender  Age
    F       35
    F       26
    M       38
    M       26

I looked at order, sort and other solutions here, but could not find an appropriate solution to this problem. I appreciate your help.

初见你 2024-11-15 21:44:13

One solution using ddply() from plyr:

    library(plyr)
    # Within each Gender, sort by Age (descending) and keep the first 2 rows
    ddply(d1, "Gender", function(x) head(x[order(x$Age, decreasing = TRUE), ], 2))
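For readers without plyr, the same split-apply-combine idea can be sketched in base R. This is only a rough equivalent, not what ddply() does internally; `top_n_by_group` is a hypothetical helper name introduced here for illustration.

```r
# Sample data from the question
d1 <- data.frame(
  Gender = c("M", "M", "F", "F", "M", "M", "F", "F"),
  Age    = c(15, 38, 17, 35, 26, 24, 20, 26)
)

# Hypothetical helper (not part of plyr or base R):
# keep the n rows with the largest `value` within each level of `group`.
top_n_by_group <- function(df, group, value, n = 2) {
  pieces <- split(df, df[[group]])      # one data frame per group
  tops <- lapply(pieces, function(x) {
    # sort each piece by `value`, descending, and keep the first n rows
    head(x[order(x[[value]], decreasing = TRUE), , drop = FALSE], n)
  })
  out <- do.call(rbind, tops)           # stack the pieces back together
  rownames(out) <- NULL                 # drop the row names added by rbind
  out
}

res <- top_n_by_group(d1, "Gender", "Age", n = 2)
print(res)
```

On the sample data this gives the two oldest F rows followed by the two oldest M rows, which matches the desired output in the question.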

﹉夏雨初晴づ 2024-11-15 21:44:13

With the data.table package:

    library(data.table)
    dt1 <- data.table(d1)  # to speed this up you can add setkey(dt1, Gender)
    dt1[, .SD[order(Age, decreasing = TRUE)[1:2]], by = Gender]

三月梨花 2024-11-15 21:44:13

I'm sure there is a better answer, but here is one way:

    require(plyr)
    ddply(d1, c("Gender", "-Age"))[c(1:2, 5:6), -1]

If your data frame is larger than the one provided here and you don't want to inspect visually which rows to select, use this instead:

    new.d1 <- ddply(d1, c("Gender", "-Age"))[, -1]
    pos <- match('M', new.d1$Gender)  # pos is the index of the first M entry
    new.d1[c(1:2, pos:(pos + 1)), ]

苍白女子 2024-11-15 21:44:13

It is even easier than that if you just want to do the sorting:

    d1 <- transform(d1[order(d1$Age, decreasing = TRUE), ], Gender = as.factor(Gender))

You can then call:

    require(plyr)
    d1 <- ddply(d1, .(Gender), head, n = 2)

to keep the top two rows of each Gender subgroup.

抚笙 2024-11-15 21:44:13

I have a suggestion if you need a different number of rows per group, for example the first 2 females and the first 3 males:

    library(plyr)
    m <- d1[order(d1$Age, decreasing = TRUE), ]
    # split() orders the groups alphabetically, so F gets 2 and M gets 3
    h <- mapply(function(x, y) head(x, y), split(m$Age, m$Gender), y = c(2, 3))
    ldply(h, data.frame)

You then just need to rename the columns of the final data frame.
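A variation on the same idea, sketched in base R, keeps whole rows so that no renaming is needed afterwards. The named vector `n_per_group` is an assumption introduced here for illustration; looking counts up by group name avoids relying on the order in which split() returns the groups.

```r
# Sample data from the question
d1 <- data.frame(
  Gender = c("M", "M", "F", "F", "M", "M", "F", "F"),
  Age    = c(15, 38, 17, 35, 26, 24, 20, 26)
)

# Hypothetical lookup table: how many rows to keep for each group
n_per_group <- c(F = 2, M = 3)

# Sort once by Age (descending), then split whole rows by Gender
sorted <- d1[order(d1$Age, decreasing = TRUE), ]
pieces <- split(sorted, sorted$Gender)

# For each group, keep as many rows as n_per_group says for that group name
tops <- Map(function(x, g) head(x, n_per_group[[g]]), pieces, names(pieces))

res <- do.call(rbind, tops)   # stack the per-group pieces
rownames(res) <- NULL         # drop the row names added by rbind
print(res)
```

The result still has the original Gender and Age columns, with two F rows followed by three M rows.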

╰ゝ天使的微笑 2024-11-15 21:44:13

I had a similar problem and found this method really fast when used on a data.frame with 1.5 million records:

    d1 <- d1[order(d1$Gender, -d1$Age), ]                     # sort by Gender, then Age descending
    d1 <- d1[ave(d1$Age, d1$Gender, FUN = seq_along) <= 2, ]  # keep the first 2 rows per Gender
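To see why the ave() step works, it can help to look at the intermediate vector it produces. The sketch below uses the sample data from the question; the names `sorted` and `rank_in_group` are introduced here only for illustration.

```r
# Sample data from the question
d1 <- data.frame(
  Gender = c("M", "M", "F", "F", "M", "M", "F", "F"),
  Age    = c(15, 38, 17, 35, 26, 24, 20, 26)
)

# Step 1: sort by Gender, then by Age descending within each Gender
sorted <- d1[order(d1$Gender, -d1$Age), ]

# Step 2: ave() applies seq_along() separately within each Gender, so every
# row gets its position (1, 2, 3, ...) inside its own group
rank_in_group <- ave(sorted$Age, sorted$Gender, FUN = seq_along)
print(cbind(sorted, rank_in_group))

# Step 3: keep the rows whose within-group position is at most 2
res <- sorted[rank_in_group <= 2, ]
print(res)
```

Because the data is already sorted by Age within each group, position 1 is the oldest row of that group, position 2 the second oldest, and so on. Changing the `<= 2` threshold changes n.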

About the author

把时间冻结