Subsetting a data frame

Posted 2024-10-23 22:21:18 · 628 characters · 5 views · 0 comments

I have a simple question, I think. In my data frame I would like to make a subset where the column Quality_score is equal to one of: Perfect, Perfect***, Perfect****, Good, Good*** or Good****.

This is my solution so far:

Quality_scoreComplete <- subset(completefile,
    Quality_score == "Perfect" | Quality_score == "Perfect***" | Quality_score == "Perfect****" |
    Quality_score == "Good" | Quality_score == "Good***" | Quality_score == "Good****")

Is there a way to simplify this method? Like:

methods <- c('Perfect', 'Perfect***', 'Perfect****', 'Good', 'Good***', 'Good****')
Quality_scoreComplete <- subset(completefile, Quality_score == methods)

Thank you all,

Lisanne

Comments (2)

萌辣 2024-10-30 22:21:18

You do not even need subset for this; see ?"[" for plain row indexing:

Quality_scoreComplete <- completefile[completefile$Quality_score %in% methods,]
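A minimal sketch of the same idea on made-up data, since the real completefile is not shown in the question. The Quality_score values and the id column below are assumptions for illustration only. It also shows that the same filter can be written with subset() if preferred:

```r
# Hypothetical stand-in for the asker's data frame (not the real completefile)
completefile <- data.frame(
  Quality_score = c("Perfect", "Bad", "Good***", "Perfect****", "Poor", "Good"),
  id = 1:6,
  stringsAsFactors = FALSE
)

# Values to keep
methods <- c("Perfect", "Perfect***", "Perfect****", "Good", "Good***", "Good****")

# Row indexing with `[`: %in% tests each Quality_score for membership in methods
by_bracket <- completefile[completefile$Quality_score %in% methods, ]

# The same filter written with subset()
by_subset <- subset(completefile, Quality_score %in% methods)

print(by_bracket)
```

Both forms keep the same rows; which one to use is mostly a matter of taste.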

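EDITED, based on a kind comment from @Sacha Epskamp: using == in the expression gives wrong results, because == compares the two vectors element by element and recycles the shorter one instead of testing membership. I corrected it above to %in%. Thanks!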

Example of the problem:

> x <- c(17, 19)
> cars[cars$speed==x,]
   speed dist
29    17   32
31    17   50
36    19   36
38    19   68
> cars[cars$speed %in% x,]
   speed dist
29    17   32
30    17   40
31    17   50
36    19   36
37    19   46
38    19   68
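
Why == drops rows 30 and 37 above can be seen by looking at the logical vectors directly. The sketch below uses a small made-up vector (speeds is an assumption for illustration, not taken from cars):

```r
speeds <- c(17, 17, 17, 19, 19, 19)
x <- c(17, 19)

# == recycles x to 17, 19, 17, 19, 17, 19 and compares position by position
eq_result <- speeds == x

# %in% asks, for each element of speeds, whether it appears anywhere in x
in_result <- speeds %in% x

print(eq_result)
print(in_result)
```

With ==, every second 17 is compared against 19 (and vice versa), so those positions come back FALSE even though the value is one of the wanted ones.
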
╰ゝ天使的微笑 2024-10-30 22:21:18

One thing that works is grepl, which searches for a pattern in each string and returns a logical vector indicating whether it was found. You can use the | operator inside the pattern to mean OR, and ignore.case = TRUE to make the match case-insensitive:

methods <- c('Perfect', 'Perfect*', 'Perfect*', 'Good', 'Good', 'Good*')

completefile <- data.frame(Quality_score = c(methods, "bad", "terrible", "abysmal"), foo = 1)

subset(completefile, grepl("good|perfect", Quality_score, ignore.case = TRUE))
  Quality_score foo
1       Perfect   1
2      Perfect*   1
3      Perfect*   1
4          Good   1
5          Good   1
6         Good*   1

EDIT: I see now that case sensitivity was not an issue (thanks, dyslexia!). You could then simplify to:

subset(completefile,grepl("Good|Perfect",Quality_score))
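
One thing to keep in mind with this approach: an unanchored pattern matches any value that merely contains "Good" or "Perfect". If only the exact labels (with optional trailing asterisks) should be kept, one option is to anchor the pattern, as in this sketch. The scores vector and the assumption about which labels are wanted are made up for illustration:

```r
scores <- c("Perfect", "Perfect***", "Good****", "Not Good", "Goodish", "bad")

# Unanchored: TRUE for any string that contains "Good" or "Perfect"
loose <- grepl("Good|Perfect", scores)

# Anchored: the whole string must be Good or Perfect,
# optionally followed by literal asterisks (\\* is an escaped *)
strict <- grepl("^(Good|Perfect)\\**$", scores)

print(data.frame(scores, loose, strict))
```

If the set of wanted labels is fixed and known, %in% with an explicit vector is usually the simpler choice; the regular expression is more useful when the number of asterisks varies.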