Is there an R function for conditional values across different columns?
Suppose you have a dataframe that looks something like this:
library(dplyr)  # re-exports tibble() and provides if_else()

df <- tibble(PatientID = c(1, 2, 3, 4, 5),
             Treat1 = c("R", "O", "C", "O", "C"),
             Treat2 = c("O", "R", "R", NA, "O"),
             Treat3 = c("C", NA, "O", NA, "R"),
             Treat4 = c("H", NA, "H", NA, "H"),
             Treat5 = c("H", NA, NA, NA, "H"))
Treat1:Treat5 are the different treatments that a patient has had. I'm looking to create a new variable "Chemo" with 1 for yes and 0 for no, based on whether a patient has had treatment "C".
I've been using if_else(), but I have 10 different treatment variables in my actual dataset and I would like to create such a column per treatment, so I wonder if I can do it without writing such long if statements. Is there an easier way to do this?
Comments (3)
Use if_any to loop over the columns that starts_with 'Treat' and create a logical vector with %in%. if_any returns TRUE/FALSE if any of the selected columns have 'C' for a particular row; the logical is converted to binary with + (or as.integer).
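A minimal sketch of the if_any approach described above, assuming dplyr 1.0.4 or later (where if_any is available). The result name df_chemo is only illustrative.

```r
library(dplyr)

df <- tibble(PatientID = c(1, 2, 3, 4, 5),
             Treat1 = c("R", "O", "C", "O", "C"),
             Treat2 = c("O", "R", "R", NA, "O"),
             Treat3 = c("C", NA, "O", NA, "R"),
             Treat4 = c("H", NA, "H", NA, "H"),
             Treat5 = c("H", NA, NA, NA, "H"))

# if_any() is TRUE for a row when at least one Treat column equals "C".
# %in% returns FALSE (not NA) for missing values, so NAs need no special care.
# The unary + turns the logical into 0/1 integers.
df_chemo <- df %>%
  mutate(Chemo = +(if_any(starts_with("Treat"), ~ .x %in% "C")))

df_chemo$Chemo
# 1 0 1 0 1
```

Using %in% rather than == keeps rows with NA treatments from producing NA in the new column.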
Or using base R with rowSums.
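A sketch of how the rowSums idea might look in base R. The loop at the end is an assumption about how to extend it to one column per treatment, as the question asks; the Had_ prefix and the list of codes are hypothetical.

```r
df <- data.frame(PatientID = c(1, 2, 3, 4, 5),
                 Treat1 = c("R", "O", "C", "O", "C"),
                 Treat2 = c("O", "R", "R", NA, "O"),
                 Treat3 = c("C", NA, "O", NA, "R"),
                 Treat4 = c("H", NA, "H", NA, "H"),
                 Treat5 = c("H", NA, NA, NA, "H"))

# Select the treatment columns by name so later additions do not shift the selection.
treat_cols <- grep("^Treat", names(df), value = TRUE)

# df[treat_cols] == "C" is a logical matrix; rowSums() counts the TRUEs per row,
# ignoring NA, and > 0 asks whether "C" appeared at least once.
df$Chemo <- +(rowSums(df[treat_cols] == "C", na.rm = TRUE) > 0)

# Hypothetical extension: one indicator column per treatment code.
for (code in c("R", "O", "C", "H")) {
  df[[paste0("Had_", code)]] <- +(rowSums(df[treat_cols] == code, na.rm = TRUE) > 0)
}
```

Because the column selection is by pattern, the same lines should work unchanged with 10 treatment columns.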
Another option uses str_detect and any to determine if C occurs in any of the Treat columns for each row. The + converts the logical to an integer.
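One way the str_detect and any combination could be written, as a sketch. It assumes a row-wise evaluation with rowwise() and c_across(); the result name df_chemo is illustrative.

```r
library(dplyr)
library(stringr)

df <- tibble(PatientID = c(1, 2, 3, 4, 5),
             Treat1 = c("R", "O", "C", "O", "C"),
             Treat2 = c("O", "R", "R", NA, "O"),
             Treat3 = c("C", NA, "O", NA, "R"),
             Treat4 = c("H", NA, "H", NA, "H"),
             Treat5 = c("H", NA, NA, NA, "H"))

# For each row, c_across() gathers the Treat values into one character vector.
# str_detect() returns NA for missing values, so any() needs na.rm = TRUE.
df_chemo <- df %>%
  rowwise() %>%
  mutate(Chemo = +any(str_detect(c_across(starts_with("Treat")), "C"), na.rm = TRUE)) %>%
  ungroup()
```

Note that str_detect matches the pattern anywhere in the string, so a treatment code that merely contains "C" would also count; with single-letter codes as in the example this makes no difference.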
An alternative dplyr way:
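A possible dplyr formulation, offered as a sketch and not necessarily the code this answer originally showed: test for an exact match with %in% across each row's Treat values. The name df_chemo is illustrative.

```r
library(dplyr)

df <- tibble(PatientID = c(1, 2, 3, 4, 5),
             Treat1 = c("R", "O", "C", "O", "C"),
             Treat2 = c("O", "R", "R", NA, "O"),
             Treat3 = c("C", NA, "O", NA, "R"),
             Treat4 = c("H", NA, "H", NA, "H"),
             Treat5 = c("H", NA, NA, NA, "H"))

# "C" %in% <row values> is a single TRUE/FALSE per row and is never NA.
df_chemo <- df %>%
  rowwise() %>%
  mutate(Chemo = as.integer("C" %in% c_across(starts_with("Treat")))) %>%
  ungroup()
```

On large data, rowwise() can be slower than the vectorised if_any or rowSums variants.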