case_when in R: multi-condition logic
I'm trying to use case_when from dplyr to create a categorical vector in a data frame, but I want to see if it can be done in a cleaner way.
Example data frame:
test_df <- structure(list(
  Primary.column = c(1L, 0L, 1L, 0L, 1L, 1L),
  Other_column1 = c(1L, 1L, 0L, 0L, 0L, 0L),
  Other_column2 = c(0L, 0L, 1L, 1L, 0L, 0L),
  Other_column3 = c(0L, 0L, 0L, 0L, 1L, 1L)
), class = "data.frame", row.names = c(NA, -6L))
I want the category to be A when Primary.column is 0. If Primary.column is 1, I want to look at the columns whose names contain "Other", and whichever of them holds a 1 determines the category.
Something like: Primary.column == 1 & (whichever column with "Other" in its name has a 1 decides the category).
For this example, the logic is:
If Primary Column is 1 and Other_column1 is 1 then the new column should have value B
If Primary Column is 1 and Other_column2 is 1 then the new column should have value C
If Primary Column is 1 and Other_column3 is 1 then the new column should have value D
In the test data frame this is easy to solve because there are very few columns; it could be done like so:
library(dplyr)
test_df <- test_df %>%
  mutate(new_column = case_when(
    Primary.column == 0 ~ "A",
    Primary.column == 1 & Other_column1 == 1 ~ "B",
    Primary.column == 1 & Other_column2 == 1 ~ "C",
    Primary.column == 1 & Other_column3 == 1 ~ "D"
  ))
But the real data frame has hundreds of "Other" columns, so this is not a clean solution: I'd need hundreds of lines of code for this single variable. Not what I want.
In this example I also have a key that tells me which value each "Other" column maps to when the primary column is 1.
Key
structure(list(Column = c("Other_column1 ", "Other_column2",
"Other_column3"), Value = c("B", "C", "D")), class = "data.frame", row.names = c(NA,
-3L))
Is there a way to use the key so that I don't have to write out 100 lines of messy code? Or are there alternative solutions that keep things clean?
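One possible direction, offered only as a sketch: treat the key as a lookup table, find for each row the first "Other" column that holds a 1, and index into the key's values with that position. The code below rebuilds the example data so it runs on its own. It makes two assumptions that are not confirmed by the question: the trailing space in "Other_column1 " in the key is accidental (it is trimmed with trimws), and when several "Other" columns are 1 in the same row, the first one in key order should win, mirroring how case_when evaluates its conditions in order. The names other_mat and first_hit are illustrative only.

```r
library(dplyr)

# Example data, as given in the question.
test_df <- structure(list(
  Primary.column = c(1L, 0L, 1L, 0L, 1L, 1L),
  Other_column1 = c(1L, 1L, 0L, 0L, 0L, 0L),
  Other_column2 = c(0L, 0L, 1L, 1L, 0L, 0L),
  Other_column3 = c(0L, 0L, 0L, 0L, 1L, 1L)
), class = "data.frame", row.names = c(NA, -6L))

key <- data.frame(
  Column = c("Other_column1 ", "Other_column2", "Other_column3"),
  Value = c("B", "C", "D"),
  stringsAsFactors = FALSE
)

# Assumption: stray whitespace in the key is a typo, so trim it
# to make the key entries match the column names exactly.
key$Column <- trimws(key$Column)

# Matrix of the "Other" columns, arranged in key order.
other_mat <- as.matrix(test_df[, key$Column, drop = FALSE])

# For each row, the position of the first column equal to 1 (NA if none).
first_hit <- apply(other_mat == 1, 1, function(x) which(x)[1])

test_df <- test_df %>%
  mutate(new_column = case_when(
    Primary.column == 0 ~ "A",
    # Look up the category for the matching column in the key.
    Primary.column == 1 ~ key$Value[first_hit]
  ))
```

With this layout, adding more "Other" columns only means adding rows to the key; the case_when call itself stays at two conditions. Rows where the primary column is 1 but no "Other" column is 1 end up as NA, which matches the behaviour of the hand-written version.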