case_when in R: multi-condition logic

Posted on 2025-01-18 18:28:46

I'm trying to utilize case_when from dplyr to create a categorical vector in a dataframe, but I want to see if it can be done in a cleaner way.

Example dataframe

test_df <- structure(list(
  Primary.column = c(1L, 0L, 1L, 0L, 1L, 1L),
  Other_column1 = c(1L, 1L, 0L, 0L, 0L, 0L),
  Other_column2 = c(0L, 0L, 1L, 1L, 0L, 0L),
  Other_column3 = c(0L, 0L, 0L, 0L, 1L, 1L)
), class = "data.frame", row.names = c(NA, -6L))

When Primary.column is 0, the category should be A. When Primary.column is 1, I want to look across the columns whose names contain "Other", and whichever one holds a 1 determines the category.
Something like Primary.column == 1 & (whichever "Other" column has a 1 decides the category).
For the test dataframe, the logic is:

If Primary Column is 1 and Other_column1 is 1 then the new column should have value B
If Primary Column is 1 and Other_column2 is 1 then the new column should have value C
If Primary Column is 1 and Other_column3 is 1 then the new column should have value D

In the test dataframe this is easy to solve because there are very few columns:

library(dplyr)

# One case_when() branch per "Other" column; the first matching branch wins
test_df <- test_df %>%
  mutate(new_column = case_when(
    Primary.column == 0 ~ "A",
    Primary.column == 1 & Other_column1 == 1 ~ "B",
    Primary.column == 1 & Other_column2 == 1 ~ "C",
    Primary.column == 1 & Other_column3 == 1 ~ "D"
  ))

But the real dataframe has hundreds of "Other" columns, so this approach would mean hundreds of lines of code for a single variable, which is not what I want.

In this example I also have a key that tells me which value each "Other" column maps to when Primary.column is 1.

Key

key <- structure(list(
  Column = c("Other_column1 ", "Other_column2", "Other_column3"),
  Value = c("B", "C", "D")
), class = "data.frame", row.names = c(NA, -3L))

Is there a way to utilize the key to make it so that I don't write out 100 lines of messy code? Or are there alternative solutions to keep things clean?
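
One possible way to drive this from the key, sketched in base R rather than with case_when: treat the "Other" columns as a matrix, find the first column holding a 1 in each row, and look up that column's value in the key. This is a sketch under a few assumptions that are not stated in the question: the object names test_df and key are used for the two structures above, the trailing space in "Other_column1 " in the key is treated as unintended and trimmed, and when several "Other" columns hold a 1 the first one in key order wins (which mirrors how case_when picks the first matching branch).

```r
# Example data copied from the question
test_df <- data.frame(
  Primary.column = c(1L, 0L, 1L, 0L, 1L, 1L),
  Other_column1  = c(1L, 1L, 0L, 0L, 0L, 0L),
  Other_column2  = c(0L, 0L, 1L, 1L, 0L, 0L),
  Other_column3  = c(0L, 0L, 0L, 0L, 1L, 1L)
)

key <- data.frame(
  Column = c("Other_column1 ", "Other_column2", "Other_column3"),
  Value  = c("B", "C", "D"),
  stringsAsFactors = FALSE
)

# Assumption: the trailing space in the key's column name is a typo
key$Column <- trimws(key$Column)

# 0/1 integer matrix of hits, with columns in the same order as the key
hits <- (as.matrix(test_df[key$Column]) == 1) * 1L

# Index of the first column holding a 1 in each row
first_hit <- max.col(hits, ties.method = "first")

# Rows with no 1 in any "Other" column have nothing to look up
has_hit <- rowSums(hits) > 0

test_df$new_column <- ifelse(
  test_df$Primary.column == 0,
  "A",
  ifelse(has_hit, key$Value[first_hit], NA_character_)
)

test_df$new_column
# "B" "A" "C" "A" "D" "D"
```

Because the column names and values come from the key, the code stays the same length however many "Other" columns there are. Rows where Primary.column is 1 but no "Other" column holds a 1 get NA, as they would with the case_when version.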
