Mutate, if: if a column name occurs in the first column, replace with the value from a different data frame with the same name
I have two dataframes: one in which I would like to replace values (df_1), the other one from which I would like to obtain the values for replacement (df_2). Please consider the example data below:
Data
df_1 <- data.frame(
  var = c("xAp", "xBp", "sCp", "sABp", "dBCp", "dCBp"),
  A = NA,
  B = NA,
  C = NA)
df_2 <- data.frame(A = 1, B = 40, C = 25)
Desired action
If in df_1 the column name occurs in the first column, then I want to replace the value in that column and row by a value from df_2, the value that corresponds to this column name. So imagine cell df_1[1,2]. The column name is A. The value A occurs in the first column (in df_1[1,1]). This means I want to replace the NA value with the value that belongs to A in df_2, which is 1.
If the column name does not occur in the first column, I want it replaced by zero.
As I want to perform this action for every row, I have been thinking about a mutate combined with across. However, I am already stuck when trying to extract the column names and compare them to the values in the first column.
Expected output
data.frame(
  var = c("xAp", "xBp", "sCp", "sABp", "dBCp", "dCBp"),
  A = c(1, 0, 0, 1, 0, 0),
  B = c(0, 40, 0, 40, 40, 40),
  C = c(0, 0, 25, 0, 25, 25))
It would be great if someone can help out. Thanks!
Comments (1)
Here is one option - loop across the column names of 'df_2', create a condition on whether the current column name (cur_column()) occurs as a substring of the 'var' column, then return the value of 'df_2' for that corresponding column, or else return 0, in case_when. Then check the result against the OP's expected output.
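A minimal sketch of the approach described above, assuming dplyr is installed. The substring test uses base R's grepl() with fixed = TRUE, which is an assumption (other matching functions would work as well), and the name result is only illustrative:

```r
library(dplyr)

# Example data from the question
df_1 <- data.frame(
  var = c("xAp", "xBp", "sCp", "sABp", "dBCp", "dCBp"),
  A = NA,
  B = NA,
  C = NA)
df_2 <- data.frame(A = 1, B = 40, C = 25)

# For every column that also exists in df_2, test row by row whether the
# column name occurs literally inside 'var'. If it does, take that column's
# value from df_2, otherwise use 0.
result <- df_1 %>%
  mutate(across(all_of(names(df_2)),
                ~ case_when(
                  grepl(cur_column(), var, fixed = TRUE) ~ df_2[[cur_column()]],
                  TRUE ~ 0)))

result
```

The match is case-sensitive and literal, so a column named A only matches an uppercase "A" in var. Because all_of(names(df_2)) selects the columns by name, the same call should extend to additional columns as long as they are present in both data frames.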