Creating a function to identify missing values
I am trying to build a function as part of a larger function in R. Some of the pieces are working fine but others are not. Here is the piece of the code that is giving me issues.
This part of the function is designed to identify if a variable in a dataframe is missing, then generate a new variable which records if that specific case is missing or present. I want the new variable to have the suffix .zero (q1 becomes q1_zero, q2 becomes q2_zero, etc.). I can generate the suffix without any issues. Creating the new variable is causing some problems. Any insight would be greatly appreciated.
function1 <- function (x, data) {
  # new variable name
  temp <- paste (x, .zero, sep="", collapse = NULL)
  temp
  # is variable missing
  # I don't know if I should use this method or ifelse()
  data$temp [is.na (data$x)]<- 0
  data$temp [!is.na (data$x)]<- 1
  return (data$temp)
}
Comments (1)
You've got a few issues:

- .zero isn't defined; you want the quoted string ".zero".
- You can't use $ with column names stored in strings. You need to use data[[temp]], not data$temp. There's a related R FAQ on this if you want to read more.

We can also make some simplifications: paste0() is a shortcut for paste(sep = ""), and as.integer(!is.na(data[[x]])) is a cleaner and more efficient way to create your values.

Putting this all together:
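A minimal sketch of how those fixes might combine, assuming x is passed as a column name in a string and data is a data frame. It keeps the asker's function1 name and .zero suffix, and it returns the whole data frame rather than only the new column, which is an assumption on my part. The data frame df and its values are made up purely for illustration.

```r
# Sketch only: x is assumed to be a column name (a string), data a data frame.
function1 <- function(x, data) {
  # Build the new column name, e.g. "q1" -> "q1.zero"
  temp <- paste0(x, ".zero")
  # [[ ]] looks the column up by the string stored in temp / x;
  # the result is 1 where the value is present and 0 where it is NA
  data[[temp]] <- as.integer(!is.na(data[[x]]))
  return(data)
}

# Hypothetical example data
df <- data.frame(q1 = c(1, NA, 3), q2 = c(NA, "b", "c"))
df <- function1("q1", df)
df$q1.zero
# 1 0 1
```

Because the column name is built at run time, data[[temp]] is what makes the assignment land in q1.zero instead of a column literally named temp.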
I'd add a little commentary to say that the .zero suffix is not particularly informative about whether or not a value is missing. A better suffix might be something like .present: a 1 indicates the value is present, a 0 indicates it is not.

Similarly, function1 is an absolutely terrible name for a function. Use descriptive names; add_present_column would be a much better name. (It's often nice to give functions names that are verbs.)

Since I see Konrad editing the question, I'll also mention that return() isn't needed in R functions. The value of the last expression evaluated in the function is returned, and stylistically many would prefer that the last line of the function just be data, not return(data).
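With those naming and style suggestions applied, the function might look like the sketch below. It assumes the same inputs as before (a column name given as a string, plus a data frame). The survey data frame and the loop over several columns are hypothetical additions for illustration only.

```r
# Sketch only: adds a <name>.present indicator column for one variable.
add_present_column <- function(x, data) {
  # 1 where the value in column x is present, 0 where it is NA
  data[[paste0(x, ".present")]] <- as.integer(!is.na(data[[x]]))
  # The last evaluated expression is the return value, so no return() is needed
  data
}

# Hypothetical example: flag several columns in turn
survey <- data.frame(q1 = c(1, NA, 3), q2 = c(NA, 5, 6))
for (col in c("q1", "q2")) {
  survey <- add_present_column(col, survey)
}
names(survey)
# "q1" "q2" "q1.present" "q2.present"
```

Returning the modified data frame, rather than just the new vector, is what lets the function be called repeatedly like this.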