Create a function to iterate over columns and create a new column for each iteration in R
On occasion I get survey data with Likert scale string items that I need to change to numeric in order to calculate basic descriptive statistics. To do this, I usually use the case_when function to create a new column for each item and assign each data point a numeric value. I am trying to write a function that can do this for many different columns all at once, so that I don't have to keep copying and pasting code. I am relatively new to this, so any help would be appreciated :)
Here is what I have done previously in R:
library(dplyr)  # provides case_when
# create data frame
df <- data.frame(v1 = c("Definitely True", "Somewhat True", "Somewhat False", "Definitely False"),
                 v2 = c("Definitely False", "Somewhat False", "Somewhat True", "Definitely True"))
# use case_when to add numeric columns to the data frame
df$v1n <- case_when((df$v1 == "Definitely True")==TRUE ~ "1",
                    (df$v1 == "Somewhat True")==TRUE ~ "2",
                    (df$v1 == "Somewhat False")==TRUE ~ "3",
                    (df$v1 == "Definitely False")==TRUE ~ "4")
df$v2n <- case_when((df$v2 == "Definitely True")==TRUE ~ "1",
                    (df$v2 == "Somewhat True")==TRUE ~ "2",
                    (df$v2 == "Somewhat False")==TRUE ~ "3",
                    (df$v2 == "Definitely False")==TRUE ~ "4")
This works if I want to replace each string value with a numeric value and overwrite data in the existing columns:
for(i in colnames(data_x)) {
  data_x[[i]] <- case_when((data_x[,i] == "Definitely True")==TRUE ~ "1",
                           (data_x[,i] == "Somewhat True")==TRUE ~ "2",
                           (data_x[,i] == "Somewhat False")==TRUE ~ "3",
                           (data_x[,i] == "Definitely False")==TRUE ~ "4")
}
But I would like to find a way to create a new column for each iteration as I did with the copy and paste version. Here is something I have tried but I haven't had any success. Any help on this would be appreciated.
for(i in colnames(df)) {
  df[[var[i]]] <- case_when((df[,i] == "Definitely True")==TRUE ~ "1",
                            (df[,i] == "Somewhat True")==TRUE ~ "2",
                            (df[,i] == "Somewhat False")==TRUE ~ "3",
                            (df[,i] == "Definitely False")==TRUE ~ "4")
}
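For reference, the attempt above most likely fails because var is not a lookup of column names: unless an object called var has been defined, it refers to R's variance function, and var[i] stops with an "object of type 'closure' is not subsettable" error. A minimal sketch that keeps the same loop but builds the new column name with paste0 (new_name is a hypothetical helper variable, and the outputs stay character strings to match the copy-and-paste version):

```r
library(dplyr)

df <- data.frame(
  v1 = c("Definitely True", "Somewhat True", "Somewhat False", "Definitely False"),
  v2 = c("Definitely False", "Somewhat False", "Somewhat True", "Definitely True")
)

# colnames(df) is evaluated once, before the loop starts, so the columns
# added inside the loop are not iterated over themselves.
for (i in colnames(df)) {
  new_name <- paste0(i, "n")  # "v1" -> "v1n", "v2" -> "v2n"
  df[[new_name]] <- case_when(
    df[[i]] == "Definitely True"  ~ "1",
    df[[i]] == "Somewhat True"    ~ "2",
    df[[i]] == "Somewhat False"   ~ "3",
    df[[i]] == "Definitely False" ~ "4"
  )
}
```

Wrapping the case_when call in as.integer(), or using 1L to 4L on the right-hand sides, would give numeric columns ready for descriptive statistics.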
Comments (2)
dplyr

across gives us the ability to do one thing across multiple columns. We can use v1:v2 syntax, or one of the other dplyr selector functions such as matches, starts_with, etc.

The function given to across is a tilde-function (rlang-style), inside which . is replaced with the column data on each iteration. For instance, the first time this tilde-function is evaluated, the . refers to the vector df$v1.

Because the default action of mutate(across(...)) is to replace the columns, I add .names= to control the naming of the resulting columns. This notation uses glue syntax, where {.col} is replaced by the name of the column being evaluated in each iteration.

base R

I'll add the optional use of a lookup map.
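The code that accompanied this answer is not preserved on this page. The following is a minimal sketch of the two variants described above, reconstructed as an assumption rather than the answerer's exact code. The names df_dplyr, df_base, likert_map and cols are hypothetical, and the recoded values are integers rather than the strings used in the question.

```r
library(dplyr)

df <- data.frame(
  v1 = c("Definitely True", "Somewhat True", "Somewhat False", "Definitely False"),
  v2 = c("Definitely False", "Somewhat False", "Somewhat True", "Definitely True")
)

# dplyr: apply one recoding to v1 through v2; .names = "{.col}n" writes the
# results to new columns v1n and v2n instead of overwriting v1 and v2.
df_dplyr <- df %>%
  mutate(across(
    v1:v2,
    ~ case_when(
      . == "Definitely True"  ~ 1L,
      . == "Somewhat True"    ~ 2L,
      . == "Somewhat False"   ~ 3L,
      . == "Definitely False" ~ 4L
    ),
    .names = "{.col}n"
  ))

# base R: a named vector serves as the lookup map; indexing it by the
# response strings returns the matching numbers.
likert_map <- c("Definitely True" = 1L, "Somewhat True" = 2L,
                "Somewhat False" = 3L, "Definitely False" = 4L)
cols <- c("v1", "v2")
df_base <- df
df_base[paste0(cols, "n")] <- lapply(
  df_base[cols],
  function(x) unname(likert_map[as.character(x)])  # as.character guards against factor input
)
```

In both variants a response that is not in the mapping becomes NA, which makes unexpected spellings easy to spot before computing statistics.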
I'd be inclined to do this differently. If you convert the Likert scale columns to factor, with the levels in the correct order, you can use as.integer(...) to get the numeric levels directly, without all this case_when(...) business.

Here's an example using data.table
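The original example is not preserved on this page. As a stand-in, here is a minimal sketch of the factor approach, assuming the data.table package is installed. The names dt, likert_levels, cols and new_cols are hypothetical and not taken from the answer.

```r
library(data.table)

dt <- data.table(
  v1 = c("Definitely True", "Somewhat True", "Somewhat False", "Definitely False"),
  v2 = c("Definitely False", "Somewhat False", "Somewhat True", "Definitely True")
)

# Levels listed in scoring order: the first level becomes 1, the last becomes 4.
likert_levels <- c("Definitely True", "Somewhat True",
                   "Somewhat False", "Definitely False")

cols <- c("v1", "v2")
new_cols <- paste0(cols, "n")

# := adds v1n and v2n by reference; as.integer() on a factor returns the
# position of each value in the level order.
dt[, (new_cols) := lapply(.SD, function(x) as.integer(factor(x, levels = likert_levels))),
   .SDcols = cols]
```

Because the scoring lives in the order of likert_levels, reversing the scale only requires reversing that vector.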