Create a function to iterate over columns and create a new column for each iteration in R

Posted 2025-02-01 10:23:34

On occasion I get survey data with Likert-scale string items that I need to convert to numeric in order to calculate basic descriptive statistics. To do this, I usually use the case_when function to create a new column for each item and assign each data point a numeric value. I am trying to write a function that can do this for many columns at once, so that I don't have to keep copying and pasting code. I am relatively new to this, so any help would be appreciated :)

Here is what I have done previously in R:

library(dplyr)  # provides case_when()

# Create the data frame
df <- data.frame(v1 = c("Definitely True", "Somewhat True","Somewhat False","Definitely False"),
                 v2 = c("Definitely False","Somewhat False","Somewhat True","Definitely True"))

#Use case_when to add numeric columns to dataframe
df$v1n <- case_when((df$v1 == "Definitely True")==TRUE ~ "1",
                         (df$v1 == "Somewhat True")==TRUE ~ "2",
                         (df$v1 == "Somewhat False")==TRUE ~ "3",
                         (df$v1 == "Definitely False")==TRUE ~ "4")
df$v2n <- case_when((df$v2 == "Definitely True")==TRUE ~ "1",
                         (df$v2 == "Somewhat True")==TRUE ~ "2",
                         (df$v2 == "Somewhat False")==TRUE ~ "3",
                         (df$v2 == "Definitely False")==TRUE ~ "4")

This works if I want to replace each string value with a numeric value and overwrite data in the existing columns:

for(i in colnames(data_x)) {
  data_x[[i]] <- case_when((data_x[,i] == "Definitely True")==TRUE ~ "1",
                         (data_x[,i] == "Somewhat True")==TRUE ~ "2",
                         (data_x[,i] == "Somewhat False")==TRUE ~ "3",
                         (data_x[,i] == "Definitely False")==TRUE ~ "4")
}

But I would like to find a way to create a new column for each iteration, as I did with the copy-and-paste version. Here is something I have tried, without success. Any help on this would be appreciated.

for(i in colnames(df)) {
  df[[var[i]]] <- case_when((df[,i] == "Definitely True")==TRUE ~ "1",
                         (df[,i] == "Somewhat True")==TRUE ~ "2",
                         (df[,i] == "Somewhat False")==TRUE ~ "3",
                         (df[,i] == "Definitely False")==TRUE ~ "4")
}

冰雪梦之恋 2025-02-08 10:23:34

dplyr

library(dplyr)

df %>%
  mutate(across(v1:v2, ~ case_when(
    . == "Definitely True" ~ "1", 
    . == "Somewhat True" ~ "2", 
    . == "Somewhat False" ~ "3", 
    TRUE ~ "4"
    ), .names = "{.col}n")
  )
#                 v1               v2 v1n v2n
# 1  Definitely True Definitely False   1   4
# 2    Somewhat True   Somewhat False   2   3
# 3   Somewhat False    Somewhat True   3   2
# 4 Definitely False  Definitely True   4   1

across() lets us apply one operation to multiple columns. We can select them with the v1:v2 range syntax, or with one of the other dplyr selector functions such as matches, starts_with, etc.
The second argument to across here is a tilde-function (rlang-style), inside which . is replaced with the column data on each iteration. For instance, the first time this tilde-function is evaluated, . refers to the vector df$v1.
Because the default action of mutate(across(...)) is to replace the columns, I add .names= to control the naming of the resulting columns. This notation uses glue syntax, where {.col} is replaced by the name of the column being evaluated in each iteration.
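
As a variation on the points above, here is a minimal sketch that combines a selector helper with a named lookup vector, so the new columns come out as integers rather than character strings. The names likert_map and df_scored are made up for this illustration and are not part of the answer. Unlike the TRUE ~ "4" catch-all, labels that are missing from the map become NA here.

```r
library(dplyr)

df <- data.frame(
  v1 = c("Definitely True", "Somewhat True", "Somewhat False", "Definitely False"),
  v2 = c("Definitely False", "Somewhat False", "Somewhat True", "Definitely True"),
  stringsAsFactors = FALSE
)

# Hypothetical mapping from response label to integer score
likert_map <- c("Definitely True" = 1L, "Somewhat True" = 2L,
                "Somewhat False" = 3L, "Definitely False" = 4L)

df_scored <- df %>%
  mutate(across(
    starts_with("v"),                        # selector helper instead of v1:v2
    ~ unname(likert_map[as.character(.x)]),  # labels not in the map become NA
    .names = "{.col}n"                       # v1 -> v1n, v2 -> v2n
  ))

# The new columns are integers, so descriptive statistics work directly
mean(df_scored$v1n)
```

Looking the value up by name keeps the mapping in one place, which may be easier to maintain than a long case_when when the scale changes.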

base R

I'll add the optional use of a lookup map.

lookup <- c("Definitely True" = "1", "Somewhat True" = "2", "Somewhat False" = "3", "Definitely False" = "4")
df <- cbind(df, setNames(lapply(df[,1:2], function(z) lookup[z]), paste0(names(df[,1:2]), "n")))
rownames(df) <- NULL
df
#                 v1               v2 v1n v2n
# 1  Definitely True Definitely False   1   4
# 2    Somewhat True   Somewhat False   2   3
# 3   Somewhat False    Somewhat True   3   2
# 4 Definitely False  Definitely True   4   1
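
For completeness, the loop from the question can also be made to work in base R. The attempt there most likely fails because var is never defined as a vector, so R resolves var to the built-in variance function and var[i] cannot be subset. The sketch below builds the new column name with paste0 instead. The names source_cols and new_col are assumptions for this illustration, and the lookup stores integers rather than strings so the result is numeric.

```r
df <- data.frame(
  v1 = c("Definitely True", "Somewhat True", "Somewhat False", "Definitely False"),
  v2 = c("Definitely False", "Somewhat False", "Somewhat True", "Definitely True"),
  stringsAsFactors = FALSE
)

# Integer-valued lookup (the answer above stores the scores as strings)
lookup <- c("Definitely True" = 1L, "Somewhat True" = 2L,
            "Somewhat False" = 3L, "Definitely False" = 4L)

# Name the source columns explicitly so the loop is not tied to
# whatever columns the data frame happens to contain
source_cols <- c("v1", "v2")

for (col in source_cols) {
  new_col <- paste0(col, "n")                 # "v1" -> "v1n"
  df[[new_col]] <- unname(lookup[df[[col]]])  # labels not in lookup become NA
}

# Basic descriptive statistics on the new numeric columns
colMeans(df[paste0(source_cols, "n")])
```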


在你怀里撒娇 2025-02-08 10:23:34

I'd be inclined to do this differently. If you convert the Likert scale columns to factor, with levels in the correct order, you can use as.integer(...) to get the numeric levels directly, without all this case_when(...) business.

Here's an example using data.table:

library(data.table)
likertScale <- c("Definitely True", "Somewhat True","Somewhat False","Definitely False")
cols        <- names(df)
setDT(df)[, c(cols):=lapply(.SD, factor, levels=likertScale)]
df[, paste0(cols, 'n'):=lapply(.SD, as.integer), .SDcols=cols]
df
##                  v1               v2 v1n v2n
## 1:  Definitely True Definitely False   1   4
## 2:    Somewhat True   Somewhat False   2   3
## 3:   Somewhat False    Somewhat True   3   2
## 4: Definitely False  Definitely True   4   1
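
The same factor idea can be sketched without data.table, for readers who prefer base R. This is an illustrative variant rather than the answerer's code, and the names likert_levels and cols are assumptions. It leaves the original string columns unchanged and only adds the integer columns.

```r
likert_levels <- c("Definitely True", "Somewhat True",
                   "Somewhat False", "Definitely False")

df <- data.frame(
  v1 = c("Definitely True", "Somewhat True", "Somewhat False", "Definitely False"),
  v2 = c("Definitely False", "Somewhat False", "Somewhat True", "Definitely True"),
  stringsAsFactors = FALSE
)

cols <- c("v1", "v2")

# factor() with explicit levels fixes the order; as.integer() returns the level index
df[paste0(cols, "n")] <- lapply(
  df[cols],
  function(x) as.integer(factor(x, levels = likert_levels))
)

# A label outside the defined levels becomes NA instead of a wrong score
as.integer(factor("Maybe", levels = likert_levels))
```

Because the score is simply the position of the label in likert_levels, reversing the scale only requires reversing that vector.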
