Creating a new variable in a dataset from another variable

Posted on 2025-01-11 16:40:46

I would like to know how to create a new variable in a dataset whose values depend on the values of an existing variable. I have a variable called "age" that contains people's ages as integers. I want to create a variable called "education" in the same dataset, so that if age is less than 7, education takes the value "primary education", and if age is between 7 and 12, education takes the value "secondary education". Any idea how I can do this?

I tried something like the following, but I do not get the result I want:

if ((df$age) < 7){
  df$education="primary education"
}

清风夜微凉 2025-01-18 16:40:46

With base R, you can use the ifelse function, which evaluates the condition element by element:

df$education <- ifelse(df$age < 7, "primary education", "secondary education")

You can nest ifelse statements to obtain more levels (although not very elegantly):

df$education <- ifelse(df$age < 7, "primary education",
                       ifelse(df$age >= 7 & df$age < 12, "secondary education", "other"))
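
When the categories are contiguous ranges of a numeric variable, base R's cut() may be a way to avoid the nesting. The following is only a minimal sketch: the toy data frame and its ages are invented for illustration, and the break points (7 and 12) and the "other" label for ages 12 and up are assumptions carried over from the example above.

```r
# Toy data; the ages are invented for illustration
df <- data.frame(age = c(5, 7, 11, 12, 30))

# right = FALSE makes each interval closed on the left:
# [-Inf, 7), [7, 12), [12, Inf)
df$education <- cut(df$age,
                    breaks = c(-Inf, 7, 12, Inf),
                    labels = c("primary education", "secondary education", "other"),
                    right = FALSE)

# cut() returns a factor; convert it if a character column is preferred
df$education <- as.character(df$education)

print(df)
```

Adding another category then only requires one more break point and one more label, rather than another nested call.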

雨的味道风的声音 2025-01-18 16:40:46

Here is a solution using the tidyverse, more specifically the dplyr package:

library(dplyr)

# create example data by sampling random ages
df <- data.frame(age = sample(x = 5:21, size = 100, replace = TRUE))

# classify age into education col
df <- dplyr::mutate(df, education = dplyr::case_when(age < 7 ~ "primary education", 
                                                     age >= 7 & age < 12 ~ "secondary education", 
                                                     age >= 12 ~ "other"))
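
Because case_when() checks its conditions in order and uses the first one that matches, the lower bounds can be left out. The following is a minimal sketch on a small hand-made data frame (the ages are invented so the result is easy to check by eye). Note one difference from the version above: the TRUE fallback would also label a missing age as "other", whereas the explicit age >= 12 condition would leave it as NA.

```r
library(dplyr)

# Toy data; the ages are invented for illustration
df <- data.frame(age = c(5, 7, 11, 12, 30))

df <- df %>%
  mutate(education = case_when(
    age < 7  ~ "primary education",
    age < 12 ~ "secondary education",  # reached only when age >= 7
    TRUE     ~ "other"                 # everything else, including NA
  ))

print(df)
```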
