How do I split one variable into two variables in R?

Posted on 2025-01-18 20:30:02

I have a variable x which can take five values (0, 1, 2, 3, 4). I want to divide it into two variables: variable 1 is supposed to cover the value 0, and variable 2 is supposed to cover the values 1, 2, 3 and 4.
I'm sure this is easy, but I can't work out what I need to do.

What my data looks like:

|variable x|
|-----------|
|0|
|1|
|0|
|4|
|3|
|0|
|0|
|2|

So I get the table:

|0|1|2|3|4|
|---|---|---|---|---|
|125|34|14|15|15|

But I want my data to look like this:

|variable 1|variable 2|
|----------|----------|
|125|78|

So variable 1 is supposed to contain how often 0 occurs in my data, and variable 2 is supposed to contain the sum of how often 1, 2, 3 and 4 occur in my data.

Comments (4)

深爱成瘾 2025-01-25 20:30:02

You can convert the variable to a logical vector by testing whether x == 0:

x <- c(0, 1, 0, 4, 3, 0, 0, 2)

table(x)
#> x
#> 0 1 2 3 4 
#> 4 1 1 1 1 

table(x == 0)
#> FALSE  TRUE 
#>     4     4 

If you want the exact headings, you can do:

# table(x != 0) lists the zero count (FALSE) first, so the names line up with the counts
setNames(table(x != 0), c(0, paste(sort(unique(x[x != 0])), collapse = ",")))
#>     0   1,2,3,4 
#>     4         4 

And if you want to change the variable to a factor you could do:

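# wrap the recoded values in factor() and a data frame, matching the output below
data.frame(x = factor(c("zero", "not zero")[1 + (x != 0)]))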
#>          x
#> 1     zero
#> 2 not zero
#> 3     zero
#> 4 not zero
#> 5 not zero
#> 6     zero
#> 7     zero
#> 8 not zero

Created on 2022-04-02 by the reprex package (v2.0.1)
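
If the goal is two standalone objects rather than a relabelled table, the same logical test can be summed directly. This is only a minimal sketch and not part of the answer above; the names `variable1` and `variable2` are assumptions chosen to match the question, and it assumes x contains no NA values.

```r
x <- c(0, 1, 0, 4, 3, 0, 0, 2)

# sum() on a logical vector counts the TRUE values
variable1 <- sum(x == 0)  # how often 0 occurs
variable2 <- sum(x != 0)  # how often 1, 2, 3 or 4 occur (assumes no NA in x)

variable1
#> [1] 4
variable2
#> [1] 4
```

With missing values present, `sum(x == 0, na.rm = TRUE)` would be the safer call.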

纵性 2025-01-25 20:30:02

base R

You can use cbind:

# no seed is set, so the counts below will differ between runs
x = sample(0:5, 200, replace = TRUE)
table(x)
# x
#  0  1  2  3  4  5 
# 29 38 41 35 27 30

cbind(`0` = table(x)[1], `1,2,3,4` = sum(table(x)[2:5]))
#    0 1,2,3,4
# 0 29     141

tidyverse

library(tidyverse)
ta = as.data.frame(t(as.data.frame.array(table(x))))
ta %>% 
  mutate(!!paste(names(.[-1]), collapse = ",") := sum(c_across(`1`:`5`)), .keep = "unused")

#    0 1,2,3,4,5
# 1 29       171
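
Another way to reach the same two-group count is to recode x before tabulating, instead of summing table cells afterwards. The following is a sketch under stated assumptions: it samples from 0:4 (the range in the question, not the 0:5 used above), the seed is arbitrary, and `grp` and `counts` are illustrative names.

```r
set.seed(1)  # assumption: arbitrary seed, only to make the sketch repeatable
x <- sample(0:4, 200, replace = TRUE)

# Recode: 0 stays "0", every other value is labelled "1,2,3,4"
grp <- factor(ifelse(x == 0, "0", "1,2,3,4"), levels = c("0", "1,2,3,4"))

# One count per group, with the zero group listed first
counts <- table(grp)
counts

# The grouped counts agree with direct counts on x
stopifnot(counts["0"] == sum(x == 0),
          counts["1,2,3,4"] == sum(x %in% 1:4))
```

Fixing the factor levels explicitly keeps the column order stable even if one of the groups happens to be empty.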

指尖上的星空 2025-01-25 20:30:02

Beginning with the vector, we can get the frequency from table then put it into a dataframe. Then, we can create a new column with the names collapsed (i.e., 1,2,3,4) and get the row sum for all columns except the first one.

library(tidyverse)

tab <- data.frame(value=c(0, 1, 2, 3, 4), 
              freq=c(125,   34, 14, 15, 15))
x <- rep(tab$value, tab$freq)

output <- data.frame(rbind(table(x))) %>%
  rename_with(~str_remove(., 'X')) %>%
  mutate(!!paste0(names(.)[-1], collapse = ",") := rowSums(select(., -1))) %>%
  select(1, last_col())

Output

    0 1,2,3,4
1 125      78

Then, to create the 2 variables in 2 dataframes, you can split the columns into a list, change the names, then put into the global environment.

list2env(setNames(
  split.default(output, seq_along(output)),
  c("variable 1", "variable 2")
), envir = .GlobalEnv)
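
Because the names "variable 1" and "variable 2" contain a space, the objects created this way have to be retrieved with `get()` or with backticks. The sketch below illustrates that; it rebuilds `output` by hand so it runs on its own, and it writes into a fresh environment called `env` (an assumption made here to avoid touching the global environment).

```r
# Rebuild the one-row result by hand so this block runs on its own
output <- data.frame(`0` = 125, `1,2,3,4` = 78, check.names = FALSE)

# Same split as above, but into a fresh environment instead of .GlobalEnv
env <- new.env()
list2env(setNames(
  split.default(output, seq_along(output)),
  c("variable 1", "variable 2")
), envir = env)

# Names containing a space are non-syntactic: use get() or backticks
v1 <- get("variable 1", envir = env)
v2 <- env$`variable 2`

v1
v2
```

Each retrieved object is a one-column data frame that keeps its original column name ("0" or "1,2,3,4").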

Or you could just subset:

variable1 <- data.frame(`variable 1` = output$`0`, check.names = FALSE)
variable2 <- data.frame(`variable 2` = output$`1,2,3,4`, check.names = FALSE)

¢蛋碎的人ぎ生 2025-01-25 20:30:02

Update: deleted first answer:

df[paste(names(df[2:5]), collapse = ",")] <- rowSums(df[2:5])
df[, c(1,6)]
# A tibble: 1 × 2
    `0` `1,2,3,4`
  <dbl>     <dbl>
1   125        78

data:

df <- structure(list(`0` = 125, `1` = 34, `2` = 14, `3` = 15, `4` = 15), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -1L))
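
The answer above starts from an already-tabulated `df`. As a hedged sketch, one way to build such a one-row frame from the raw vector in base R is shown below. It uses a plain data frame rather than a tibble, and the vector `x` is reconstructed from the frequencies given in the question; both choices are assumptions for illustration.

```r
# Raw data reconstructed from the frequencies in the question
x <- rep(0:4, times = c(125, 34, 14, 15, 15))

# rbind() turns the table into a 1-row matrix;
# check.names = FALSE keeps the column names "0".."4" unchanged
df <- data.frame(rbind(table(x)), check.names = FALSE)

# Same collapse as in the answer: sum columns 2 to 5 into one new column
df[paste(names(df[2:5]), collapse = ",")] <- rowSums(df[2:5])

# Keep the zero count and the combined count
df[, c(1, 6)]
```

The two remaining columns should hold the counts 125 and 78 that the question asks for.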