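How to create a new column in R that calculates the mean of a numeric variable grouped by variables (factors), over the 3 previous grouped rounds?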

Posted on 2025-01-14 06:45:36

I have tried using rollapply but I can't get the desired result.

Here is a sample of the dataset columns I want to run the calculation on:

structure(list(LeagueROUND = structure(c(1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 4L), .Label = c("1", "2", 
"3", "4"), class = "factor"), League = structure(c(1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "Portugal2", class = "factor"), 
    Season = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L), .Label = "2021/2022", class = "factor"), 
    DRAWmarginODDS = structure(c(2L, 1L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L), .Label = c("No", "Yes"), class = "factor"), 
    DRAWnumODDS = c(0L, NA, 0L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 
    0L, 0L, NA, 0L)), .Names = c("LeagueROUND", "League", "Season", 
"DRAWmarginODDS", "DRAWnumODDS"), class = "data.frame", row.names = c(NA, 
-15L))

Desired result

Group by LeagueROUND, League, Season and DRAWmarginODDS.

Compute the mean of DRAWnumODDS over the 3 previous LeagueROUNDs.

That is:

League Round 1 (Yes) has a DRAWnumODDS total of 1 across 3 grouped rows.

League Round 2 (Yes) has a DRAWnumODDS total of 2 across 4 grouped rows.

League Round 3 (Yes) has a DRAWnumODDS total of 0 across 3 grouped rows.

Desired:

In League Round 4 (Yes), the mean over the 3 previous League Rounds is a DRAWnumODDS total of 3 across 10 grouped rows = 0.3.

Rows with DRAWmarginODDS = No should get NA.

The first 3 LeagueROUNDs should get NA.
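
To make the target number concrete, here is a minimal base R check of the arithmetic described above. The data frame `df` is a reduced copy of the sample (only the three columns involved), with LeagueROUND stored as an integer rather than a factor; both the name `df` and that simplification are assumptions made for illustration.

```r
# Reduced copy of the sample data: only the columns used in the calculation
df <- data.frame(
  LeagueROUND    = rep(1:4, times = c(4, 4, 3, 4)),
  DRAWmarginODDS = c("Yes", "No", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes",
                     "Yes", "Yes", "Yes", "Yes", "Yes", "No", "Yes"),
  DRAWnumODDS    = c(0L, NA, 0L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, NA, 0L)
)

# "Yes" rows from rounds 1 to 3, i.e. the 3 rounds preceding round 4
prev <- df[df$DRAWmarginODDS == "Yes" & df$LeagueROUND %in% 1:3, ]

sum(prev$DRAWnumODDS)   # 3
nrow(prev)              # 10
mean(prev$DRAWnumODDS)  # 0.3
```

So the value wanted on the round 4 "Yes" rows is the pooled mean over all rows of the 3 previous rounds, not the mean of the three per-round means.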

海夕 2025-01-21 06:45:36
library(tidyverse)

data <- structure(list(
  LeagueROUND = structure(c(
    1L, 1L, 1L, 1L, 2L,
    2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 4L
  ), .Label = c(
    "1", "2",
    "3", "4"
  ), class = "factor"), League = structure(c(
    1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
  ), .Label = "Portugal2", class = "factor"),
  Season = structure(c(
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 1L
  ), .Label = "2021/2022", class = "factor"),
  DRAWmarginODDS = structure(c(
    2L, 1L, 2L, 2L, 2L, 2L, 2L,
    2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L
  ), .Label = c("No", "Yes"), class = "factor"),
  DRAWnumODDS = c(
    0L, NA, 0L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 0L,
    0L, 0L, NA, 0L
  )
), .Names = c(
  "LeagueROUND", "League", "Season",
  "DRAWmarginODDS", "DRAWnumODDS"
), class = "data.frame", row.names = c(
  NA,
  -15L
))


data %>%
  mutate(LeagueROUND = as.integer(LeagueROUND)) %>%
  group_by(DRAWmarginODDS, LeagueROUND) %>%
  summarise(DRAWnumODDS = sum(DRAWnumODDS, na.rm = TRUE)) %>%
  ungroup() %>%
  filter(DRAWmarginODDS == "Yes") %>%
  arrange(LeagueROUND) %>%
  mutate(
    n_observations = LeagueROUND %>% map_int(~ {
      data %>%
        mutate(LeagueROUND = as.integer(LeagueROUND)) %>%
        filter(LeagueROUND < .x & DRAWmarginODDS == "Yes") %>%
        nrow()
    }),
    mean_last_3_DRAWnumODDS = (lag(DRAWnumODDS, 1) + lag(DRAWnumODDS, 2) + lag(DRAWnumODDS, 3)) / n_observations
  ) %>%
  mutate(across(everything(), as.character)) %>%
  right_join(data %>% mutate(across(everything(), as.character))) %>%
  type_convert()
#> `summarise()` has grouped output by 'DRAWmarginODDS'. You can override using the `.groups` argument.
#> Joining, by = c("DRAWmarginODDS", "LeagueROUND", "DRAWnumODDS")
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#>   DRAWmarginODDS = col_character(),
#>   LeagueROUND = col_double(),
#>   DRAWnumODDS = col_double(),
#>   n_observations = col_double(),
#>   mean_last_3_DRAWnumODDS = col_double(),
#>   League = col_character(),
#>   Season = col_character()
#> )
#> # A tibble: 15 × 7
#>    DRAWmarginODDS LeagueROUND DRAWnumODDS n_observations mean_last_3_DRA… League
#>    <chr>                <dbl>       <dbl>          <dbl>            <dbl> <chr> 
#>  1 Yes                      1           1              0             NA   Portu…
#>  2 Yes                      3           0              7             NA   Portu…
#>  3 Yes                      3           0              7             NA   Portu…
#>  4 Yes                      3           0              7             NA   Portu…
#>  5 Yes                      4           0             10              0.3 Portu…
#>  6 Yes                      4           0             10              0.3 Portu…
#>  7 Yes                      4           0             10              0.3 Portu…
#>  8 Yes                      1           0             NA             NA   Portu…
#>  9 No                       1          NA             NA             NA   Portu…
#> 10 Yes                      1           0             NA             NA   Portu…
#> 11 Yes                      2           0             NA             NA   Portu…
#> 12 Yes                      2           0             NA             NA   Portu…
#> 13 Yes                      2           1             NA             NA   Portu…
#> 14 Yes                      2           1             NA             NA   Portu…
#> 15 No                       4          NA             NA             NA   Portu…
#> # … with 1 more variable: Season <chr>

Created on 2022-03-15 by the reprex package (v2.0.0)
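
For comparison, the same pooled mean can probably be computed without the join, which avoids matching on DRAWnumODDS. What follows is a minimal base R sketch, not the answer's method: `prev3_mean` and the column `mean_prev3` are hypothetical names, LeagueROUND is assumed to be an integer numbered consecutively from 1 (convert the factor with `as.integer(as.character(...))` first), and a window containing any NA is assumed to yield NA.

```r
# Sample data with LeagueROUND as integer (assumption: rounds are numbered 1, 2, 3, ...)
df <- data.frame(
  LeagueROUND    = rep(1:4, times = c(4, 4, 3, 4)),
  League         = "Portugal2",
  Season         = "2021/2022",
  DRAWmarginODDS = c("Yes", "No", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes",
                     "Yes", "Yes", "Yes", "Yes", "Yes", "No", "Yes"),
  DRAWnumODDS    = c(0L, NA, 0L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, NA, 0L)
)

# Hypothetical helper: for each row, the pooled mean of `value` over all rows
# belonging to the k rounds before that row's round
prev3_mean <- function(rnd, value, k = 3) {
  vapply(rnd, function(r) {
    if (r <= k) return(NA_real_)                 # fewer than k earlier rounds
    window <- value[rnd >= r - k & rnd < r]      # rows of the k previous rounds
    if (length(window) == 0 || anyNA(window)) return(NA_real_)
    mean(window)
  }, numeric(1))
}

# Apply the helper separately within each League / Season / DRAWmarginODDS group
groups <- split(seq_len(nrow(df)),
                list(df$League, df$Season, df$DRAWmarginODDS),
                drop = TRUE)
df$mean_prev3 <- NA_real_
for (idx in groups) {
  df$mean_prev3[idx] <- prev3_mean(df$LeagueROUND[idx], df$DRAWnumODDS[idx])
}

df[df$LeagueROUND == 4, c("DRAWmarginODDS", "DRAWnumODDS", "mean_prev3")]
```

On the sample data this gives 0.3 on the three round 4 "Yes" rows and NA everywhere else, including the "No" rows, whose DRAWnumODDS is NA. Whether a window containing NA should give NA or drop the missing values is a design choice the question does not settle.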