Merging one-to-many data in R for categorical variables

Posted on 2025-02-06 08:31:10

I have two datasets: one of accidents and one of the vehicles involved in them. A single accident can involve more than one vehicle. When merging the two, I need exactly one row per accident, so I am trying to work out a strategy for collapsing the vehicle rows. For numerical vehicle variables (age of driver, engine power, etc.) I plan to take the average. But how exactly can I merge 3-4 rows of a categorical variable into one row? For example, if an accident has two male and two female drivers, which value should I choose if I am using a frequency-based rule?

Comments (1)

七月上 2025-02-13 08:31:10

Something like this?

library(tidyverse)

vehicles <- tribble(
  ~vehicle_id, ~owner, ~age, ~sex,
  1, "A", 18, "male",
  2, "B", 45, "male",
  3, "C", 38, "female"
)

accidents <- tribble(
  ~accident_id, ~vehicle_id,
  1, 1,
  1, 2,
  2, 1,
  3, 3
)

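results <-
  accidents %>%
  # attach the vehicle attributes to each accident-vehicle row
  left_join(vehicles) %>%
  group_by(accident_id) %>%
  # collapse to one row per accident; categorical values are kept as list-columns
  summarise(
    vehicle_ids = list(vehicle_id),
    owners = list(owner),
    mean_age = mean(age),
    sexes = list(sex)
  ) %>%
  # count each sex within an accident, as a small name/value tibble
  mutate(
    n_sexes = sexes %>% map(~ .x %>%
      table() %>%
      enframe() %>%
      mutate(value = value %>% as.numeric()))
  ) %>%
  # spread the counts into one column per category, filling absent categories with 0
  unnest(n_sexes) %>%
  pivot_wider(values_fill = list(value = 0))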
#> Joining, by = "vehicle_id"

results
#> # A tibble: 3 × 7
#>   accident_id vehicle_ids owners    mean_age sexes      male female
#>         <dbl> <list>      <list>       <dbl> <list>    <dbl>  <dbl>
#> 1           1 <dbl [2]>   <chr [2]>     31.5 <chr [2]>     2      0
#> 2           2 <dbl [1]>   <chr [1]>     18   <chr [1]>     1      0
#> 3           3 <dbl [1]>   <chr [1]>     38   <chr [1]>     0      1

results$owners[[1]]
#> [1] "A" "B"

Created on 2022-06-10 by the reprex package (v2.0.0)
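
The per-category counts keep all the information, but they do not settle the tie case raised in the question (two males and two females). One possible way to handle it, as a minimal sketch rather than a definitive answer: summarise the categorical variable as a proportion, and derive a modal category that falls back to an explicit label when the top count is shared. The helper `modal_category()`, the `"mixed"` label, and the toy data below are assumptions made for illustration; they are not part of the original answer.

```r
library(tidyverse)

# Hypothetical toy data: accident 1 has one male and one female driver (a tie),
# accident 2 has two males and one female.
vehicles <- tribble(
  ~accident_id, ~age, ~sex,
  1, 18, "male",
  1, 45, "female",
  2, 30, "male",
  2, 52, "male",
  2, 41, "female"
)

# Hypothetical helper: return the most frequent value of x, or `tie_value`
# when two or more values share the highest count.
modal_category <- function(x, tie_value = NA_character_) {
  counts <- table(x)                      # NA values are dropped by table()
  if (length(counts) == 0) {
    return(NA_character_)                 # nothing to count (all values missing)
  }
  top <- names(counts)[counts == max(counts)]
  if (length(top) == 1) top else tie_value
}

per_accident <- vehicles %>%
  group_by(accident_id) %>%
  summarise(
    n_vehicles = n(),
    mean_age   = mean(age),
    prop_male  = mean(sex == "male"),                        # share of male drivers
    modal_sex  = modal_category(sex, tie_value = "mixed")    # explicit label for ties
  )

per_accident
```

With this toy data, accident 1 is labelled "mixed" and accident 2 "male". Whether a tie should become its own category, a missing value, or be avoided entirely by using only the proportion depends on what the merged table is used for; the proportion column has the advantage of never needing a tie-break.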
