Combinations in long format

Posted on 2025-01-27 03:27:09

I have a dataset of parents and their children in the following form:

Children_id   Parent_id
10            1
11            1
12            1
13            2
14            2

What I want is a dataset of each child's siblings in long format, i.e.,

id   sibling_id
10   11
10   12
11   10
11   12
12   10
12   11
13   14
14   13

What's the best way to achieve this, preferably using data.table?

Example data:

df <- data.frame("Children_id" = c(10,11,12,13,14), "Parent_id" = c(1,1,1,2,2))

Comments (3)

没有伤那来痛 2025-02-03 03:27:09

The graph experts out there will probably have better solutions, but here is a data.table solution:

library(data.table)

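# Self-join on Parent_id, then drop the rows that pair a child with itself.
# data.table's own [...][...] chaining is used here, so magrittr's %>% is not needed.
setDT(df)[df, on = .(Parent_id), allow.cartesian = TRUE][
  Children_id != i.Children_id, .(id = i.Children_id, sibling = Children_id)]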

Output:

      id sibling
   <num>   <num>
1:    10      11
2:    10      12
3:    11      10
4:    11      12
5:    12      10
6:    12      11
7:    13      14
8:    14      13
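
For readers without data.table installed, the same self-join idea can be sketched in base R with merge(). This is only an illustration under the question's example data, not part of the answer above; the suffixes `_a` / `_b` and the names `pairs` and `out` are choices made for this sketch.

```r
# Example data from the question
df <- data.frame(Children_id = c(10, 11, 12, 13, 14),
                 Parent_id   = c(1, 1, 1, 2, 2))

# Self-join on Parent_id: every child is paired with every child of the same parent
pairs <- merge(df, df, by = "Parent_id", suffixes = c("_a", "_b"))

# Drop the rows where a child is paired with itself
pairs <- pairs[pairs$Children_id_a != pairs$Children_id_b, ]

# Keep the two id columns under the requested names and sort for a stable row order
out <- data.frame(id = pairs$Children_id_a, sibling_id = pairs$Children_id_b)
out <- out[order(out$id, out$sibling_id), ]
row.names(out) <- NULL
out
```

A child with no siblings simply drops out of the result, because its only match in the join is itself.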

自找没趣 2025-02-03 03:27:09

In base R, we can split Children_id by Parent_id and then apply expand.grid within each group:

out <- do.call(rbind, lapply(split(df$Children_id, df$Parent_id), \(x) 
     subset(expand.grid(x, x), Var1 != Var2)[2:1]))
row.names(out) <- NULL
colnames(out) <- c("id", "sibling_id")

Output:

> out
  id sibling_id
1 10         11
2 10         12
3 11         10
4 11         12
5 12         10
6 12         11
7 13         14
8 14         13

Or using data.table with CJ (cross join) within each Parent_id group:

library(data.table)
setDT(df)[, CJ(id = Children_id, sibling_id = Children_id),
    Parent_id][id != sibling_id, .(id, sibling_id)]

Output:

      id sibling_id
   <num>      <num>
1:    10         11
2:    10         12
3:    11         10
4:    11         12
5:    12         10
6:    12         11
7:    13         14
8:    14         13

々眼睛长脚气 2025-02-03 03:27:09

A dplyr solution with inner_join:

library(dplyr)
inner_join(df, df, by = "Parent_id") %>% 
  select(id = Children_id.x, siblings = Children_id.y) %>% 
  filter(id != siblings)

  id siblings
1 10       11
2 10       12
3 11       10
4 11       12
5 12       10
6 12       11
7 13       14
8 14       13

Or another strategy, using a list column and unnest() from tidyr:

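library(dplyr)
library(tidyr)  # unnest() comes from tidyr, not dplyr

df %>% 
  group_by(Parent_id) %>% 
  mutate(siblings = list(unique(Children_id))) %>%  # each child gets the full list of children of its parent
  unnest(siblings) %>% 
  filter(Children_id != siblings) %>%               # drop self-pairs
  ungroup() %>% 
  select(id = Children_id, siblings)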