我可以使用dplyr管而不是在r中的列表上使用lapply吗？

发布于 2025-02-13 16:38:44 字数 1211 浏览 0 评论 0原文

我想知道我是否可以使用Tidyverse来完成我到目前为止使用列表的任务。我每个图都有一个物种丰度的矩阵，我想用 vegdist 从 vegan 包装中计算差异指数。之后，我想以较长的格式删除自动比较等。它的工作方式像 dplyr 的魅力一样，很容易：

library(tidyverse)
library(vegan)
df <- data.frame(spec1=sample.int(50,10,replace=T),
             spec2=sample.int(75,10,replace=T),
             spec3=sample.int(10,10,replace=T),
             spec4=sample.int(40,10,replace=T),
             spec5=sample.int(50,10,replace=T),
             spec6=sample.int(5,10,replace=T))

 df%>%
  vegdist() %>%
  as.matrix() %>%
  as_tibble(rownames= "rownames") %>%
  pivot_longer(-rownames) %>%
  filter(rownames < name)

现在我想做同样的事情，但是该物种属于不同的类别，每个类别都必须获得自己的距离矩阵，只有在它才能放回原处之后到单个长格式数据框架或tibble。

cat <- data.frame(spec=c("spec1","spec2","spec3","spec4","spec5","spec6"),
                  group=c("a","b","c","b","a","c"))
df%>%
  pivot_longer(cols = everything(),values_to="abundance",names_to="spec")%>%
  left_join(cat, by="spec")

开始非常简单，但是在我用来通过列 group 将数据拆分为列表的点，我正在努力寻找解决方案。我尝试了 group_by + pivot_wider + vegdist 或 group_split 的组合一个工作解决方案。有人是否有建议，还是我应该坚持此类案例的列表？

原文

I'd like to know if I can use tidyverse for tasks I used lists in R so far.
I got a matrix of species abundances per plot for which I want to calculate dissimilarity indices with vegdist from the vegan package. After that I'd like to put it in long format remove auto comparisons etc.
It works like charm with dplyr for an easy example:

library(tidyverse)
library(vegan)
df <- data.frame(spec1=sample.int(50,10,replace=T),
             spec2=sample.int(75,10,replace=T),
             spec3=sample.int(10,10,replace=T),
             spec4=sample.int(40,10,replace=T),
             spec5=sample.int(50,10,replace=T),
             spec6=sample.int(5,10,replace=T))

 df%>%
  vegdist() %>%
  as.matrix() %>%
  as_tibble(rownames= "rownames") %>%
  pivot_longer(-rownames) %>%
  filter(rownames < name)

Now I want to do the same however the species belong to different categories and each category has to get its own distance matrix and only after it can be put back to a single long format data frame or tibble.

cat <- data.frame(spec=c("spec1","spec2","spec3","spec4","spec5","spec6"),
                  group=c("a","b","c","b","a","c"))
df%>%
  pivot_longer(cols = everything(),values_to="abundance",names_to="spec")%>%
  left_join(cat, by="spec")

The beginning is pretty straight forward but at the point where I am used to split the data to a list by the column group I am struggling to find a solution. I tried combinations of group_by + pivot_wider + vegdist or also group_split but unfortunately wasn't able to come up with a working solution. Does anybody have suggestions, or should I stick with lists for such cases?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

べ繥欢鉨o。 2025-02-20 16:38:44

我们可以split“ group”'spec'''基于list的元素的列，并应用vegdist

library(vegan)
library(purrr)
library(dplyr)
library(tidyr)
out <- map_dfr(split(cat$spec, cat$group), 
   ~  df %>%
         select(all_of(.x)) %>% 
         vegdist() %>% 
         as.matrix %>% 
         as_tibble(rownames = "rownames") %>% 
        pivot_longer(-rownames) %>% 
        filter(rownames < name), .id = 'group')

-Output，

> out
# A tibble: 135 × 4
   group rownames name   value
   <chr> <chr>    <chr>  <dbl>
 1 a     1        2     0.268 
 2 a     1        3     0.35  
 3 a     1        4     0.208 
 4 a     1        5     0.0682
 5 a     1        6     0.328 
 6 a     1        7     0.196 
 7 a     1        8     0.189 
 8 a     1        9     0.178 
 9 a     1        10    0.178 
10 a     2        3     0.0822
# … with 125 more rows

如果我们想要宽的格式，

cat %>% 
  pivot_wider(names_from = group, values_from = spec, values_fn = list) %>%
  summarise(across(everything(), ~
    df %>%
     select(all_of(unlist(.x))) %>% 
     vegdist() %>%
     as.matrix %>%
     as_tibble(rownames = "rownames") %>% 
     pivot_longer(-rownames) %>%
     filter(rownames < name))) %>% 
     unnest(where(is.list),  names_sep = "_")

We may split the 'spec' by 'group' into a list, loop over the list and then select the columns based on the elements from the list and apply the vegdist

library(vegan)
library(purrr)
library(dplyr)
library(tidyr)
out <- map_dfr(split(cat$spec, cat$group), 
   ~  df %>%
         select(all_of(.x)) %>% 
         vegdist() %>% 
         as.matrix %>% 
         as_tibble(rownames = "rownames") %>% 
        pivot_longer(-rownames) %>% 
        filter(rownames < name), .id = 'group')

-output

> out
# A tibble: 135 × 4
   group rownames name   value
   <chr> <chr>    <chr>  <dbl>
 1 a     1        2     0.268 
 2 a     1        3     0.35  
 3 a     1        4     0.208 
 4 a     1        5     0.0682
 5 a     1        6     0.328 
 6 a     1        7     0.196 
 7 a     1        8     0.189 
 8 a     1        9     0.178 
 9 a     1        10    0.178 
10 a     2        3     0.0822
# … with 125 more rows

If we want a wide format,

cat %>% 
  pivot_wider(names_from = group, values_from = spec, values_fn = list) %>%
  summarise(across(everything(), ~
    df %>%
     select(all_of(unlist(.x))) %>% 
     vegdist() %>%
     as.matrix %>%
     as_tibble(rownames = "rownames") %>% 
     pivot_longer(-rownames) %>%
     filter(rownames < name))) %>% 
     unnest(where(is.list),  names_sep = "_")

回复收藏 0 原文

~没有更多了~