R: How to make a function that creates named dataframes from dataframes contained in a list passed as argument?

Posted on 2025-01-24 23:03:16

I made a function that takes a dataframe as argument, and creates two dataframes in output according to a threshold value of one of the columns. These 2 output dataframes are named according to the original input dataframe.

spliteOverUnder <- function(res){
  nm <-deparse(substitute(res))
  assign(paste(nm,"_Overexpr", sep=""), res[which(as.numeric(as.character(res$log2FoldChange)) > 1),], pos=1)
  assign(paste(nm,"_Underexpr", sep=""), res[which(as.numeric(as.character(res$log2FoldChange)) < -1),], pos=1)
}

The function works correctly.
I would like to use a loop on this function so that each of my dataframes gives 2 dataframes according to my criteria, so I created a list that contains my dataframes:

listRes <- list(DJ21_T0, DJ24_T0, DJ29_T0, DJ32_T0,
                DJ24_DJ21, DJ29_DJ21, DJ32_DJ21,
                DJ21_DJ24, DJ29_DJ24, DJ32_DJ24,
                DJ21_DJ29, DJ24_DJ29, DJ32_DJ29,
                DJ21_DJ32, DJ24_DJ32, DJ29_DJ32,
                Rec2_T0, Rec6_T0, Rec9_T0,
                Rec2_DJ32, Rec6_DJ32, Rec9_DJ32,
                Rec6_Rec2, Rec9_Rec2,
                Rec2_Rec6, Rec9_Rec6,
                Rec2_Rec9, Rec6_Rec9)

and used the following code:

for (i in 1:length(listRes)){
  spliteOverUnder(listRes[[i]])
}

But this creates objects named listRes[[i]]_Overexpr and listRes[[i]]_Underexpr.
I encounter the same problem when I write the loop like this:

for (i in listRes){
  spliteOverUnder(i)
}

Which gives me the objects i_Overexpr and i_Underexpr.

lapply(listRes, spliteOverUnder) doesn't work either...

How can I loop over my function correctly and get the objects corresponding to my dataframes? (DJ21_T0_Overexpr, DJ21_T0_Underexpr, DJ24_T0_Overexpr, DJ24_T0_Underexpr, ... , Rec6_Rec9_Overexpr, Rec6_Rec9_Underexpr)

I think the trick deparse(substitute(res)) used in my function is problematic, giving the created objects the name i or listRes[[i]] rather than giving the name of the dataframe at position i in my listRes dataframe list.
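
The behaviour described here can be reproduced in a few lines. This is only a minimal sketch: `show_name` is a hypothetical helper that returns the name `deparse(substitute())` would derive, and the one-row-per-gene data frame is a made-up stand-in for the real tables. It shows that `substitute()` captures the text of the expression written at the call site, not the name of the object that expression evaluates to.

```r
# Hypothetical helper: returns the name that deparse(substitute()) would derive.
show_name <- function(res) deparse(substitute(res))

# Toy stand-in for one of the real result tables (assumption).
DJ21_T0 <- data.frame(log2FoldChange = c(2, -3, 0.5))
listRes <- list(DJ21_T0)

# Called directly, the argument is a symbol, so its name is recovered.
direct <- show_name(DJ21_T0)

# Called on an indexed list element, the unevaluated call text is returned.
indexed <- character(0)
for (i in seq_along(listRes)) {
  indexed <- c(indexed, show_name(listRes[[i]]))
}

# Called on the loop variable, the loop variable's own name is returned.
looped <- character(0)
for (i in listRes) {
  looped <- c(looped, show_name(i))
}

print(c(direct, indexed, looped))
```

Because the original variable name is never part of the expression inside the loop, the name has to be carried along separately, for example as the names of the list.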

Any help is welcome.

Thanks


Comments (2)

猫弦 2025-01-31 23:03:16

Here's a tidyverse solution that avoids the need to explicitly write loops, using map instead. Note at the outset that you could probably do the whole thing using grouped or nested data frames, thus avoiding the need to create the objects. But if you do want to create the objects (or perhaps the starting dfs have different numbers of columns, even if they all have the log2FoldChange column) then you could do something like the following.

First, some setup to make the example reproducible.

library(tidyverse)
set.seed(42722)

## Names of the example data frames we'll create
## are df_1 ... df_5
df_names <- paste0("df_", 1:5) %>% 
  set_names()

## We'll make the new dfs by sampling from mtcars
base_df <- as_tibble(mtcars, rownames = "model") %>% 
  select(model, cyl, hp)

## Create 5 new data frame objects in our environment 
df_names %>% 
  walk(~ assign(x = .x,         # each element of df_names in turn
                value = sample_n(base_df, 10), 
                envir = .GlobalEnv))

## Now we have, e.g.
df_1
#> # A tibble: 10 × 3
#>    model               cyl    hp
#>    <chr>             <dbl> <dbl>
#>  1 Chrysler Imperial     8   230
#>  2 Mazda RX4 Wag         6   110
#>  3 Merc 450SE            8   180
#>  4 Porsche 914-2         4    91
#>  5 Toyota Corona         4    97
#>  6 Ford Pantera L        8   264
#>  7 Toyota Corolla        4    65
#>  8 Merc 280C             6   123
#>  9 Duster 360            8   245
#> 10 Merc 230              4    95

Next, get these five data frames and put them in a list, which is where the question starts from.

df_list <- map(df_names, get)
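
For readers without purrr, a base R sketch of the same step is possible with `mget()`, which looks up several objects by name and returns them as a list named after those objects. The two small data frames below are made-up stand-ins, and `df_list_base` is a hypothetical name chosen so it does not clash with `df_list` above.

```r
# Toy stand-ins for the example data frames (assumption).
df_1 <- head(mtcars, 3)
df_2 <- tail(mtcars, 3)

df_names <- c("df_1", "df_2")

# mget() fetches each named object from the given environment and
# returns a list whose names are the object names.
df_list_base <- mget(df_names, envir = environment())

print(names(df_list_base))
```

Starting from a character vector of names rather than from the objects themselves is what keeps the names attached to the list elements.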

Now, working with this list of data frames, we can split each one into over/under. If the split criteria were more complex we could write a function to do it. But here we use if_else to create a new column in each data frame based on a threshold value of cyl.

## - a. Create an over_under column in each df in the list,
##      labelled "over" when `cyl` in that particular df is < 5 and
##      "under" otherwise (these labels match the output shown below)
## - b. Split on this new column.
## - c. Put all the results into a new list called `split_list`

split_list <- df_list %>% 
  map(~ mutate(., 
               over_under = if_else(.$cyl < 5, "over", "under"))) %>% 
  map(~ split(., as.factor(.$over_under))) 

Now we have a nested list. Each of df_1 to df_5 is split into an over table and an under table. We can look at them with, e.g.,


split_list$df_3$under

#> # A tibble: 6 × 4
#>   model                cyl    hp over_under
#>   <chr>              <dbl> <dbl> <chr>     
#> 1 Hornet 4 Drive         6   110 under     
#> 2 Hornet Sportabout      8   175 under     
#> 3 Maserati Bora          8   335 under     
#> 4 Valiant                6   105 under     
#> 5 Mazda RX4 Wag          6   110 under     
#> 6 Cadillac Fleetwood     8   205 under

This is handy because we can use tab completion in our IDE to investigate the tables in the list.

We could just work with the list like this. Or we could bind them into a big df, by row, assuming they all have the same columns. But the OP wanted them as separate data frame objects with a suffix _over or _under. So, e.g., to extract all the "over" dfs and make them objects with names df_1_over etc., we can do

split_list %>% 
  map("over") %>%                               # subset to "over" dfs only
  set_names(nm = ~ paste0(.x, "_over")) %>%     # name each list element
  walk2(.x = names(.),                          # write out each df with its name
        .y = .,
        .f = ~ assign(x = .x,
                value = as_tibble(.y),
                envir = .GlobalEnv))

Now in our environment we have e.g.

df_5_over

#> # A tibble: 3 × 4
#>   model            cyl    hp over_under
#>   <chr>          <dbl> <dbl> <chr>     
#> 1 Porsche 914-2      4    91 over      
#> 2 Toyota Corona      4    97 over      
#> 3 Toyota Corolla     4    65 over

We can get the "under" dfs as objects in the same way.

Again, depending on what was needed it might make more sense to do the whole thing from start to finish using a single tibble and grouping the data as needed. Or, if we know the original dfs all have the same columnar layout, bind them by row into a df indexed by their name, like this:

df_all <- bind_rows(df_list, .id = "id")

df_all

#> # A tibble: 50 × 4
#>    id    model               cyl    hp
#>    <chr> <chr>             <dbl> <dbl>
#>  1 df_1  Chrysler Imperial     8   230
#>  2 df_1  Mazda RX4 Wag         6   110
#>  3 df_1  Merc 450SE            8   180
#>  4 df_1  Porsche 914-2         4    91
#>  5 df_1  Toyota Corona         4    97
#>  6 df_1  Ford Pantera L        8   264
#>  7 df_1  Toyota Corolla        4    65
#>  8 df_1  Merc 280C             6   123
#>  9 df_1  Duster 360            8   245
#> 10 df_1  Merc 230              4    95
#> # … with 40 more rows

From there you can group the big df by id, compute the over/under measures, etc.
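
As a rough illustration of that last step, the sketch below binds a small named list into one data frame, adds the over/under label, and counts rows per id and label. The two input data frames are invented for the example, and the `cyl < 5` rule is an assumption chosen to match the labels in the printed output above (4-cylinder rows appear as "over").

```r
library(dplyr)

# Toy named list standing in for df_list (assumption).
df_list <- list(
  df_1 = data.frame(model = c("a", "b", "c"), cyl = c(4, 6, 8)),
  df_2 = data.frame(model = c("d", "e"), cyl = c(4, 4))
)

# One long data frame, with the list names stored in the `id` column.
df_all <- bind_rows(df_list, .id = "id") %>%
  mutate(over_under = if_else(cyl < 5, "over", "under"))

# Number of rows per original data frame and per label.
counts <- df_all %>%
  count(id, over_under)

print(counts)
```

Keeping everything in one data frame means no objects have to be created or named at all; the `id` column does the job the object names were doing.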

聚集的泪 2025-01-31 23:03:16

Finally, the main problem was to distinguish between an object and its name, and to remember that putting dataframes into a list with list() does not keep their names. The names() function is therefore very useful. Be careful to give the names in the same order as the dataframes appear in the list.
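
The point about names being dropped can be checked directly, and retyping every name can be avoided. The sketch below is one possible approach, not part of the original answer: `named_list` is a hypothetical helper that names each element after the expression passed in, and the two data frames are toy stand-ins.

```r
# Toy stand-ins for two of the real result tables (assumption).
DJ21_T0 <- data.frame(log2FoldChange = c(2, -3))
DJ24_T0 <- data.frame(log2FoldChange = c(0.5, 1.5))

# A plain list() call keeps the values but not the variable names.
plain <- list(DJ21_T0, DJ24_T0)

# Hypothetical helper: builds the list and names each element after
# the expression that was written in the call.
named_list <- function(...) {
  values <- list(...)
  arg_exprs <- as.list(substitute(list(...)))[-1]
  names(values) <- vapply(arg_exprs, deparse, character(1))
  values
}

named <- named_list(DJ21_T0, DJ24_T0)

print(names(plain))
print(names(named))
```

With this, names and values cannot get out of order, because each name is derived from the same argument that supplies the value.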

  1. create the list containing the dataframes
listRes <- list(DJ21_T0, DJ24_T0, DJ29_T0, DJ32_T0,
                DJ24_DJ21, DJ29_DJ21, DJ32_DJ21,
                DJ21_DJ24, DJ29_DJ24, DJ32_DJ24,
                DJ21_DJ29, DJ24_DJ29, DJ32_DJ29,
                DJ21_DJ32, DJ24_DJ32, DJ29_DJ32,
                Rec2_T0, Rec6_T0, Rec9_T0,
                Rec2_DJ32, Rec6_DJ32, Rec9_DJ32,
                Rec6_Rec2, Rec9_Rec2,
                Rec2_Rec6, Rec9_Rec6,
                Rec2_Rec9, Rec6_Rec9)
  2. name the dataframes in the list
names(listRes) <- c("DJ21_T0", "DJ24_T0", "DJ29_T0", "DJ32_T0",
                    "DJ24_DJ21", "DJ29_DJ21", "DJ32_DJ21",
                    "DJ21_DJ24", "DJ29_DJ24", "DJ32_DJ24",
                    "DJ21_DJ29", "DJ24_DJ29", "DJ32_DJ29",
                    "DJ21_DJ32", "DJ24_DJ32", "DJ29_DJ32",
                    "Rec2_T0", "Rec6_T0", "Rec9_T0",
                    "Rec2_DJ32", "Rec6_DJ32", "Rec9_DJ32",
                    "Rec6_Rec2", "Rec9_Rec2",
                    "Rec2_Rec6", "Rec9_Rec6",
                    "Rec2_Rec9", "Rec6_Rec9")
  3. define the function (here with export to .csv files)
spliteOverUnder <- function(res, nm){
  out1 <- assign(paste(nm,"_Overexpr", sep=""), res[which(as.numeric(as.character(res$log2FoldChange)) > 1),], pos=1)
  out2 <- assign(paste(nm,"_Underexpr", sep=""), res[which(as.numeric(as.character(res$log2FoldChange)) < -1),], pos=1)
  PATH <- "/your/pathway/"
  write.table(out1, file = paste(PATH,nm,"_Overexpr.csv", sep=""), row.names=FALSE , col.names=TRUE, sep="\t", dec=".", quote=FALSE)
  write.table(out2, file = paste(PATH,nm,"_Underexpr.csv", sep=""), row.names=FALSE , col.names=TRUE, sep="\t", dec=".", quote=FALSE)
}

  4. call the function in a for loop
for (i in seq_along(listRes)){
  nm <- names(listRes)[i]
  spliteOverUnder(listRes[[i]], nm)
}
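
A variant that avoids assign() altogether is to have the function return both subsets and keep the results in a nested list. This is only a sketch under stated assumptions: `split_over_under` and its `threshold` argument are hypothetical names, and the two small data frames (including the `gene` column) are made up for the example.

```r
# Hypothetical variant: returns both subsets instead of assigning them.
split_over_under <- function(res, threshold = 1) {
  lfc <- as.numeric(as.character(res$log2FoldChange))
  list(
    Overexpr  = res[which(lfc > threshold), , drop = FALSE],
    Underexpr = res[which(lfc < -threshold), , drop = FALSE]
  )
}

# Toy named list standing in for listRes (assumption).
listRes <- list(
  DJ21_T0 = data.frame(gene = c("g1", "g2", "g3"),
                       log2FoldChange = c(2.5, -1.7, 0.2)),
  DJ24_T0 = data.frame(gene = c("g4", "g5"),
                       log2FoldChange = c(-3.1, 1.2))
)

# lapply() keeps the list names, so results$DJ21_T0$Overexpr etc. exist.
results <- lapply(listRes, split_over_under)

# Optional: flatten one level and build names such as DJ21_T0_Overexpr.
flat <- unlist(results, recursive = FALSE)
names(flat) <- sub(".", "_", names(flat), fixed = TRUE)

# If separate objects are really needed, this would create them:
# list2env(flat, envir = globalenv())

print(names(flat))
```

Keeping the results in a list makes later steps (writing files, counting rows, plotting) a matter of looping over `results` or `flat` rather than looking objects up by constructed names.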