How to put the results of summarise() function into the dataframe in R?

Posted on 2025-01-28 00:35:38

This question follows up on an earlier one, "how to put the results of summarise() function into the dataframe in r".

I don't think I conveyed my question well there, so I have added more details.

I made a minimal reproducible example, but my real data is much larger.

a_p_ <-c(0.1, 0.3, 0.03, 0.03)
b_p_ <-c(0.2, 0.003, 0.1, 0.00001)
c_2<-c(1,2,5,23)
c_p_<-c(0.001, 0.002,0.002,0.00001)
results_1<-data.frame(a_p_,b_p_,c_2,c_p_)

a_p_ <-c(0.3, 0.02, 0.43, 0.44)
b_p_ <-c(0.00002, 0.3, 0.8, 0.005)
c_2 <-c(88,4,55,88)
c_p_<-c(0.1, 0.002,0.002,0.1)

results_2<-data.frame(a_p_,b_p_,c_2,c_p_)

So I have two datasets, "results_1" and "results_2". This is just a reproducible example; in my real data I have 200 results files (from "results_1" to "results_200").

I want to create a new data frame, named type1error, built from summaries like the ones below.

More specifically, I want this to be the first row of the new data frame (type1error):

>   results_1 %>%
+     summarise(across(contains("_p_"), ~ mean(.x > 0.05)))
  a_p_ b_p_ c_p_
1  0.5  0.5    0

and this to be the second row of type1error:

> results_2 %>%
+     summarise(across(contains("_p_"), ~ mean(.x > 0.05)))
  a_p_ b_p_ c_p_
1 0.75  0.5  0.5

So what I did is:

# make empty holder

type1error<-as.data.frame(matrix(nrow = 2))

for(i in 1:2){
  # read the data 
  if(i==1){
    results<-results_1
  }
  if(i==2){
    results<-results_2
  }
  

  
  # mean() You can use mean() to get the proportion of TRUE of a logical vector.
  type1error[i,]<-results %>%
    summarise(across(contains("_p_"), ~ mean(.x > 0.05)))
  
  type1error$conditions[i] <- i 
  
}

But I got warning messages like these, and the result is not what I expected (one row of summarise() results per dataset):

Warning messages:
1: In `[<-.data.frame`(`*tmp*`, i, , value = list(a_p_ = 0.5, b_p_ = 0.5,  :
  provided 3 variables to replace 2 variables
2: In `[<-.data.frame`(`*tmp*`, i, , value = list(a_p_ = 0.75, b_p_ = 0.5,  :
  provided 3 variables to replace 2 variables

How can I fix this?

The code below is not for this example dataset but for my real data, where it generates the same warnings.

#FYI, Not reproducible, but the code that I did use for my real, huge,data is as follows:

ncond<-200

#empty holder 

type1error<-as.data.frame(matrix(nrow = ncond))

for(i in 1:ncond){
# read the data 
results <- read.csv(paste0("model_results/results_",i,".csv"))
 

# mean() You can use mean() to get the proportion of TRUE of a logical vector.
type1error[i,]<-results %>%
  summarise(across(contains("_p_"), ~ mean(.x > 0.05)))

type1error$conditions[i] <- i 

}
# one csv file in type 1 error rate 
# fixed
write.csv(type1error,"type1error/type1error.csv")

#and this code chunk did not work well.

I appreciate all the answers on the previous question page!

The answers there are all written for "results_1" and "results_2", because my reproducible example has only two datasets.

In reality, however, I have 200 datasets (from "results_1" to "results_200"), and I have to end up with a new data frame, not a list.


水晶透心 2025-02-04 00:35:38

You can use map() and bind_rows() to work with a list and get a data frame as output.

map() (purrr package) takes a list or vector, applies a function to each element, and returns a list; bind_rows() (dplyr) then stacks the elements into a single data frame.

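library(dplyr)
library(purrr)

# Collect the data frames in a list
ResultList <- list(results_1, results_2)

# Proportion of values above 0.05 in every "_p_" column
sumit <- function(x) {
  summarise(x, across(contains("_p_"), ~ mean(.x > 0.05)))
}

# One one-row data frame per list element
FinalResult <- map(ResultList, sumit)

# Stack the rows into a single data frame
Type1Error <- bind_rows(FinalResult)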

You can also write the summary inline in map(): map(ResultList, ~summarise(.x, across(contains("_p_"), ~ mean(.x > 0.05)))). The result is still a list, so pass it to bind_rows() to get a data frame.
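For reference, here is a self-contained sketch of the map() plus bind_rows() approach run on the two example data frames from the question. The helper name prop_above_alpha and the way the conditions column is filled (with the row position) are illustrative assumptions, not something the question or the packages prescribe.

```r
library(dplyr)
library(purrr)

# Example data copied from the question
results_1 <- data.frame(
  a_p_ = c(0.1, 0.3, 0.03, 0.03),
  b_p_ = c(0.2, 0.003, 0.1, 0.00001),
  c_2  = c(1, 2, 5, 23),
  c_p_ = c(0.001, 0.002, 0.002, 0.00001)
)
results_2 <- data.frame(
  a_p_ = c(0.3, 0.02, 0.43, 0.44),
  b_p_ = c(0.00002, 0.3, 0.8, 0.005),
  c_2  = c(88, 4, 55, 88),
  c_p_ = c(0.1, 0.002, 0.002, 0.1)
)

# Hypothetical helper: proportion of values above 0.05 in each "_p_" column
prop_above_alpha <- function(df) {
  summarise(df, across(contains("_p_"), ~ mean(.x > 0.05)))
}

# One one-row summary per data set, stacked into a single data frame
type1error <- list(results_1, results_2) %>%
  map(prop_above_alpha) %>%
  bind_rows()

# Record which data set each row came from (assumes list order = condition order)
type1error$conditions <- seq_len(nrow(type1error))

print(type1error)
```

The first row should hold 0.5, 0.5 and 0 for a_p_, b_p_ and c_p_, and the second row 0.75, 0.5 and 0.5, matching the two summaries shown in the question.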

To get all of your files into a list, you could use map() or lapply().

Edited to include a modified version of the linked solution for reading CSV files into a list, assuming a folder called "Data" in your R project directory contains all the files.

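# full.names = TRUE returns paths that include the folder, so setwd() is not needed
filenames <- list.files("Data", pattern = "\\.csv$", full.names = TRUE)
# Note: list.files() sorts names alphabetically, so results_10 comes before results_2
ResultList <- lapply(filenames, read.csv)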

Solution for reading csv files into a list
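If the files are numbered as in the question (results_1.csv up to results_200.csv), one option is to build each path from its index rather than listing the folder, so that row i always corresponds to condition i. The sketch below is only an illustration under that assumption: the temporary folder and the two small files are stand-ins so that the code runs on its own, and summarise_condition is a hypothetical helper name.

```r
library(dplyr)

# Stand-in setup (assumption): two small CSV files in a temporary folder
# that plays the role of "model_results/" from the question.
results_dir <- file.path(tempdir(), "model_results")
dir.create(results_dir, showWarnings = FALSE)
write.csv(
  data.frame(a_p_ = c(0.1, 0.3, 0.03, 0.03),
             b_p_ = c(0.2, 0.003, 0.1, 0.00001),
             c_2  = c(1, 2, 5, 23),
             c_p_ = c(0.001, 0.002, 0.002, 0.00001)),
  file.path(results_dir, "results_1.csv"), row.names = FALSE
)
write.csv(
  data.frame(a_p_ = c(0.3, 0.02, 0.43, 0.44),
             b_p_ = c(0.00002, 0.3, 0.8, 0.005),
             c_2  = c(88, 4, 55, 88),
             c_p_ = c(0.1, 0.002, 0.002, 0.1)),
  file.path(results_dir, "results_2.csv"), row.names = FALSE
)

ncond <- 2  # would be 200 for the real data

# Hypothetical helper: read file i, summarise it, and tag the row with i
summarise_condition <- function(i) {
  results <- read.csv(file.path(results_dir, paste0("results_", i, ".csv")))
  results %>%
    summarise(across(contains("_p_"), ~ mean(.x > 0.05))) %>%
    mutate(conditions = i)
}

# lapply() returns a list of one-row data frames; bind_rows() stacks them
type1error <- bind_rows(lapply(seq_len(ncond), summarise_condition))

print(type1error)
```

Because each file is summarised as soon as it is read, only one results file is held in memory at a time, which may matter when the real files are large. The resulting data frame can then be passed to write.csv() as in the question.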
