How to put the results of the summarise() function into a data frame in R?
This question follows up on my earlier one (how to put the results of summarise() function into the dataframe in r).
I don't think I conveyed the problem well in the previous question,
so I have added more details.
I made a minimal reproducible example; my real data is much larger.
a_p_ <-c(0.1, 0.3, 0.03, 0.03)
b_p_ <-c(0.2, 0.003, 0.1, 0.00001)
c_2<-c(1,2,5,23)
c_p_<-c(0.001, 0.002,0.002,0.00001)
results_1<-data.frame(a_p_,b_p_,c_2,c_p_)
a_p_ <-c(0.3, 0.02, 0.43, 0.44)
b_p_ <-c(0.00002, 0.3, 0.8, 0.005)
c_2 <-c(88,4,55,88)
c_p_<-c(0.1, 0.002,0.002,0.1)
results_2<-data.frame(a_p_,b_p_,c_2,c_p_)
So I have two datasets: "results_1" and "results_2".
This is just a reproducible example, though.
In my real data, I have 200 results files
(from "results_1" to "results_200").
I then want to create a new data frame (named type1error)
that contains the following rows.
More specifically, I want this to be the first row of the new data frame (type1error):
> results_1 %>%
+ summarise(across(contains("_p_"), ~ mean(.x > 0.05)))
a_p_ b_p_ c_p_
1 0.5 0.5 0
and this to be the second row of the data frame (type1error):
> results_2 %>%
+ summarise(across(contains("_p_"), ~ mean(.x > 0.05)))
a_p_ b_p_ c_p_
1 0.75 0.5 0.5
So what I did is:
# make empty holder
type1error<-as.data.frame(matrix(nrow = 2))
for(i in 1:2){
# read the data
if(i==1){
results<-results_1
}
if(i==2){
results<-results_2
}
# mean() You can use mean() to get the proportion of TRUE of a logical vector.
type1error[i,]<-results %>%
summarise(across(contains("_p_"), ~ mean(.x > 0.05)))
type1error$conditions[i] <- i
}
But I got warning messages like the ones below, and the result does not seem to be what I expected
(one row of summarise() results per dataset).
Warning messages:
1: In `[<-.data.frame`(`*tmp*`, i, , value = list(a_p_ = 0.5, b_p_ = 0.5, :
provided 3 variables to replace 2 variables
2: In `[<-.data.frame`(`*tmp*`, i, , value = list(a_p_ = 0.75, b_p_ = 0.5, :
provided 3 variables to replace 2 variables
How can I fix this?
The code below is not for this example dataset but for my real dataset,
which generates the same warning.
# FYI: not reproducible, but this is the code I used for my real, huge data:
ncond<-200
#empty holder
type1error<-as.data.frame(matrix(nrow = ncond))
for(i in 1:ncond){
# read the data
results <- read.csv(paste0("model_results/results_",i,".csv"))
# mean() You can use mean() to get the proportion of TRUE of a logical vector.
type1error[i,]<-results %>%
summarise(across(contains("_p_"), ~ mean(.x > 0.05)))
type1error$conditions[i] <- i
}
# one csv file in type 1 error rate
# fixed
write.csv(type1error,"type1error/type1error.csv")
#and this code chunk did not work well.
I appreciate all the answers on the previous question's page!
The answers there all handle only
"results_1" and "results_2", because my reproducible example has only two datasets.
In reality, however, I have 200 datasets
(from "results_1" to "results_200"),
and I need to end up with a new data frame, not a list.
Comments (1)
You can use map and bind_rows to work with a list and output a data frame.
map (purrr package) takes a list or vector, applies a function to each element, and returns a list; bind_rows (dplyr) then stacks those elements into a single data frame.
You can also do the summarising step as a one-liner in map:
map(ResultList, ~summarise(.x, across(contains("_p_"), ~ mean(.x > 0.05))))
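To end up with one data frame rather than a list, the map output can be piped into bind_rows. The following is a minimal sketch using the two example data frames from the question; ResultList is built by hand here, and the .id column name "conditions" is borrowed from the question's loop. The outer function is written as function(df) instead of a nested formula purely for readability.

```r
library(dplyr)
library(purrr)

# Example data from the question
results_1 <- data.frame(
  a_p_ = c(0.1, 0.3, 0.03, 0.03),
  b_p_ = c(0.2, 0.003, 0.1, 0.00001),
  c_2  = c(1, 2, 5, 23),
  c_p_ = c(0.001, 0.002, 0.002, 0.00001)
)
results_2 <- data.frame(
  a_p_ = c(0.3, 0.02, 0.43, 0.44),
  b_p_ = c(0.00002, 0.3, 0.8, 0.005),
  c_2  = c(88, 4, 55, 88),
  c_p_ = c(0.1, 0.002, 0.002, 0.1)
)

# Assumption: the data frames are collected into a plain (unnamed) list
ResultList <- list(results_1, results_2)

type1error <- ResultList %>%
  # one-row summary per data frame: proportion of p-values above 0.05
  map(function(df) summarise(df, across(contains("_p_"), ~ mean(.x > 0.05)))) %>%
  # stack the one-row summaries; .id records each element's position in the list
  bind_rows(.id = "conditions")

print(type1error)
```

Because the list is unnamed, the conditions column holds the list positions as character strings ("1", "2"); convert with as.integer() if a numeric column is needed.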
To get all of your files into a list, you could use map or lapply.
Edited to include a modified version of the linked solution for reading CSV files into a list, assuming a folder called "Data" in your R project directory contains all the files.
Solution for reading csv files into a list
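As a rough sketch of that idea (not the linked solution itself), the files can be listed with list.files, read with map, summarised, and bound into one data frame. To keep the example runnable, it writes two small stand-in CSVs to a temporary "Data" folder; the file-name pattern results_<number>.csv is taken from the question, and the contents of the stand-in files are made up for illustration.

```r
library(dplyr)
library(purrr)

# Stand-in for the real folder: two small CSVs in a temporary "Data" directory
data_dir <- file.path(tempdir(), "Data")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(a_p_ = c(0.1, 0.3, 0.03, 0.03), c_2 = c(1, 2, 5, 23)),
          file.path(data_dir, "results_1.csv"), row.names = FALSE)
write.csv(data.frame(a_p_ = c(0.3, 0.02, 0.43, 0.44), c_2 = c(88, 4, 55, 88)),
          file.path(data_dir, "results_2.csv"), row.names = FALSE)

# Collect the paths of all results files
files <- list.files(data_dir, pattern = "^results_[0-9]+\\.csv$", full.names = TRUE)

# Name each path by the number in its file name, so the condition survives
# even though list.files() sorts alphabetically (results_10 before results_2)
names(files) <- sub("^results_([0-9]+)\\.csv$", "\\1", basename(files))

# Read every file into a named list of data frames
ResultList <- map(files, read.csv)

type1error <- ResultList %>%
  map(function(df) summarise(df, across(contains("_p_"), ~ mean(.x > 0.05)))) %>%
  bind_rows(.id = "conditions") %>%
  mutate(conditions = as.integer(conditions)) %>%
  arrange(conditions)

print(type1error)
```

For the real data, data_dir would point at the folder that holds the 200 files (the question uses "model_results"), and the result can be written out with write.csv as before.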