Using sapply to compute a statistic for several columns at once

Posted on 2025-01-23 20:17:13

I have a dataframe as follows:

# A tibble: 6 x 4
   Placebo    High  Medium      Low
     <dbl>   <dbl>   <dbl>    <dbl>
1  0.0400  -0.04    0.0100  0.0100 
2  0.04     0      -0.0100  0.04   
3  0.0200  -0.1    -0.05   -0.0200 
4  0.03    -0.0200  0.03   -0.00700
5 -0.00500 -0.0100  0.0200  0.0100 
6  0.0300  -0.0100 NA      NA  

You can get Cohen's d for two of the columns using the cohen.d() function from the effsize package:

df <- data.frame(Placebo = c(0.0400, 0.04, 0.0200, 0.03, -0.00500, 0.0300),
                 Low = c(-0.04, 0, -0.1, -0.0200,  -0.0100, -0.0100),
                 Medium = c(0.0100, -0.0100, -0.05, 0.03,  0.0200, NA ),
                 High = c(0.0100, 0.04, -0.0200, -0.00700, 0.0100, NA))

library(effsize)
cohen.d(as.vector(na.omit(df$Placebo)), as.vector(na.omit(df$High)))

Interestingly enough, I'm getting the following error with this code:

Error in data[, group] : incorrect number of dimensions
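
One possible cause, offered as an assumption because the question does not show which packages were attached: another package that also exports a function named cohen.d() (psych has one, with a different argument list) was loaded after effsize and masks it. A minimal sketch of how to rule that out is to qualify the call with the package name. The vectors below are copied from the data frame above; effsize is assumed to be installed.

```r
# Assumption: effsize is installed; the masking scenario is a guess.
placebo <- c(0.04, 0.04, 0.02, 0.03, -0.005, 0.03)
high    <- c(0.01, 0.04, -0.02, -0.007, 0.01, NA)

# Qualify the call with the package name so that another attached package
# exporting its own cohen.d() cannot intercept it.
res <- effsize::cohen.d(placebo[!is.na(placebo)], high[!is.na(high)])
res$estimate

# Lists every attached package (or environment) that defines cohen.d
find("cohen.d")
```

If find() reports more than one package, whichever comes first on the search path is the one a bare cohen.d() call reaches.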

What I would like, though, is a function that returns Cohen's d between one of the columns and each of the others.

To get Cohen's d for every column against Placebo, I would use something like:

sapply(df, function(i) cohen.d(pull(df, as.vector(na.omit(!!Placebo))), as.vector(na.omit(i))))

But I'm not sure this would work anyway.

Edit: I don't want to drop the full row, since Cohen's d can be computed for vectors of different lengths. Ideally, I would like to get the statistic with the NAs removed from each column independently.


囚你心 2025-01-30 20:17:13

It may be better to handle the NAs for each column separately, by building a logical index from that column together with Placebo:

library(dplyr)
library(effsize)

df %>%
  summarise(across(Low:High, ~ list({
    # keep only the rows where both Placebo and the current column are non-missing
    i1 <- complete.cases(Placebo) & complete.cases(.x)
    cohen.d(Placebo[i1], .x[i1])
  })))
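
To make the logical index concrete, here is a small base-R sketch using the Placebo and Medium values from the question. It only illustrates what i1 contains and does not call effsize.

```r
placebo <- c(0.04, 0.04, 0.02, 0.03, -0.005, 0.03)
medium  <- c(0.01, -0.01, -0.05, 0.03, 0.02, NA)

# TRUE where both vectors have a value in the same position
i1 <- complete.cases(placebo) & complete.cases(medium)
i1
# TRUE TRUE TRUE TRUE TRUE FALSE

# Both vectors are cut to the same five rows, so the sixth Placebo
# value is left out even though it is not missing.
placebo[i1]
medium[i1]
```

In other words, the index keeps the two vectors aligned row by row.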

Or, if we want to use lapply/sapply, loop over the columns other than Placebo:

lapply(df[-1], function(x) {
  # bind Placebo to the current column and drop rows with an NA in either
  x1 <- na.omit(cbind(df$Placebo, x))
  cohen.d(x1[, 1], x1[, 2])
})

Output:

$Low

Cohen's d

d estimate: 1.947312 (large)
95 percent confidence interval:
    lower     upper 
0.3854929 3.5091319 


$Medium

Cohen's d

d estimate: 0.9622504 (large)
95 percent confidence interval:
     lower      upper 
-0.5782851  2.5027860 


$High

Cohen's d

d estimate: 0.8884639 (large)
95 percent confidence interval:
     lower      upper 
-0.6402419  2.4171697 
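
Note that both snippets above leave out a Placebo value whenever the other column is NA in the same row. If the NAs should instead be removed from each column independently, as the edit to the question asks, a minimal base-R sketch could look like the following. cohens_d_pooled is a hypothetical helper written for this illustration, not a function from effsize. It uses the standard pooled-standard-deviation formula and returns only the point estimate, with no confidence interval.

```r
# Hypothetical helper, not part of effsize: Cohen's d with a pooled SD,
# dropping NAs from each vector independently.
cohens_d_pooled <- function(x, y) {
  x <- x[!is.na(x)]
  y <- y[!is.na(y)]
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

df <- data.frame(Placebo = c(0.04, 0.04, 0.02, 0.03, -0.005, 0.03),
                 Low     = c(-0.04, 0, -0.1, -0.02, -0.01, -0.01),
                 Medium  = c(0.01, -0.01, -0.05, 0.03, 0.02, NA),
                 High    = c(0.01, 0.04, -0.02, -0.007, 0.01, NA))

# One estimate per non-Placebo column, returned as a named numeric vector
d_vs_placebo <- vapply(df[-1], cohens_d_pooled, numeric(1), x = df$Placebo)
d_vs_placebo
```

For Low, which has no missing values, this gives the same 1.947312 as the output above. For Medium and High the estimates differ from that output, because all six Placebo values are used here.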