Using sapply to compute statistics for several columns at once

Posted on 2025-01-23 20:17:13

I have a data frame as follows:

# A tibble: 6 x 4
   Placebo    High  Medium      Low
     <dbl>   <dbl>   <dbl>    <dbl>
1  0.0400  -0.04    0.0100  0.0100 
2  0.04     0      -0.0100  0.04   
3  0.0200  -0.1    -0.05   -0.0200 
4  0.03    -0.0200  0.03   -0.00700
5 -0.00500 -0.0100  0.0200  0.0100 
6  0.0300  -0.0100 NA      NA

You can get Cohen's d for two of the columns using the cohen.d() function from the effsize package:

df <- data.frame(Placebo = c(0.0400, 0.04, 0.0200, 0.03, -0.00500, 0.0300),
                 Low = c(-0.04, 0, -0.1, -0.0200,  -0.0100, -0.0100),
                 Medium = c(0.0100, -0.0100, -0.05, 0.03,  0.0200, NA ),
                 High = c(0.0100, 0.04, -0.0200, -0.00700, 0.0100, NA))

library(effsize)
cohen.d(as.vector(na.omit(df$Placebo)), as.vector(na.omit(df$High)))

Interestingly enough, I'm getting the following error with this code:

Error in data[, group] : incorrect number of dimensions

However, I would like to create a function that computes Cohen's d between one of the columns and each of the others.

To get Cohen's d for every column against Placebo, I would use something like:

sapply(df, function(i) cohen.d(pull(df, as.vector(na.omit(!!Placebo))), as.vector(na.omit(i))))

But I'm not sure this would work anyway.

Edit: I don't want to drop entire rows, since Cohen's d can be computed for vectors of different lengths. Ideally, I would like to compute the statistic with the NAs removed from each column independently.

囚你心 2025-01-30 20:17:13

It may be better to remove the NAs for each column separately, by creating a logical index that combines each column with 'Placebo':

library(dplyr)
library(effsize)
df %>%   
  summarise(across(Low:High, ~ list({
             i1 <- complete.cases(Placebo)& complete.cases(.x)
             cohen.d(Placebo[i1], .x[i1])})))

Or, if we want to use lapply/sapply, loop over the columns other than Placebo:

lapply(df[-1], function(x) {
          x1 <- na.omit(cbind(df$Placebo, x))
          cohen.d(x1[,1], x1[,2])
})

-output

$Low

Cohen's d

d estimate: 1.947312 (large)
95 percent confidence interval:
    lower     upper 
0.3854929 3.5091319 


$Medium

Cohen's d

d estimate: 0.9622504 (large)
95 percent confidence interval:
     lower      upper 
-0.5782851  2.5027860 


$High

Cohen's d

d estimate: 0.8884639 (large)
95 percent confidence interval:
     lower      upper 
-0.6402419  2.4171697
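
Both approaches above drop a Placebo value whenever the paired column has an NA in the same row. If the NAs should instead be removed from each column independently, as the edit to the question asks, a minimal base-R sketch could look like the following. The helper name cohens_d_pooled is hypothetical (it is not part of effsize), and it assumes the usual pooled-standard-deviation definition of Cohen's d; it returns only the point estimate, without the confidence interval that cohen.d() reports.

```r
# Data frame copied from the question.
df <- data.frame(
  Placebo = c(0.0400, 0.04, 0.0200, 0.03, -0.00500, 0.0300),
  Low     = c(-0.04, 0, -0.1, -0.0200, -0.0100, -0.0100),
  Medium  = c(0.0100, -0.0100, -0.05, 0.03, 0.0200, NA),
  High    = c(0.0100, 0.04, -0.0200, -0.00700, 0.0100, NA)
)

# Hypothetical helper (not from effsize): Cohen's d with a pooled
# standard deviation, dropping NAs from each vector separately so
# that no row-wise deletion takes place.
cohens_d_pooled <- function(x, y) {
  x <- x[!is.na(x)]
  y <- y[!is.na(y)]
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

# Each non-Placebo column is passed as y; Placebo is fixed as x.
# sapply returns a named numeric vector, one estimate per column.
d_vs_placebo <- sapply(df[-1], cohens_d_pooled, x = df$Placebo)
print(d_vs_placebo)
```

Low has no missing values, so its estimate should agree with the 1.947312 shown in the output above. Medium and High can differ from that output, because here Placebo keeps all six values while the lapply version drops the sixth one.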
