Calculating internal consistency of items by grouping variables using dplyr/tidyverse

Posted 2025-02-13 02:44:08

I’d like to calculate the internal consistency (alpha and omega) of items by grouping variables (e.g., age and raterType). Ideally, I’d be able to do this using dplyr/tidyverse. My question is similar to another question (Using dplyr to nest or group two variables, then perform the Cronbach's alpha function or other statistics to the data); however, I can’t get that solution to work in my case.

Here is a minimal example:

library("tidyverse")
library("psych")
library("MBESS")

mydata <- expand.grid(ID = 1:100,
                      age = 1:5,
                      raterType = c("self",
                                    "friend",
                                    "parent"))

set.seed(12345)

mydata$item1 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item2 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item3 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item4 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item5 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item6 <- sample(1:7, nrow(mydata), replace = TRUE)

mydata$item1[sample(nrow(mydata), 100)] <- NA
mydata$item2[sample(nrow(mydata), 100)] <- NA
mydata$item3[sample(nrow(mydata), 100)] <- NA
mydata$item4[sample(nrow(mydata), 100)] <- NA
mydata$item5[sample(nrow(mydata), 100)] <- NA
mydata$item6[sample(nrow(mydata), 100)] <- NA

itemNames <- paste("item", 1:6, sep = "")

To calculate internal consistency for the entire dataset, I would calculate alpha and omega, respectively, by the following code:

alpha(mydata[,itemNames])$total$raw_alpha
ci.reliability(mydata[,itemNames], type = "omega", interval.type = "none")$est

However, I want to calculate alpha and omega for each combination of age and raterType.

Here's my attempt:

mydata %>%
  pivot_longer(cols = c(-age, -raterType, -ID)) %>%
  select(-ID) %>% 
  nest_by(age, raterType) %>%
  mutate(alpha = alpha(data)$total$raw_alpha,
         omega = ci.reliability(data, type = "omega", interval.type = "none")$est)

This does not work. The code provides incorrect estimates for omega and throws an error for alpha:

> # This provides the wrong estimates:
> mydata %>%
+     pivot_longer(cols = c(-age, -raterType, -ID)) %>%
+     select(-ID) %>% 
+     nest_by(age, raterType) %>%
+     mutate(omega = ci.reliability(data, type = "omega", interval.type = "none")$est)
# A tibble: 15 × 4
# Rowwise:  age, raterType
     age raterType               data omega
   <int> <fct>     <list<tibble[,2]>> <dbl>
 1     1 self               [600 × 2] 0.218
 2     1 friend             [600 × 2] 0.257
 3     1 parent             [600 × 2] 0.261
 4     2 self               [600 × 2] 0.196
 5     2 friend             [600 × 2] 0.257
 6     2 parent             [600 × 2] 0.209
 7     3 self               [600 × 2] 0.179
 8     3 friend             [600 × 2] 0.225
 9     3 parent             [600 × 2] 0.247
10     4 self               [600 × 2] 0.224
11     4 friend             [600 × 2] 0.252
12     4 parent             [600 × 2] 0.218
13     5 self               [600 × 2] 0.248
14     5 friend             [600 × 2] 0.218
15     5 parent             [600 × 2] 0.202
> 
> # This throws an error:
> mydata %>%
+     pivot_longer(cols = c(-age, -raterType, -ID)) %>%
+     select(-ID) %>% 
+     nest_by(age, raterType) %>%
+     mutate(alpha = alpha(data)$total$raw_alpha)
Number of categories should be increased  in order to count frequencies. 
Error in `mutate()`:
! Problem while computing `alpha = alpha(data)$total$raw_alpha`.
ℹ The error occurred in row 1.
Caused by error in `FUN()`:
! only defined on a data frame with all numeric-alike variables
Run `rlang::last_error()` to see where the error occurred.
Warning messages:
1: Problem while computing `alpha = alpha(data)$total$raw_alpha`.
ℹ NAs introduced by coercion
ℹ The warning occurred in row 1. 
2: Problem while computing `alpha = alpha(data)$total$raw_alpha`.
ℹ Item = name had no variance and was deleted but still is counted in the score
ℹ The warning occurred in row 1.

The omega values above do not correspond to the values obtained from running the ci.reliability() function on the respective subset of the data:

> alpha(mydata[which(mydata$age == 3 & mydata$raterType == "self"), itemNames])$total$raw_alpha
[1] -0.3018416
> ci.reliability(mydata[which(mydata$age == 3 & mydata$raterType == "self"), itemNames], type = "omega", interval.type = "none")$est
[1] 0.00836356
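
The manual subset check above can be extended to every group with base R's split(), which gives a dplyr-independent reference to compare any grouped pipeline against. The sketch below is only an illustration under stated assumptions: `cronbach_alpha` is a hypothetical helper that applies the standard alpha formula to a pairwise-complete covariance matrix, and `toy` is made-up data. It is not claimed to reproduce every detail of psych::alpha().

```r
# Hypothetical helper (not part of psych or MBESS): raw Cronbach's alpha
# computed from the item covariance matrix, using pairwise-complete cases.
cronbach_alpha <- function(items) {
  C <- cov(items, use = "pairwise.complete.obs")
  k <- ncol(C)
  (k / (k - 1)) * (1 - sum(diag(C)) / sum(C))
}

# Made-up toy data: two groups, two items, wide format (one column per item).
toy <- data.frame(
  grp   = rep(c("a", "b"), each = 4),
  item1 = c(1, 2, 3, 4, 1, 2, 3, 4),
  item2 = c(1, 2, 3, 4, 2, 1, 4, 3)
)

# Split the item columns by group, then apply the helper to each piece.
by_group <- split(toy[, c("item1", "item2")], toy$grp)
alpha_by_group <- sapply(by_group, cronbach_alpha)
alpha_by_group
```

For several grouping variables, split() also accepts a list of factors, e.g. split(items, list(df$age, df$raterType)).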


国产ˉ祖宗 2025-02-20 02:44:08

Perhaps this helps

out1 <- mydata %>%
  group_by(age, raterType) %>%
  summarise(alpha = alpha(across(all_of(itemNames)))$total$raw_alpha,
            omega = ci.reliability(across(all_of(itemNames)),
                                   type = "omega",
                                   interval.type = "none")$est,
            .groups = "drop")

-output

> out1
# A tibble: 15 × 4
     age raterType   alpha     omega
   <int> <fct>       <dbl>     <dbl>
 1     1 self      -0.135    2.76   
 2     1 friend     0.138    0.231  
 3     1 parent    -0.229  255.     
 4     2 self      -0.421   NA      
 5     2 friend     0.0650  58.7    
 6     2 parent     0.153   NA      
 7     3 self      -0.302    0.00836
 8     3 friend     0.147    0.334  
 9     3 parent     0.196    0.132  
10     4 self      -0.0699  NA      
11     4 friend     0.118    0.214  
12     4 parent    -0.0303  31.1    
13     5 self      -0.0166   0.246  
14     5 friend    -0.192    0.0151 
15     5 parent     0.0847  NA
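
The grouped summarise() works because the selection helper hands each function the current group's item columns as one wide data frame. A minimal sketch of that mechanism follows, assuming dplyr >= 1.1.0, where pick() is the documented replacement for calling across() without a function. The `toy` data and `toy_items` vector are made up, and nrow()/ncol() stand in for alpha() and ci.reliability().

```r
library(dplyr)

# Made-up toy data: two groups of four rows, two item columns.
toy <- data.frame(
  grp   = rep(c("a", "b"), each = 4),
  item1 = c(1, 2, 3, 4, 2, 4, 6, 8),
  item2 = c(2, 4, 6, 8, 1, 2, 3, 4)
)
toy_items <- c("item1", "item2")

# pick() returns the selected columns for the current group as a tibble,
# so any function that expects a wide item matrix can consume it.
shape <- toy %>%
  group_by(grp) %>%
  summarise(n_rows = nrow(pick(all_of(toy_items))),
            n_cols = ncol(pick(all_of(toy_items))),
            .groups = "drop")
shape
```

Each group contributes its own rows and only the selected item columns, which is the shape the reliability functions expect.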

Or maybe this

out2 <- mydata %>%
  nest_by(age, raterType) %>%
  mutate(alpha = alpha(data[, itemNames])$total$raw_alpha,
         omega = ci.reliability(data[, itemNames],
                                type = "omega",
                                interval.type = "none")$est)

-output

> out2
# A tibble: 15 × 5
# Rowwise:  age, raterType
     age raterType               data   alpha     omega
   <int> <fct>     <list<tibble[,7]>>   <dbl>     <dbl>
 1     1 self               [100 × 7] -0.135    2.76   
 2     1 friend             [100 × 7]  0.138    0.231  
 3     1 parent             [100 × 7] -0.229  255.     
 4     2 self               [100 × 7] -0.421   NA      
 5     2 friend             [100 × 7]  0.0650  58.7    
 6     2 parent             [100 × 7]  0.153   NA      
 7     3 self               [100 × 7] -0.302    0.00836
 8     3 friend             [100 × 7]  0.147    0.334  
 9     3 parent             [100 × 7]  0.196    0.132  
10     4 self               [100 × 7] -0.0699  NA      
11     4 friend             [100 × 7]  0.118    0.214  
12     4 parent             [100 × 7] -0.0303  31.1    
13     5 self               [100 × 7] -0.0166   0.246  
14     5 friend             [100 × 7] -0.192    0.0151 
15     5 parent             [100 × 7]  0.0847  NA
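
As for why the attempt in the question misbehaves, a likely explanation is the pivot_longer() step: it stacks all items into a character `name` column and a single `value` column, so each nested data frame no longer has one column per item. That matches the [600 × 2] nested tibbles and the "Item = name had no variance" warning shown in the question. A small sketch with made-up `toy` data makes the difference visible:

```r
library(dplyr)
library(tidyr)

# Made-up toy data: three IDs in each of two groups, two items.
toy <- data.frame(
  ID    = rep(1:3, times = 2),
  grp   = rep(c("a", "b"), each = 3),
  item1 = c(1, 2, 3, 4, 5, 6),
  item2 = c(6, 5, 4, 3, 2, 1)
)

# Same steps as the attempt in the question: pivot to long, then nest.
long_nested <- toy %>%
  pivot_longer(cols = c(-grp, -ID)) %>%
  select(-ID) %>%
  nest_by(grp)

# Nesting the wide data directly keeps one column per item.
wide_nested <- toy %>%
  nest_by(grp)

long_cols <- names(long_nested$data[[1]])  # item labels and values, stacked
long_dim  <- dim(long_nested$data[[1]])
wide_cols <- names(wide_nested$data[[1]])  # ID plus one column per item
long_cols
wide_cols
```

Dropping the pivot_longer() step and indexing the item columns inside the nested data, as in out2 above, keeps the input wide.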
