Calculating the internal consistency of items by grouping variables using dplyr/tidyverse
I’d like to calculate the internal consistency (alpha and omega) of items by grouping variables (e.g., age and raterType). Ideally I’d be able to do this using dplyr/tidyverse. My question is similar to another question (Using dplyr to nest or group two variables, then perform the Cronbach's alpha function or other statistics to the data); however, I can’t get that solution to work in my case.
Here is a minimal example:
library("tidyverse")
library("psych")
library("MBESS")
mydata <- expand.grid(ID = 1:100,
                      age = 1:5,
                      raterType = c("self",
                                    "friend",
                                    "parent"))
set.seed(12345)
mydata$item1 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item2 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item3 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item4 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item5 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item6 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item1[sample(nrow(mydata), 100)] <- NA
mydata$item2[sample(nrow(mydata), 100)] <- NA
mydata$item3[sample(nrow(mydata), 100)] <- NA
mydata$item4[sample(nrow(mydata), 100)] <- NA
mydata$item5[sample(nrow(mydata), 100)] <- NA
mydata$item6[sample(nrow(mydata), 100)] <- NA
itemNames <- paste("item", 1:6, sep = "")
To calculate internal consistency for the entire dataset, I would calculate alpha and omega, respectively, by the following code:
alpha(mydata[,itemNames])$total$raw_alpha
ci.reliability(mydata[,itemNames], type = "omega", interval.type = "none")$est
However, I want to calculate alpha and omega for each combination of age and raterType.
Here's my attempt:
mydata %>%
  pivot_longer(cols = c(-age, -raterType, -ID)) %>%
  select(-ID) %>%
  nest_by(age, raterType) %>%
  mutate(alpha = alpha(data)$total$raw_alpha,
         omega = ci.reliability(data, type = "omega", interval.type = "none")$est)
This does not work: run separately, the omega step returns incorrect estimates and the alpha step throws an error:
> # This provides the wrong estimates:
> mydata %>%
+ pivot_longer(cols = c(-age, -raterType, -ID)) %>%
+ select(-ID) %>%
+ nest_by(age, raterType) %>%
+ mutate(omega = ci.reliability(data, type = "omega", interval.type = "none")$est)
# A tibble: 15 × 4
# Rowwise: age, raterType
age raterType data omega
<int> <fct> <list<tibble[,2]>> <dbl>
1 1 self [600 × 2] 0.218
2 1 friend [600 × 2] 0.257
3 1 parent [600 × 2] 0.261
4 2 self [600 × 2] 0.196
5 2 friend [600 × 2] 0.257
6 2 parent [600 × 2] 0.209
7 3 self [600 × 2] 0.179
8 3 friend [600 × 2] 0.225
9 3 parent [600 × 2] 0.247
10 4 self [600 × 2] 0.224
11 4 friend [600 × 2] 0.252
12 4 parent [600 × 2] 0.218
13 5 self [600 × 2] 0.248
14 5 friend [600 × 2] 0.218
15 5 parent [600 × 2] 0.202
>
> # This throws an error:
> mydata %>%
+ pivot_longer(cols = c(-age, -raterType, -ID)) %>%
+ select(-ID) %>%
+ nest_by(age, raterType) %>%
+ mutate(alpha = alpha(data)$total$raw_alpha)
Number of categories should be increased in order to count frequencies.
Error in `mutate()`:
! Problem while computing `alpha = alpha(data)$total$raw_alpha`.
ℹ The error occurred in row 1.
Caused by error in `FUN()`:
! only defined on a data frame with all numeric-alike variables
Run `rlang::last_error()` to see where the error occurred.
Warning messages:
1: Problem while computing `alpha = alpha(data)$total$raw_alpha`.
ℹ NAs introduced by coercion
ℹ The warning occurred in row 1.
2: Problem while computing `alpha = alpha(data)$total$raw_alpha`.
ℹ Item = name had no variance and was deleted but still is counted in the score
ℹ The warning occurred in row 1.
The omega values above do not correspond to the values obtained from running alpha() and ci.reliability() on the respective subset of the data:
> alpha(mydata[which(mydata$age == 3 & mydata$raterType == "self"), itemNames])$total$raw_alpha
[1] -0.3018416
> ci.reliability(mydata[which(mydata$age == 3 & mydata$raterType == "self"), itemNames], type = "omega", interval.type = "none")$est
[1] 0.00836356
Perhaps this helps
-output
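A minimal sketch of one approach along these lines, not necessarily the answerer's own code: keep the items as columns instead of reshaping with pivot_longer(). Both alpha() and ci.reliability() expect one column per item, whereas after pivoting each nested group holds only a name column and a value column. The helpers group_alpha() and group_omega() are hypothetical names introduced here, and the tryCatch() fallback to NA is an added assumption in case the omega model fails for a group.

```r
library(dplyr)
library(psych)
library(MBESS)

# Rebuild the question's example data; items stay in wide format
mydata <- expand.grid(ID = 1:100,
                      age = 1:5,
                      raterType = c("self", "friend", "parent"))
itemNames <- paste0("item", 1:6)
set.seed(12345)
for (item in itemNames) {
  mydata[[item]] <- sample(1:7, nrow(mydata), replace = TRUE)
}
for (item in itemNames) {
  mydata[[item]][sample(nrow(mydata), 100)] <- NA
}

# Hypothetical helpers (not from the original post): each takes one group's
# items as columns and returns a single number.
group_alpha <- function(items) {
  # Random data often triggers "negatively correlated" warnings; silence them
  suppressWarnings(psych::alpha(as.data.frame(items))$total$raw_alpha)
}
group_omega <- function(items) {
  # Assumption: return NA instead of stopping if the model fails for a group
  tryCatch(
    MBESS::ci.reliability(as.data.frame(items), type = "omega",
                          interval.type = "none")$est,
    error = function(e) NA_real_
  )
}

result <- mydata %>%
  nest_by(age, raterType) %>%   # one row per group; `data` holds that group's rows
  mutate(alpha = group_alpha(data[itemNames]),
         omega = group_omega(data[itemNames])) %>%
  ungroup() %>%
  select(-data)

print(result)
```

Calling psych::alpha() with the namespace prefix avoids picking up ggplot2's alpha() when the whole tidyverse is attached. Each row of result can be checked against the manual subset calls shown in the question.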
Or maybe this
-output
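As a package-free cross-check of the grouped alpha values, raw Cronbach's alpha can be computed by hand from each group's item covariance matrix. This is a sketch under stated assumptions, not code from the thread: cronbach_alpha() is a hypothetical helper, and the use of pairwise-complete covariances is an assumption intended to mirror how psych handles missing values by default, which is worth verifying against psych::alpha() on your own data.

```r
# Rebuild the question's example data; items stay in wide format
mydata <- expand.grid(ID = 1:100,
                      age = 1:5,
                      raterType = c("self", "friend", "parent"))
itemNames <- paste0("item", 1:6)
set.seed(12345)
for (item in itemNames) {
  mydata[[item]] <- sample(1:7, nrow(mydata), replace = TRUE)
}
for (item in itemNames) {
  mydata[[item]][sample(nrow(mydata), 100)] <- NA
}

# Hypothetical helper: raw alpha for k items with covariance matrix C,
#   alpha = k / (k - 1) * (1 - sum(diag(C)) / sum(C))
cronbach_alpha <- function(items) {
  C <- cov(items, use = "pairwise.complete.obs")
  k <- ncol(items)
  k / (k - 1) * (1 - sum(diag(C)) / sum(C))
}

# Split the item columns by every age x raterType combination,
# then apply the helper to each group's data frame
groups <- split(mydata[itemNames],
                list(mydata$age, mydata$raterType),
                drop = TRUE)
alpha_by_group <- vapply(groups, cronbach_alpha, numeric(1))

print(alpha_by_group)
```

The result is a named vector with names such as "3.self", one entry per group. The formula makes the data requirement explicit: every item has to be its own column, which is why the long-format attempt in the question cannot give meaningful values.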