Combine imputed level-2 data (mice) with non-imputed level-1 data for multilevel analysis with brms
I'm using the R `mice` package to impute randomly missing questionnaire item values for a few participants. The sum score of the questionnaire is later used in a multilevel model, fitted with `brms`, as a level-2 predictor of reaction times in a task (multiple trials per participant, level 1).
I have already tried two approaches to create a `mids` object that contains all the data and can later be passed to `brm_multiple`, but neither has worked so far:
1.) I kept the data frames separate, imputed the item values in the questionnaire data frame, created a long-format data frame containing the original data and all imputations (using `complete`), and calculated the sum score for each participant in each imputation (using `rowSums`). Afterwards, I joined this long data frame with the level-1 reaction time data (using `full_join`) and tried to convert the result into a `mids` object (using `as.mids`). This failed, however, because after the join each `.id` occurs multiple times, once per trial.
2.) I joined the data frames before imputation and tried to impute only the level-2 questionnaire items by extending `mice` with `miceadds`. Here, I defined only the item scores as predictors via the predictor matrix, set `2lonly.function` as the method, and specified the appropriate imputation function and the participant ID as cluster variable. This resulted in the following error: Error in edit.setup(data, setup, ...) : `mice` detected constant and/or collinear variables. No predictors were left after their removal.
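A minimal sketch of why approach 1 breaks, using made-up toy values and base R's `merge` in place of `full_join` (the data frames `long` and `trials` are hypothetical stand-ins, not the real data): joining participant-level rows with trial-level rows repeats every `.id` within each imputation.

```r
# Toy long-format data shaped like the output of complete(..., action = "long"):
# two imputations (.imp) of three participants (.id); all values are made up.
long <- data.frame(
  .imp        = rep(1:2, each = 3),
  .id         = rep(1:3, times = 2),
  participant = factor(rep(1:3, times = 2)),
  sumscore    = c(40, 35, 42, 41, 35, 44)
)

# Trial-level reaction times: four trials per participant (made-up values).
trials <- data.frame(
  participant = factor(rep(1:3, each = 4)),
  RT          = c(416, 389, 383, 411, 339, 368, 474, 407, 398, 393, 391, 483)
)

# Joining on participant repeats every participant-level row once per trial.
joined <- merge(long, trials, by = "participant")

# Count how often each .id occurs within each imputation.
id_counts <- table(joined$.imp, joined$.id)
print(id_counts)

# TRUE: the (.imp, .id) pairs are no longer unique after the join.
print(any(duplicated(joined[c(".imp", ".id")])))
```

With four trials per participant, every `.id` appears four times per imputation, so `.id` can no longer serve as a unique row identifier.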
Has anyone run into similar issues and managed to solve them?
Edit: here is a reproducible example for approach 1 (my preferred one):
```r
library(mice)

# Fake dataset for the questionnaire data (level 2, one row per participant):
data1 <- structure(list(participant = structure(1:20, .Label = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20"), class = "factor"),
scale1 = c(20.5176893097081, 17.1907529978866, NA, NA, 23.0900118234823,
16.825451016666, 17.9720180052918, 28.4363035263208, 26.0191098441877,
26.1444447937135, NA, 25.091133563164, 10.3353758051478,
18.0322232007671, 14.1767794585022, 20.9102922916395, 20.6239907650613,
17.661597152285, 18.3255223659322, 18.9958533053766),
scale2 = c(23.8446274459682,
NA, 13.3562256053306, 8.52823315494693, 18.3034641524201,
17.1100738924451, 20.0295218831116, 15.6986473122548, 14.9647149797442,
32.1875950434602, 25.255823725488, NA, 15.2625337013248,
17.6354282904461, 5.86783073951034, NA, 16.3987924521716,
11.3574747700045, 18.3557569542574, 18.741406021827)),
row.names = c(NA,
-20L), class = "data.frame")

# Fake dataset for the reaction time data (level 1, ten trials per participant):
data2 <- structure(list(participant = structure(c(1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L,
9L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 12L, 12L,
12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 13L, 13L, 13L, 13L, 13L,
13L, 13L, 13L, 13L, 13L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L,
14L, 14L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 16L,
16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 17L, 17L, 17L, 17L,
17L, 17L, 17L, 17L, 17L, 17L, 18L, 18L, 18L, 18L, 18L, 18L, 18L,
18L, 18L, 18L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L, 19L,
20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L),
.Label = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20"), class = "factor"),
RT = c(416, 389, 383, 411, 354, 404, 354, 433, 411, 408,
339, 368, 474, 407, 411, 366, 401, 427, 415, 376, 398, 393,
391, 483, 466, 427, 372, 380, 360, 383, 374, 412, 412, 394,
403, 387, 427, 383, 362, 402, 397, 445, 393, 407, 450, 381,
395, 428, 423, 423, 435, 404, 405, 426, 392, 408, 383, 371,
409, 422, 386, 412, 420, 353, 429, 350, 395, 428, 428, 437,
423, 475, 444, 369, 360, 429, 365, 379, 391, 446, 405, 360,
354, 399, 428, 403, 432, 392, 394, 448, 474, 411, 398, 373,
415, 333, 401, 395, 403, 429, 344, 426, 391, 394, 456, 371,
339, 409, 373, 389, 384, 408, 436, 359, 394, 440, 415, 418,
401, 379, 330, 452, 388, 388, 315, 389, 399, 403, 344, 441,
404, 409, 357, 369, 385, 385, 452, 370, 436, 371, 403, 459,
466, 408, 451, 393, 355, 362, 418, 440, 360, 377, 400, 390,
369, 414, 390, 368, 381, 387, 386, 415, 387, 374, 442, 405,
441, 395, 420, 431, 435, 438, 420, 412, 391, 408, 409, 413,
371, 447, 392, 385, 421, 377, 419, 437, 401, 392, 431, 491,
412, 399, 446, 408, 369, 387, 372, 428, 389, 401)),
row.names = c(NA,
-200L), class = "data.frame")

# Run the imputation on the level-2 questionnaire data
imputed <- mice(data1)

# Create a long data frame with the original data and all imputations,
# plus the sum score of the scales for each participant
data1_imputed <- complete(imputed, action = "long", include = TRUE)
data1_imputed$sumscore <- rowSums(data1_imputed[c("scale1", "scale2")])

# Merge the imputed level-2 data with the level-1 reaction time data
data_all <- dplyr::full_join(data1_imputed, data2, by = "participant")

# Try to create a mids object from the merged data - NOT WORKING
merged_imputed <- as.mids(data_all)
```
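One possible way around `as.mids`, offered as a sketch under assumptions rather than a tested solution: the `brm_multiple` documentation states that `data` may be a list of data frames instead of a `mids` object, so each completed data set could be joined with the trial data separately. The toy data, the names `questionnaire`, `trials`, and `data_list`, and the model formula are hypothetical and not taken from the question.

```r
library(mice)

set.seed(1)

# Hypothetical toy data: 12 participants with two questionnaire scales
# (participant level, a few values missing) and 5 reaction-time trials each.
n <- 12
questionnaire <- data.frame(
  participant = factor(seq_len(n)),
  scale1 = rnorm(n, mean = 20, sd = 4),
  scale2 = rnorm(n, mean = 17, sd = 5)
)
questionnaire$scale1[c(3, 7)] <- NA
questionnaire$scale2[c(2, 9)] <- NA

trials <- data.frame(
  participant = factor(rep(seq_len(n), each = 5), levels = seq_len(n)),
  RT = round(rnorm(n * 5, mean = 400, sd = 30))
)

# Impute the participant-level scales only. The participant ID is excluded
# as a predictor (assumption: an ID with one level per row is not a useful
# predictor for the imputation model).
pred <- make.predictorMatrix(questionnaire)
pred[, "participant"] <- 0
imputed <- mice(questionnaire, m = 5, predictorMatrix = pred,
                printFlag = FALSE)

# One completed data frame per imputation; add the sum score to each and
# join the trial-level data on afterwards.
completed <- complete(imputed, action = "all")
data_list <- lapply(completed, function(d) {
  d$sumscore <- rowSums(d[c("scale1", "scale2")])
  merge(d, trials, by = "participant")
})

# Not run here (needs brms and a Stan toolchain). The formula is an
# assumption for illustration only.
# library(brms)
# fit <- brm_multiple(RT ~ sumscore + (1 | participant), data = data_list)
```

Because the join happens after the completed data sets are extracted, no `.id` bookkeeping is needed; the trade-off is that the result is a plain list rather than a `mids` object, so `mice` helpers that expect `mids` input no longer apply.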