glm.nb non-convergence, and other options for negative binomial regression
I'm trying to fit a GLM with a negative binomial distribution in R and have a few questions. Here are the data I'm working with - note that my predictors are all centred and scaled using 2 standard deviations.
df <- structure(list(offset_var = c(1.28599085563929, 0.56011798945881,
0.910539373132146, 1.40150012655384, 1.99332290786881, 0.431352036569322,
1.9664387276399, 0.39501204122542, 0.837952086514098, 1.49309174747163,
0.185176379335315, 0.340780160418832, 0.23261768703236, 1.9358463846208,
0.683228994093252, 0.766384544663353, 2.36951602293399, 0.570222733062088,
2.035225147398, 2.66190465894079, 2.59441609615926, 2.45851192989865,
9.10131475225225, 1.79085721952421, 1.14341199635633, 1.86464965904907,
0.969684569635742, 1.48586083003075, 0.398395739771472, 1.57819408350658,
0.348242838034098, 1.01779798183851, 1.56020949397414, 1.14211413937976,
1.15472189286642, 2.17372502761175, 2.17372502761175, 1.15472189286642,
0.91480376034087, 0.52428786649856, 0.663761139444733, 1.85426680323653,
0.799109081286816, 1.2592920835499, 1.32010595330908, 1.9473416892704,
1.74079702185659, 1.92731761020335), response_var = c(31, 154,
40, 71, 3, 43, 66, 180, 62, 27, 84, 81, 66, 14, 40, 35, 48, 155,
23, 37, 33, 88, 38, 8, 73, 29, 11, 12, 74, 70, 298, 40, 24, 35,
0, 27, 0, 116, 83, 60, 77, 68, 56, 8, 64, 234, 57, 71), pred1 = c(0.593253968253968,
0.212619047619048, 0.208888888888889, 0.824603174603175, 0.791031746031746,
0.096984126984127, 0.589523809523809, 0.096984126984127, 0.783571428571429,
0.156666666666667, 0, 0.361904761904762, 0.164126984126984, 0.723809523809524,
0.123095238095238, 0.671587301587302, 0.708888888888889, 0.186507936507937,
0.77984126984127, 0.0820634920634921, 0.220079365079365, 1, 0.27984126984127,
0.835793650793651, 0.526031746031746, 0.0298412698412698, 0.641746031746032,
0.0783333333333333, 0.156666666666667, 0.791031746031746, 0.167857142857143,
0.641746031746032, 0.817142857142857, 0.126825396825397, 0.160396825396825,
0.776031746031746, 0.776031746031746, 0.160396825396825, 0.130555555555556,
0.600714285714286, 0.123095238095238, 0.70515873015873, 0.164126984126984,
0.492460317460317, 0.0932539682539683, 0.596984126984127, 0.134285714285714,
0.899206349206349), pred3 = c(0.779445727482678, 0.779445727482678,
0.981524249422633, 0.981524249422633, 0.431293302540416, 0.431293302540416,
0.85796766743649, 0.85796766743649, 0.974018475750578, 0.974018475750578,
0.215357967667437, 0.215357967667437, 0.767898383371826, 0.767898383371826,
0.929561200923788, 0.929561200923788, 0.319861431870669, 0.319861431870669,
0.930138568129331, 0.930138568129331, 1, 1, 0.753464203233256,
0.753464203233256, 0.937644341801385, 0.937644341801385, 0.438799076212472,
0.438799076212472, 0.960161662817552, 0.960161662817552, 0.852193995381064,
0.852193995381064, 0.803117782909932, 0.803117782909932, 0.571593533487299,
0.571593533487299, 0.571593533487299, 0.571593533487299, 0, 0,
0.801385681293304, 0.801385681293304, 0.656466512702079, 0.656466512702079,
0.852771362586606, 0.852771362586606, 0.203810623556583, 0.203810623556583
), pred6 = c(0.688053824835413, 0.540952466204905, 0.628399305213738,
0.6881894451542, 0.780506864063453, 0.596192469087064, 0.664351792193705,
0.601630037911551, 0.725643181043039, 0.617634868802872, 0.811784880984014,
0.804339479465123, 0.576349468957153, 0.847678636187369, 0.625702671539389,
0.719554647550091, 0.363763777714112, 0.446477292582477, 0.474674252997289,
0.480035185899025, 0.4937661725622, 0, 0.513379111687292, 0.903418101749096,
0.695930304479286, 0.617335579129072, 0.759590008590574, 0.654198263624173,
0.638351745840282, 0.758157124611394, 0.616963963666818, 0.749652691590418,
0.626922408721354, 0.587580912470206, 0.430636110291636, 0.349235608213707,
0.349235608213707, 0.430636110291636, 0.723455668440276, 1, 0.615287637567703,
0.696660949537828, 0.570881476208878, 0.51486263616898, 0.599650096633513,
0.661291018750272, 0.44995045321675, 0.317335721700076), pred2 = c(0.0640028225107189,
0.0640028225107189, 1, 1, 0.0467149443570715, 0.0467149443570715,
0.955928169494161, 0.955928169494161, 0.991149754525286, 0.991149754525286,
0.636541945977623, 0.636541945977623, 0.0695282460368243, 0.0695282460368243,
0.945044759518499, 0.945044759518499, 0.666620820800469, 0.666620820800469,
0.865751343981534, 0.865751343981534, 0.158030700783964, 0.158030700783964,
0, 0, 0.941696017987526, 0.941696017987526, 0.194526003575978,
0.194526003575978, 0.974286448958601, 0.974286448958601, 0.182153599598151,
0.182153599598151, 0.834117696305023, 0.834117696305023, 0.729828317197582,
0.729828317197582, 0.729828317197582, 0.729828317197582, 0.477488683047594,
0.477488683047594, 0.914726688872012, 0.914726688872012, 0.0551346373492319,
0.0551346373492319, 0.950905057197701, 0.950905057197701, 0.653584648412039,
0.653584648412039), pred4 = c(0.521197167238095, 0.521197167238095,
0.675165075565519, 0.675165075565519, 0.754750741994489, 0.754750741994489,
0.656214157700682, 0.656214157700682, 0.779025156922262, 0.779025156922262,
0.466656174770789, 0.466656174770789, 1, 1, 0.19412522736884,
0.19412522736884, 0, 0, 0.914127451104004, 0.914127451104004,
0.450341983837599, 0.450341983837599, 0.200288723846985, 0.200288723846985,
0.51864603534987, 0.51864603534987, 0.632983879774839, 0.632983879774839,
0.219950187016786, 0.219950187016786, 0.378371968866928, 0.378371968866928,
0.332552996854781, 0.332552996854781, 0.543573394076179, 0.543573394076179,
0.543573394076179, 0.543573394076179, 0.110505257947263, 0.110505257947263,
0.80446053190608, 0.80446053190608, 0.942739062050578, 0.942739062050578,
0.769130578963835, 0.769130578963835, 0.910376608809312, 0.910376608809312
), pred5 = c(1, 1, 0.277346201087263, 0.277346201087263, 0.977777461589862,
0.977777461589862, 0.690563309054654, 0.690563309054654, 0.439008770827643,
0.439008770827643, 0.819995545023047, 0.819995545023047, 0.579039846892373,
0.579039846892373, 0.365538938318013, 0.365538938318013, 0.0971685384896833,
0.0971685384896833, 0.599718772091419, 0.599718772091419, 0.871837464542416,
0.871837464542416, 0.0550210151699806, 0.0550210151699806, 0.740706048133834,
0.740706048133834, 0.0826796777514591, 0.0826796777514591, 0.52020534667614,
0.52020534667614, 0, 0, 0.660904950317592, 0.660904950317592,
0.790027991312289, 0.790027991312289, 0.790027991312289, 0.790027991312289,
0.206354045553633, 0.206354045553633, 0.600235537123976, 0.600235537123976,
0.829243121962118, 0.829243121962118, 0.595320097430252, 0.595320097430252,
0.17138558268267, 0.17138558268267)), row.names = c(NA, -48L), class = "data.frame")
First off, I tried running the model using the glm.nb function in the MASS package, but kept getting non-convergence warnings (glm.fit: algorithm did not converge) even after increasing the number of iterations beyond the default 25 (I tried 50, 100, 250, 1000, and even 5000):
> glm1 <- glm.nb(response_var~pred1*pred2 + pred1*pred3 + pred1*pred4 + pred1*pred5 + pred6 + offset(log(offset_var)), data = df, control = glm.control(maxit = 5000))
There were 50 or more warnings (use warnings() to see the first 50) #these are the non-convergence warnings
> summary(glm1)
Call:
glm.nb(formula = response_var ~ pred1 * pred2 + pred1 * pred3 +
pred1 * pred4 + pred1 * pred5 + pred6 + offset(log(offset_var)),
data = df, control = glm.control(maxit = 5000), init.theta = 0.6779942761,
link = log)
Deviance Residuals:
Min 1Q Median 3Q Max
-2.8944 -0.9971 -0.4541 0.2178 1.9029
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 7.3937 1.2057 6.132 8.65e-10 ***
pred1 -5.2640 2.0354 -2.586 0.0097 **
pred2 -0.9040 0.8356 -1.082 0.2793
pred3 -0.6444 1.0470 -0.616 0.5382
pred4 -1.1811 1.1321 -1.043 0.2968
pred5 -0.2583 1.0225 -0.253 0.8006
pred6 -0.6567 1.0222 -0.642 0.5206
pred1:pred2 2.0800 1.5741 1.321 0.1864
pred1:pred3 1.7663 2.0589 0.858 0.3910
pred1:pred4 0.6586 2.1055 0.313 0.7544
pred1:pred5 0.4524 2.0225 0.224 0.8230
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Negative Binomial(0.791) family taken to be 1)
Null deviance: 68.201 on 47 degrees of freedom
Residual deviance: 57.762 on 37 degrees of freedom
AIC: 541.63
Number of Fisher Scoring iterations: 5000
Theta: 0.678
Std. Err.: 0.124
Warning while fitting theta: alternation limit reached
2 x log-likelihood: -517.634
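The "alternation limit reached" warning suggests that glm.nb is stalling while alternating between fitting the mean model and re-estimating theta. One way to localize the problem (a diagnostic sketch, not something from the original question; it assumes the df constructed above) is to fit the Poisson analogue of the model and then estimate theta directly from its fitted means with MASS::theta.ml():

```r
library(MASS)

# Fit the Poisson version of the same mean model (assumes df from above).
pois_fit <- glm(response_var ~ pred1*pred2 + pred1*pred3 + pred1*pred4 +
                  pred1*pred5 + pred6 + offset(log(offset_var)),
                family = poisson, data = df)

# Estimate theta from the Poisson fitted means. If theta.ml() itself
# struggles to converge, the instability is in the dispersion estimate
# rather than in the mean model.
theta_hat <- theta.ml(df$response_var, fitted(pois_fit), limit = 50)
theta_hat  # compare with the ~0.678 that glm.nb reported
```

If theta.ml() converges cleanly to something near glm.nb's estimate, the warnings are more likely an artifact of the alternation scheme than a sign of a badly misspecified model.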
I know I am trying to fit a complex model given my sample size, so I am assuming this is why the model is not converging (I want to fit such a complex model because I am going to use it as the global model upon which to perform model averaging). However, I then tried alternative packages that can fit GLMs with negative binomial distributions in R. When I tried glmmTMB and brglm2, the models converged and I got the following (very similar) results:
library(glmmTMB)
library(brglm2)
> #fit model using glmmTMB
> glm2 <- glmmTMB(response_var~pred1*pred2 + pred1*pred3 + pred1*pred4 + pred1*pred5 + pred6 + offset(log(offset_var)), data = df, family = nbinom2)
> summary(glm2)
Family: nbinom2 ( log )
Formula: response_var ~ pred1 * pred2 + pred1 * pred3 + pred1 * pred4 + pred1 * pred5 + pred6 + offset(log(offset_var))
Data: df
AIC BIC logLik deviance df.resid
538.8 561.3 -257.4 514.8 36
Dispersion parameter for nbinom2 family (): 0.815
Conditional model:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 7.09772 1.36899 5.185 2.16e-07 ***
pred1 -5.42650 2.23604 -2.427 0.0152 *
pred2 -1.10202 0.79743 -1.382 0.1670
pred3 -0.39245 1.14313 -0.343 0.7314
pred4 -1.33843 1.32065 -1.013 0.3108
pred5 -0.07125 1.09954 -0.065 0.9483
pred6 -0.17567 1.03227 -0.170 0.8649
pred1:pred2 2.80613 1.69120 1.659 0.0971 .
pred1:pred3 1.04996 2.04619 0.513 0.6079
pred1:pred4 1.15544 2.37031 0.487 0.6259
pred1:pred5 -0.10083 2.31516 -0.044 0.9653
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> #fit model using brglm2
> glm_3 <- brnb(response_var~pred1*pred2 + pred1*pred3 + pred1*pred4 + pred1*pred5 + pred6 + offset(log(offset_var)), data = df, link = "log", type = "ML", transformation = "identity")
> summary(glm_3)
Call:
brnb(formula = response_var ~ pred1 * pred2 + pred1 * pred3 + pred1 * pred4 + pred1 * pred5 + pred6 + offset(log(offset_var)), data = df, link = "log", type = "ML",
transformation = "identity")
Coefficients (mean model with log link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) 7.09773 1.18882 5.970 2.37e-09 ***
pred1 -5.42651 2.00939 -2.701 0.00692 **
pred2 -1.10203 0.82354 -1.338 0.18084
pred3 -0.39243 1.03189 -0.380 0.70372
pred4 -1.33854 1.11593 -1.199 0.23034
pred5 -0.07116 1.00813 -0.071 0.94373
pred6 -0.17567 1.00911 -0.174 0.86180
pred1:pred2 2.80608 1.55187 1.808 0.07058 .
pred1:pred3 1.05001 2.03198 0.517 0.60534
pred1:pred4 1.15563 2.07750 0.556 0.57803
pred1:pred5 -0.10104 1.99736 -0.051 0.95965
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter with identity transformation function: 1.2267971)
Null deviance: 81.222 on 47 degrees of freedom
Residual deviance: 57.153 on 37 degrees of freedom
AIC: 538.825
Type of estimator: ML (maximum likelihood)
Number of quasi-Fisher scoring iterations: 32
My questions are:
- Why would the model not converge using glm.nb but converge using the other two packages? Should I be suspicious of the results from glmmTMB or brglm2, given the convergence issues with glm.nb?
- glmmTMB was by far the fastest, so I'd like to use that package going forward. However, I am wondering: given that glmmTMB is a package for generalized linear mixed modelling, is there any issue with fitting a model in glmmTMB without random effects? Or is the only tradeoff the slightly higher standard errors for the coefficients fit using glmmTMB compared with those from brglm2?
The parameter estimates and related statistics produced by glm.nb() should be the same as those you get from the glmmTMB() function in the package of the same name, fitted with only the same fixed effects. However, I have just come across a situation where both the model coefficients and the AIC values differed between the two packages. Normally they agree.
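For instance, a quick side-by-side check of the mean-model coefficients (a sketch assuming the glm1 and glm2 objects fitted in the question above) would be:

```r
library(glmmTMB)

# Compare the fixed-effect estimates from MASS::glm.nb (glm1) and
# glmmTMB (glm2, family = nbinom2) fitted to the same formula and data.
comparison <- cbind(glm.nb = coef(glm1), glmmTMB = fixef(glm2)$cond)
round(comparison, 4)
```

Small discrepancies can come from the two packages stopping at different points in the optimization rather than from a genuinely different model.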
A pretty cool thing about glmmTMB is that it allows you to consider alternative extensions of the Poisson family, such as the negative binomial type I (NB1, family=nbinom1), the mean-parameterized Conway-Maxwell-Poisson (family=compois) and the Generalized Poisson (family=genpois). You may want to have a look at these additional error structures to model your (I presume) count data.
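That comparison could be sketched as follows (using the df and formula from the question; nbinom1, nbinom2, compois, and genpois are all real glmmTMB family functions, though compois and genpois can be slow to fit):

```r
library(glmmTMB)

# Same mean model under several count families; compare by AIC.
form <- response_var ~ pred1*pred2 + pred1*pred3 + pred1*pred4 +
  pred1*pred5 + pred6 + offset(log(offset_var))

fits <- list(
  nb2     = glmmTMB(form, data = df, family = nbinom2),  # NB type II
  nb1     = glmmTMB(form, data = df, family = nbinom1),  # NB type I
  compois = glmmTMB(form, data = df, family = compois),  # Conway-Maxwell-Poisson
  genpois = glmmTMB(form, data = df, family = genpois)   # generalized Poisson
)

sapply(fits, AIC)
```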
Also, have you tried to not use an offset, but rather use this covariate as is and then, for your predictions, choose a constant value (e.g., mean) in the range of values for this covariate?
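That suggestion could be sketched as follows (again assuming df from the question): drop the offset term and include log(offset_var) as an ordinary predictor, so that its coefficient is estimated from the data rather than fixed at 1, and then hold it constant when predicting.

```r
library(glmmTMB)

# Offset variable entered as a free covariate instead of an offset.
glm_alt <- glmmTMB(response_var ~ pred1*pred2 + pred1*pred3 + pred1*pred4 +
                     pred1*pred5 + pred6 + log(offset_var),
                   data = df, family = nbinom2)

# For predictions, fix the covariate at a constant such as its mean.
newdat <- df
newdat$offset_var <- mean(df$offset_var)
head(predict(glm_alt, newdata = newdat, type = "response"))
```

If the estimated coefficient on log(offset_var) is close to 1, the offset formulation is well supported; if not, treating it as a covariate may be the safer choice.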
Good luck.