Is it valid to fit a generalized linear mixed-effects model with "time" as a continuous fixed effect but a categorical random effect?

Posted on 2025-01-16 07:43:15

I would like to fit a generalized linear mixed-effects model using an AR1 correlation structure in the R package glmmTMB. After checking the documentation

... it seems that "time" must be fitted as a factor for the random effects component i.e.

glmmTMB(y ~ ar1(time_fac + 0 | group), data=dat0, family=binomial(link = "logit"))

However, I wondered if it makes sense to fit this kind of model:

glmmTMB(y ~ ns(time_cont,3) + ar1(time_fac + 0 | group), 
    data=dat0, family=binomial(link = "logit"))

Here I have specified the fixed effect of time as a continuous natural spline whilst retaining time as a factor for the ar1 component. The model does fit, but I am not sure whether it makes sense. I also have not seen an explanation of why time must be coded as a factor for the AR1 component, or why we cannot fit a random slope.

酒解孤独 2025-01-23 07:43:15

Overall, I think this question is better suited to CrossValidated, but it is too old to migrate. Its individual components are a mixture of statistical-modeling and computational-implementation questions.

Does a model including both a smooth (natural spline) component and an AR1 component make sense? Yes, in theory; I show an example below of simulating from, and then fitting, such a model. However, there are some caveats on the model-fitting side:
- In some cases, smooth terms like splines and autoregressive terms will capture similar parts of the overall variation, so the model may run into identifiability problems.
- I would be especially worried about trying this with a Bernoulli (0/1) response (the example below uses a Gaussian), which carries very little information per observation. I might choose to include a fixed effect of the previous observation instead (i.e. logit(prob(y(t))) = b0 + b1*y(t-1)).
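The lagged-response alternative in the last bullet can be sketched in base R. This is only an illustration under stated assumptions: the data are simulated, the coefficient values `b0` and `b1` and the names `sim_one`, `dat_lag`, `y_lag`, and `fit_lag` are hypothetical, and a plain `glm()` without any random effects is used to keep the example small.

```r
set.seed(101)

n_group <- 10    # hypothetical number of groups
n_time  <- 100   # hypothetical number of time points per group
b0 <- -1         # hypothetical intercept on the logit scale
b1 <- 2          # hypothetical effect of the previous observation

# Simulate one group's 0/1 series where y(t) depends on y(t-1)
sim_one <- function(g) {
  y <- integer(n_time)
  y[1] <- rbinom(1, 1, plogis(b0))
  for (t in 2:n_time) {
    y[t] <- rbinom(1, 1, plogis(b0 + b1 * y[t - 1]))
  }
  data.frame(group = g,
             time_cont = seq_len(n_time),
             y = y,
             y_lag = c(NA, y[-n_time]))  # lag within the group; first value has no predecessor
}

dat_lag <- do.call(rbind, lapply(seq_len(n_group), sim_one))
dat_lag$group <- factor(dat_lag$group)

# logit(prob(y(t))) = b0 + b1 * y(t-1); rows with a missing lag are dropped by glm()
fit_lag <- glm(y ~ y_lag, family = binomial(link = "logit"), data = dat_lag)
coef(fit_lag)
```

The lag is built separately for each group so that the first observation of one group is never treated as following the last observation of another. A group-level random intercept could be added by switching to a mixed-model fitter, which is not shown here.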
Why must time be specified as a factor in an AR1 term? The section of the glmmTMB covariance-structure vignette on construction of structured covariance matrices (https://cran.r-project.org/web/packages/glmmTMB/vignettes/covstruct.html#construction-of-structed-covariance-matrices) explains in detail how the covariance matrices for AR1 (and other) structured models are defined; it should explain why time has to be a factor, and why the intercept needs to be suppressed.
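As a rough base-R illustration of that construction (my own sketch, not code from the vignette): with a factor and no intercept, the design matrix has one indicator column per time point, so each random-effect element belongs to exactly one time level, and an AR1 covariance can be laid over those levels in order. The values of `sigma` and `rho` and the five-level `time_fac` are hypothetical, and the sketch assumes the levels are equally spaced in time.

```r
# Five ordered, equally spaced time points coded as a factor (hypothetical)
time_fac <- factor(1:5)

# "time_fac + 0": one indicator column per time level, no intercept column
Z <- model.matrix(~ time_fac + 0)

# AR1 covariance over the levels: sigma^2 * rho^|i - j|
sigma <- 1.5   # hypothetical standard deviation
rho   <- 0.6   # hypothetical lag-1 correlation
idx   <- seq_len(nlevels(time_fac))
Sigma <- sigma^2 * rho^abs(outer(idx, idx, "-"))

# With the intercept left in, the first column is constant and the rest are
# contrasts against the first level, so columns no longer map one-to-one
# onto time points
Z_int <- model.matrix(~ time_fac)

print(round(Sigma, 3))
```

A numeric time variable would contribute a single column (a slope), which leaves nothing for a time-by-time covariance matrix to be laid over; that is one way to read the requirement that time be a factor here.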
Why can't we specify a random-slopes AR1 term? I don't know how an AR1 model with random slopes would be defined mathematically, or what it would mean ...

Simulated example

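library(glmmTMB)

# 10 groups observed at 100 equally spaced time points;
# time_fac is the factor version of time needed by ar1()
dd <- expand.grid(time_cont = 1:100, group = factor(1:10)) |>
    transform(time_fac = factor(time_cont))
# natural-spline fixed effect of continuous time + AR1 term on the time factor
form <- y ~ splines::ns(time_cont,3) + ar1(time_fac + 0 | group)
# simulate a response from the right-hand side only (form[-2] drops y)
dd$y <- simulate_new(form[-2],
            newdata = dd,
            newparams = list(beta = 0:3,      # intercept + 3 spline coefficients
                             theta = c(1,2),  # the two AR1 parameters
                             betad = 1),      # dispersion; check ?simulate_new for the name your version expects
            seed = 101)[[1]]
# fit the same model to the simulated data
glmmTMB(form, data=dd, family=gaussian)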
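To compare the fitted AR1 estimates with the simulation inputs, it helps to put `theta = c(1,2)` on the natural scale. The sketch below assumes the parameterization I understand the covariance-structure vignette to describe for `ar1` (first element is the log standard deviation, second maps to the correlation as theta/sqrt(1 + theta^2)); treat that mapping as an assumption and check it against the vignette for your installed version. The names `sd_ar1` and `rho_ar1` are hypothetical.

```r
# Simulation inputs for the AR1 term, on glmmTMB's internal scale
theta <- c(1, 2)

# Assumed mapping (see lead-in): log-SD and a transformed correlation
sd_ar1  <- exp(theta[1])
rho_ar1 <- theta[2] / sqrt(1 + theta[2]^2)

round(c(sd = sd_ar1, rho = rho_ar1), 3)
```

Under that assumption the simulated AR1 process has a standard deviation of about 2.718 and a lag-1 correlation of about 0.894, which are the values to look for in the random-effects summary of the fit.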
