具有随时间自相关的 GLS / GLM 嵌套设计

发布于 2025-01-12 10:53:43 字数 718 浏览 0 评论 0原文

对 GLM 仍然相当陌生，并且对如何建立我的模型有点困惑。

关于我的项目：

我从 9 棵树 (=tree1_cat) 样本的根系中采样了微生物组（并测量了多样性指数值 = Shannon）。

在每棵树中，我对细根和粗根（=rootpart）进行了采样，并且在一个季节的过程中对每棵树进行了四次采样（=天）。因此，我有一个嵌套设计，但必须记住自相关的时间。而且并非所有值都存在，因此我有一些缺失值）。到目前为止，我已经尝试并测试了以下内容：

Model <- gls(Shannon ~ tree1_cat/rootpart + tree1_cat + days,  
     na.action = na.omit, data = psL.meta, 
     correlation = corAR1(form =~ 1|days), 
     weights = varIdent(form= ~ 1|days))

此外，我还尝试获得更多见解并使用 anova(Model) 来获取这些因素的 p 值。我可以使用这些 p 值吗？另外，我还使用 emmeans(Model, specs =pairwise ~ rootpart) 进行成对比较，但由于 rootpart 是作为嵌套因子输入的，所以它只提供了配对交互。

这一切都有效，但我不确定这是否是正确的模型！任何帮助将不胜感激！

原文

Still fairly new to GLM and a bit confused about how to establish my model.

About my project:

I sampled the microbiome (and measured a diversity index value = Shannon) from the root system of a sample of 9 trees (=tree1_cat).

In each tree I sampled fine and thick roots (=rootpart) and each tree was sampled four times (=days) over the course of one season. Thus I have a nested design but have to keep time in mind for autocorrelation. Also not all values are present, thus I have a few missing values). So far I have tried and tested the following:

Model <- gls(Shannon ~ tree1_cat/rootpart + tree1_cat + days,  
     na.action = na.omit, data = psL.meta, 
     correlation = corAR1(form =~ 1|days), 
     weights = varIdent(form= ~ 1|days))

Furthermore I've tried to get more insight and used anova(Model) to get the p-values of those factors. Am I allowed to use those p-values? Also I've used emmeans(Model, specs = pairwise ~ rootpart)for pairwise comparisons but since rootpart was entered as nested factor it only gives me the paired interactions.

It all works, but I am not sure, whether this is the right model! Any help would be highly appreciated!

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

琉璃梦幻 2025-01-19 10:53:43

了解您的科学问题会很有帮助，但假设您对细根和粗根之间的香农多样性以及时间趋势的差异感兴趣。您可以使用的模型是：

library(lmerTest)
lmer(Shannon ~ rootpart*days + (rootpart*days|tree1_cat), data = ...)

固定效应组件 rootpart*days 可以扩展为 1 + rootpart + days + rootpart:days （其中 1< /code> 表示截距）

截距：第 0 天细根的 SD（希望这是季节的开始）
rootpart：第 0 天细根和粗根之间的差异
days：每天的变化（以 SD 表示）细根（斜率）
rootpart:days 粗根和细根之间的斜率差异

随机效应分量(rootpart*days|tree1_cat) 测量所有四个部分的表现这些影响的差异因树木而异，而且它们之间的相关性也不同（例如，第 0 天细根和粗根之间差异大于平均水平的树木，随着时间的推移，细根 SD 的变化是否也会大于平均水平？ )

这个“最大”随机效应模型几乎肯定对于您的数据来说过于复杂；根据粗略的经验法则，每个估计参数应该有 10-20 个数据点，固定效应模型需要 4 个参数。具有 4 个随机效应的完整模型需要估计 4×4 协方差矩阵，该矩阵本身具有 (4*5)/2 = 10 个参数。我可能只是尝试 (1+days|tree1_cat) （随机斜率）或 (rootpart|tree_cat) （细与粗差异中的树差异），倾向于允许您主要感兴趣的效果的变化（例如，如果您的主要问题是关于细与粗，那么请使用 (rootpart|tree_cat) 。

我可能不会担心自相关根本不，也不按天计算的异方差性（您的 varIdent(~1|days) 术语），除非这些模式在数据中非常明显。

如果您想允许自相关，则需要使用 来拟合模型。自相关模型的机制）；

library(nlme)
lme(Shannon ~ rootpart*days,
    random = ~days|tree1_cat,
    data = ...,
    correlation = corCAR1(form = ~days|tree1_cat)
)

>nlme::lme 或 glmmTMB （lmer 仍然没有用于对于不均匀采样的数据，corCAR1（连续时间自回归阶数 1）而不是更常见的 corAR1 请注意，lme 更加挑剔/。不擅长处理单一模型，因此您可能会发现需要简化模型才能真正运行该模型。

It would be helpful to know your scientific question, but let's suppose you're interested in differences in Shannon diversity between fine and thick roots and in time trends. A model you could use would be:

library(lmerTest)
lmer(Shannon ~ rootpart*days + (rootpart*days|tree1_cat), data = ...)

The fixed-effect component rootpart*days can be expanded into 1 + rootpart + days + rootpart:days (where 1 signifies the intercept)

intercept: SD in fine roots on day 0 (hopefully that's the beginning of the season)
rootpart: difference between fine and thick roots on day 0
days: change per day in SD in fine roots (slope)
rootpart:days difference in slope between thick roots and fine roots

The random-effect component (rootpart*days|tree1_cat) measures how all four of these effects vary across trees, and their correlations (e.g. do trees with a larger-than-average difference between fine and thick roots on day 0 also have a larger-than-average change over time in fine root SD?)

This 'maximal' random effects model is almost certainly too complex for your data; a rough rule of thumb says you should have 10-20 data points per parameter estimated, the fixed-effect model takes 4 parameters. A full model with 4 random effects requires the estimate of a 4×4 covariance matrix, which has (4*5)/2 = 10 parameters all by itself. I might just try (1+days|tree1_cat) (random slopes) or (rootpart|tree_cat) (among-tree difference in fine vs. thick differences), with a bias towards allowing for the variation in the effect that is your primary interest (e.g. if your primary question is about fine vs. thick then go with (rootpart|tree_cat).

I probably wouldn't worry about autocorrelation at all, nor heteroscedasticity by day (your varIdent(~1|days) term) unless those patterns are very strongly evident in the data.

If you want to allow for autocorrelation you'll need to fit the model with nlme::lme or glmmTMB (lmer still doesn't have machinery for autocorrelation models); something like

library(nlme)
lme(Shannon ~ rootpart*days,
    random = ~days|tree1_cat,
    data = ...,
    correlation = corCAR1(form = ~days|tree1_cat)
)

You need to use corCAR1 (continuous-time autoregressive order-1) rather than the more common corAR1 for unevenly sampled data. Be aware that lme is more finicky/worse at dealing with singular models, so you may discover you need to simplify your model before you can actually get this model to run.

回复收藏 0 原文

~没有更多了~