如何通过观察提取 lmer 固定效应？

发布于 2024-12-23 15:56:01 字数 4615 浏览 5 评论 0原文

我有一个 lme 对象，它是根据一些重复测量的营养摄入数据（每个 RespondentID 两个 24 小时摄入期）构建的：

Male.lme2 <- lmer(BoxCoxXY ~ -1 + AgeFactor + IntakeDay + (1|RespondentID),
    data = Male.Data, 
    weights = SampleWeight)

并且我可以使用 ranef(Male. lme1）。我还想通过RespondentID收集固定效应的结果。 coef(Male.lme1) 没有完全提供我所需要的内容，如下所示。

> summary(Male.lme1)
Linear mixed model fit by REML 
Formula: BoxCoxXY ~ AgeFactor + IntakeDay + (1 | RespondentID) 
   Data: Male.Data 
  AIC   BIC logLik deviance REMLdev
  9994 10039  -4990     9952    9980
Random effects:
 Groups       Name        Variance Std.Dev.
 RespondentID (Intercept) 0.19408  0.44055 
 Residual                 0.37491  0.61230 
Number of obs: 4498, groups: RespondentID, 2249

Fixed effects:
                    Estimate Std. Error t value
(Intercept)         13.98016    0.03405   410.6
AgeFactor4to8        0.50572    0.04084    12.4
AgeFactor9to13       0.94329    0.04159    22.7
AgeFactor14to18      1.30654    0.04312    30.3
IntakeDayDay2Intake -0.13871    0.01809    -7.7

Correlation of Fixed Effects:
            (Intr) AgFc48 AgF913 AF1418
AgeFactr4t8 -0.775                     
AgeFctr9t13 -0.761  0.634              
AgFctr14t18 -0.734  0.612  0.601       
IntkDyDy2In -0.266  0.000  0.000  0.000

我已将拟合结果附加到我的数据中，head(Male.Data) 显示

   NutrientID RespondentID Gender Age SampleWeight  IntakeDay IntakeAmt AgeFactor BoxCoxXY  lmefits
2         267       100020      1  12    0.4952835 Day1Intake 12145.852     9to13 15.61196 15.22633
7         267       100419      1  14    0.3632839 Day1Intake  9591.953    14to18 15.01444 15.31373
8         267       100459      1  11    0.4952835 Day1Intake  7838.713     9to13 14.51458 15.00062
12        267       101138      1  15    1.3258785 Day1Intake 11113.266    14to18 15.38541 15.75337
14        267       101214      1   6    2.1198688 Day1Intake  7150.133      4to8 14.29022 14.32658
18        267       101389      1   5    2.1198688 Day1Intake  5091.528      4to8 13.47928 14.58117

coef(Male.lme1) 的前几行是：

$RespondentID
       (Intercept) AgeFactor4to8 AgeFactor9to13 AgeFactor14to18 IntakeDayDay2Intake
100020    14.28304     0.5057221      0.9432941        1.306542          -0.1387098
100419    14.00719     0.5057221      0.9432941        1.306542          -0.1387098
100459    14.05732     0.5057221      0.9432941        1.306542          -0.1387098
101138    14.44682     0.5057221      0.9432941        1.306542          -0.1387098
101214    13.82086     0.5057221      0.9432941        1.306542          -0.1387098
101389    14.07545     0.5057221      0.9432941        1.306542          -0.1387098

演示 < code>coef 结果与 Male.Data 中的拟合估计值相关（对于第一个 RespondentID，使用 Male.Data$lmefits <-fitting(Male.lme1) 获取该估计值有年龄因素 9-13 级： - 拟合值为15.22633，等于 - 来自系数 - (Intercept) + (AgeFactor9-13) = 14.28304 + 0.9432941

有一个聪明的命令供我使用吗这将自动实现我想要的，即提取每个主题的固定效应估计，或者我面临一系列 if 语句在扣除截距的随机效应贡献后，尝试将正确的年龄因子水平应用于每个受试者以获得正确的固定效应估计？

更新，抱歉，试图减少我提供的输出并忘记了 str()。输出为：

>str(Male.Data)
'data.frame':   4498 obs. of  11 variables:
 $ NutrientID  : int  267 267 267 267 267 267 267 267 267 267 ...
 $ RespondentID: Factor w/ 2249 levels "100020","100419",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ Gender      : int  1 1 1 1 1 1 1 1 1 1 ...
 $ Age         : int  12 14 11 15 6 5 10 2 2 9 ...
 $ BodyWeight  : num  51.6 46.3 46.1 63.2 28.4 18 38.2 14.4 14.6 32.1 ...
 $ SampleWeight: num  0.495 0.363 0.495 1.326 2.12 ...
 $ IntakeDay   : Factor w/ 2 levels "Day1Intake","Day2Intake": 1 1 1 1 1 1 1 1 1 1 ...
 $ IntakeAmt   : num  12146 9592 7839 11113 7150 ...
 $ AgeFactor   : Factor w/ 4 levels "1to3","4to8",..: 3 4 3 4 2 2 3 1 1 3 ...
 $ BoxCoxXY    : num  15.6 15 14.5 15.4 14.3 ...
 $ lmefits     : num  15.2 15.3 15 15.8 14.3 ...

未使用 BodyWeight 和 Gender（这是男性数据，因此所有 Gender 值都相同），并且 NutrientID 对于数据也同样固定。

自从我发布以来，我一直在做可怕的 ifelse 语句，所以我会立即尝试你的建议。 :)

Update2：这与我当前的数据完美配合，并且应该能够适应新数据的未来，感谢 DWin 在评论中提供的额外帮助。 :)

AgeLevels <- length(unique(Male.Data$AgeFactor))
Temp <- as.data.frame(fixef(Male.lme1)['(Intercept)'] + 
c(0,fixef(Male.lme1)[2:AgeLevels])[
      match(Male.Data$AgeFactor, c("1to3", "4to8", "9to13","14to18",  "19to30","31to50","51to70","71Plus") )] + 
c(0,fixef(Male.lme1)[(AgeLevels+1)])[
      match(Male.Data$IntakeDay, c("Day1Intake","Day2Intake") )])
names(Temp) <- c("FxdEffct")

原文

I have a lme object, constructed from some repeated measures nutrient intake data (two 24-hour intake periods per RespondentID):

Male.lme2 <- lmer(BoxCoxXY ~ -1 + AgeFactor + IntakeDay + (1|RespondentID),
    data = Male.Data, 
    weights = SampleWeight)

and I can successfully retrieve the random effects by RespondentID using ranef(Male.lme1). I would also like to collect the result of the fixed effects by RespondentID. coef(Male.lme1) does not provide exactly what I need, as I show below.

> summary(Male.lme1)
Linear mixed model fit by REML 
Formula: BoxCoxXY ~ AgeFactor + IntakeDay + (1 | RespondentID) 
   Data: Male.Data 
  AIC   BIC logLik deviance REMLdev
  9994 10039  -4990     9952    9980
Random effects:
 Groups       Name        Variance Std.Dev.
 RespondentID (Intercept) 0.19408  0.44055 
 Residual                 0.37491  0.61230 
Number of obs: 4498, groups: RespondentID, 2249

Fixed effects:
                    Estimate Std. Error t value
(Intercept)         13.98016    0.03405   410.6
AgeFactor4to8        0.50572    0.04084    12.4
AgeFactor9to13       0.94329    0.04159    22.7
AgeFactor14to18      1.30654    0.04312    30.3
IntakeDayDay2Intake -0.13871    0.01809    -7.7

Correlation of Fixed Effects:
            (Intr) AgFc48 AgF913 AF1418
AgeFactr4t8 -0.775                     
AgeFctr9t13 -0.761  0.634              
AgFctr14t18 -0.734  0.612  0.601       
IntkDyDy2In -0.266  0.000  0.000  0.000

I have appended the fitted results to my data, head(Male.Data) shows

   NutrientID RespondentID Gender Age SampleWeight  IntakeDay IntakeAmt AgeFactor BoxCoxXY  lmefits
2         267       100020      1  12    0.4952835 Day1Intake 12145.852     9to13 15.61196 15.22633
7         267       100419      1  14    0.3632839 Day1Intake  9591.953    14to18 15.01444 15.31373
8         267       100459      1  11    0.4952835 Day1Intake  7838.713     9to13 14.51458 15.00062
12        267       101138      1  15    1.3258785 Day1Intake 11113.266    14to18 15.38541 15.75337
14        267       101214      1   6    2.1198688 Day1Intake  7150.133      4to8 14.29022 14.32658
18        267       101389      1   5    2.1198688 Day1Intake  5091.528      4to8 13.47928 14.58117

The first couple of lines from coef(Male.lme1) are:

$RespondentID
       (Intercept) AgeFactor4to8 AgeFactor9to13 AgeFactor14to18 IntakeDayDay2Intake
100020    14.28304     0.5057221      0.9432941        1.306542          -0.1387098
100419    14.00719     0.5057221      0.9432941        1.306542          -0.1387098
100459    14.05732     0.5057221      0.9432941        1.306542          -0.1387098
101138    14.44682     0.5057221      0.9432941        1.306542          -0.1387098
101214    13.82086     0.5057221      0.9432941        1.306542          -0.1387098
101389    14.07545     0.5057221      0.9432941        1.306542          -0.1387098

To demonstrate how the coef results relate to the fitted estimates in Male.Data (which were grabbed using Male.Data$lmefits <- fitted(Male.lme1), for the first RespondentID, who has the AgeFactor level 9-13:
- the fitted value is 15.22633, which equals - from the coeffs - (Intercept) + (AgeFactor9-13) = 14.28304 + 0.9432941

Is there a clever command for me to use that will do want I want automatically, which is to extract the fixed effect estimate for each subject, or am I faced with a series of if statements trying to apply the correct AgeFactor level to each subject to get the correct fixed effect estimate, after deducting the random effect contribution off the Intercept?

Update, apologies, was trying to cut down on the output I was providing and forgot about str(). Output is:

>str(Male.Data)
'data.frame':   4498 obs. of  11 variables:
 $ NutrientID  : int  267 267 267 267 267 267 267 267 267 267 ...
 $ RespondentID: Factor w/ 2249 levels "100020","100419",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ Gender      : int  1 1 1 1 1 1 1 1 1 1 ...
 $ Age         : int  12 14 11 15 6 5 10 2 2 9 ...
 $ BodyWeight  : num  51.6 46.3 46.1 63.2 28.4 18 38.2 14.4 14.6 32.1 ...
 $ SampleWeight: num  0.495 0.363 0.495 1.326 2.12 ...
 $ IntakeDay   : Factor w/ 2 levels "Day1Intake","Day2Intake": 1 1 1 1 1 1 1 1 1 1 ...
 $ IntakeAmt   : num  12146 9592 7839 11113 7150 ...
 $ AgeFactor   : Factor w/ 4 levels "1to3","4to8",..: 3 4 3 4 2 2 3 1 1 3 ...
 $ BoxCoxXY    : num  15.6 15 14.5 15.4 14.3 ...
 $ lmefits     : num  15.2 15.3 15 15.8 14.3 ...

The BodyWeight and Gender aren't being used (this is the males data, so all the Gender values are the same) and the NutrientID is similarly fixed for the data.

I have been doing horrible ifelse statements sinced I posted, so will try out your suggestion immediately. :)

Update2: this works perfectly with my current data and should be future-proof for new data, thanks to DWin for the extra help in the comment for this. :)

AgeLevels <- length(unique(Male.Data$AgeFactor))
Temp <- as.data.frame(fixef(Male.lme1)['(Intercept)'] + 
c(0,fixef(Male.lme1)[2:AgeLevels])[
      match(Male.Data$AgeFactor, c("1to3", "4to8", "9to13","14to18",  "19to30","31to50","51to70","71Plus") )] + 
c(0,fixef(Male.lme1)[(AgeLevels+1)])[
      match(Male.Data$IntakeDay, c("Day1Intake","Day2Intake") )])
names(Temp) <- c("FxdEffct")

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

苍风燃霜 2024-12-30 15:56:01

下面是我一直发现在 lme4 包中提取个人固定效应和随机效应组件的最简单方法。它实际上提取了每个观察结果的相应拟合。假设我们有一个混合效应模型：

y = Xb + Zu + e

其中 Xb 是固定效应，Zu 是随机效应，我们可以提取组件（以 lme4 的 sleepstudy 为例）：

library(lme4)
fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)

# Xb 
fix <- getME(fm1,'X') %*% fixef(fm1)
# Zu
ran <- t(as.matrix(getME(fm1,'Zt'))) %*% unlist(ranef(fm1))
# Xb + Zu
fixran <- fix + ran

我知道这是从线性混合效应模型中提取成分的通用方法。对于非线性模型，模型矩阵 X 包含重复，您可能需要稍微修改上面的代码。以下是一些验证输出以及使用点阵的可视化：

> head(cbind(fix, ran, fixran, fitted(fm1)))
         [,1]      [,2]     [,3]     [,4]
[1,] 251.4051  2.257187 253.6623 253.6623
[2,] 261.8724 11.456439 273.3288 273.3288
[3,] 272.3397 20.655691 292.9954 292.9954
[4,] 282.8070 29.854944 312.6619 312.6619
[5,] 293.2742 39.054196 332.3284 332.3284
[6,] 303.7415 48.253449 351.9950 351.9950

# Xb + Zu
> all(round((fixran),6) == round(fitted(fm1),6))
[1] TRUE

# e = y - (Xb + Zu)
> all(round(resid(fm1),6) == round(sleepstudy[,"Reaction"]-(fixran),6))
[1] TRUE

nobs <- 10 # 10 observations per subject
legend = list(text=list(c("y", "Xb + Zu", "Xb")), lines = list(col=c("blue", "red", "black"), pch=c(1,1,1), lwd=c(1,1,1), type=c("b","b","b")))
require(lattice)
xyplot(
    Reaction ~ Days | Subject, data = sleepstudy,
    panel = function(x, y, ...){
        panel.points(x, y, type='b', col='blue')
        panel.points(x, fix[(1+nobs*(panel.number()-1)):(nobs*(panel.number()))], type='b', col='black')
        panel.points(x, fixran[(1+nobs*(panel.number()-1)):(nobs*(panel.number()))], type='b', col='red')
    },
    key = legend
)

在此处输入图像描述

Below is how I've always found it easiest to extract the individuals' fixed effects and random effects components in the lme4-package. It actually extracts the corresponding fit to each observation. Assuming we have a mixed-effects model of form:

y = Xb + Zu + e

where Xb are the fixed effects and Zu are the random effects, we can extract the components (using lme4's sleepstudy as an example):

library(lme4)
fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)

# Xb 
fix <- getME(fm1,'X') %*% fixef(fm1)
# Zu
ran <- t(as.matrix(getME(fm1,'Zt'))) %*% unlist(ranef(fm1))
# Xb + Zu
fixran <- fix + ran

I know that this works as a generalized approach to extracting components from linear mixed-effects models. For non-linear models, the model matrix X contains repeats and you may have to tailor the above code a bit. Here's some validation output as well as a visualization using lattice:

> head(cbind(fix, ran, fixran, fitted(fm1)))
         [,1]      [,2]     [,3]     [,4]
[1,] 251.4051  2.257187 253.6623 253.6623
[2,] 261.8724 11.456439 273.3288 273.3288
[3,] 272.3397 20.655691 292.9954 292.9954
[4,] 282.8070 29.854944 312.6619 312.6619
[5,] 293.2742 39.054196 332.3284 332.3284
[6,] 303.7415 48.253449 351.9950 351.9950

# Xb + Zu
> all(round((fixran),6) == round(fitted(fm1),6))
[1] TRUE

# e = y - (Xb + Zu)
> all(round(resid(fm1),6) == round(sleepstudy[,"Reaction"]-(fixran),6))
[1] TRUE

nobs <- 10 # 10 observations per subject
legend = list(text=list(c("y", "Xb + Zu", "Xb")), lines = list(col=c("blue", "red", "black"), pch=c(1,1,1), lwd=c(1,1,1), type=c("b","b","b")))
require(lattice)
xyplot(
    Reaction ~ Days | Subject, data = sleepstudy,
    panel = function(x, y, ...){
        panel.points(x, y, type='b', col='blue')
        panel.points(x, fix[(1+nobs*(panel.number()-1)):(nobs*(panel.number()))], type='b', col='black')
        panel.points(x, fixran[(1+nobs*(panel.number()-1)):(nobs*(panel.number()))], type='b', col='red')
    },
    key = legend
)

enter image description here

回复收藏 0 原文

涙—继续流 2024-12-30 15:56:01

它会是这样的（尽管您确实应该给我们 str(Male.Data) 的结果，因为模型输出不会告诉我们基线值的因子水平

#First look at the coefficients
fixef(Male.lme2)

#Then do the calculations
fixef(Male.lme2)[`(Intercept)`] + 
c(0,fixef(Male.lme2)[2:4])[
          match(Male.Data$AgeFactor, c("1to3", "4to8", "9to13","14to18") )] + 
c(0,fixef(Male.lme2)[5])[
          match(Male.Data$IntakeDay, c("Day1Intake","Day2Intake") )]

:)基本上通过 match 函数运行原始数据来选择正确的系数添加到截距中...如果数据是因子的基本水平（我猜测其拼写），则该系数将为 0在。）

编辑：我刚刚注意到你在公式中输入“-1”，这样也许所有的 AgeFactor 项都会在输出中列出，并且您可以找出系数向量中的 0 以及匹配表向量中发明的 AgeFactor 级别。

It is going to be something like this (although you really should have given us the results of str(Male.Data) because model output does not tell us the factor levels for the baseline values:)

#First look at the coefficients
fixef(Male.lme2)

#Then do the calculations
fixef(Male.lme2)[`(Intercept)`] + 
c(0,fixef(Male.lme2)[2:4])[
          match(Male.Data$AgeFactor, c("1to3", "4to8", "9to13","14to18") )] + 
c(0,fixef(Male.lme2)[5])[
          match(Male.Data$IntakeDay, c("Day1Intake","Day2Intake") )]

You are basically running the original data through a match function to pick the correct coefficient(s) to add to the intercept ... which will be 0 if the data is the factor's base level (whose spelling I am guessing at.)

EDIT: I just noticed that you put a "-1" in the formula so perhaps all of your AgeFactor terms are listed in the output and you can tale out the 0 in the coefficient vector and the invented AgeFactor level in the match table vector.

回复收藏 0 原文

~没有更多了~