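Growth curve model using mice
I imputed my data using mice, and I am now trying to do growth curve modelling. I am at the stage of assessing whether multilevel modelling is needed, and this is my code: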
ICept <-gls(edeqGLOBAL_mean ~ 1, data=Imputed, method = "ML", na.action=na.exclude)
RICept <-lme(edeqGLOBAL_mean ~ 1, data=Imputed, random=~1|ID, method = "ML", na.actioin=na.exclude, control=c(optim="optim"))
and this is the error message I receive:
Error in as.data.frame.default(data, optional = TRUE)
Any help on what to do?
Comments (1)
First of all, you need to understand what multiple imputation means: it creates several imputations for each missing value. Therefore, the
mids object is essentially a list of data frames that have slightly different imputations for the missing values. The variance between these imputations represents your uncertainty about the missing data.

Because a mids object is not just a data.frame, you cannot use it the same way. Analyzing multiply imputed data involves two steps: first, apply the analysis to each imputed data set; second, aggregate the results (i.e. model coefficients etc.) according to Rubin's rules, so that you get an overall estimate as well as standard errors that include the variance between imputations.

For several statistical functions (e.g. lm, glm, anova), the mice package provides an easy implementation of these two steps. A simple linear regression, for instance, can be conducted directly on a mids object with with() and pool().

Now, for nlme::lme() and gls() these methods are not readily implemented. You will have to do a bit of programming instead. Specifically, your code should involve the following:

1. Write a function that runs the analysis on a single data frame and returns the coefficient table. For a gls fit, summary() stores that table in $tTable:
ICept_fun <- function(dat) summary(gls(edeqGLOBAL_mean ~ 1, method="ML", data=dat))$tTable
2. Apply your function to each imputed data set:
a. Extract the data sets from the mids object:
Imputed_list <- lapply(1:Imputed$m, function(i) complete(Imputed, action=i))
b. Apply the function:
ICept_list <- lapply(Imputed_list, ICept_fun)
3. Pool the results with the pool.scalar function from mice. It is meant for pooling any normally distributed (!) statistic that you calculated from a multiply imputed data set. This also means that you may have to transform some of your statistics of interest before you can apply pool.scalar, and back-transform them afterwards. For example, correlations should be transformed to Fisher's Z. Please see this useful vignette if you are not sure about your statistics of interest.

The pool.scalar() function wants vectors of estimates (argument Q) and respective variances (argument U), so you need to reshape the list of results a bit. This may or may not look like this, depending on your function ICept_fun:
ICept_Qs <- sapply(ICept_list, function(x) x["(Intercept)", 1])
ICept_Us <- sapply(ICept_list, function(x) x["(Intercept)", 2]^2) # squared SE as the variance estimate
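
To make the three steps concrete, here is a minimal end-to-end sketch. It is only an illustration under assumptions: the nhanes data set that ships with mice stands in for the real data, bmi stands in for edeqGLOBAL_mean, and the lower-case object names are made up for this example. It calls pool.scalar() with only Q and U and leaves the other arguments at their defaults.

```r
library(mice)  # mice(), complete(), pool.scalar()
library(nlme)  # gls()

# Assumption: nhanes (shipped with mice) stands in for the real data,
# and bmi stands in for edeqGLOBAL_mean.
imp <- mice(nhanes, m = 5, seed = 123, printFlag = FALSE)

# Step 1: analysis for one completed data set.
# summary() of a gls fit keeps its coefficient table in $tTable.
icept_fun <- function(dat) {
  summary(gls(bmi ~ 1, data = dat, method = "ML"))$tTable
}

# Step 2a: extract the completed data sets from the mids object.
imp_list <- lapply(seq_len(imp$m), function(i) complete(imp, action = i))

# Step 2b: apply the analysis function to each of them.
icept_list <- lapply(imp_list, icept_fun)

# Step 3: pool.scalar() wants plain numeric vectors, so flatten the results.
Qs <- unlist(lapply(icept_list, function(x) x["(Intercept)", "Value"]))
Us <- unlist(lapply(icept_list, function(x) x["(Intercept)", "Std.Error"]^2))

pooled <- pool.scalar(Q = Qs, U = Us)
pooled$qbar     # pooled estimate of the intercept
sqrt(pooled$t)  # pooled standard error (within plus between variance)
```

The same pattern should carry over to the random-intercept lme() model: only the analysis function and the row picked from the coefficient table change.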
If it turns out you need multilevel modelling, please be aware:
Multiple imputation of multilevel data brings its very own additional problems. In the imputation itself, you should take the multilevel structure of the data into account. If you just applied mice::mice() to your (long-format) data set, this is not correct. One alternative method is to make a wide-format data set, conduct multiple imputation, and then reshape the resulting imputed data sets back to long format. In this case, the reshaping back to long format would take place between steps 2a and 2b described above. As to whether this is the preferred method, I do not know. There are some good sources about this, for instance this vignette.
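
The wide-format route described above could be sketched roughly as follows. Everything here is an assumption made for illustration: the data are simulated, and the names ID, time, y, to_long and fit_fun are hypothetical. It shows where the reshaping sits between steps 2a and 2b; it does not settle whether this is the preferred way to impute multilevel data.

```r
library(mice)  # mice(), complete(), pool.scalar()
library(nlme)  # lme()

set.seed(42)

# Hypothetical long-format data: 40 subjects (ID), 3 occasions (time), outcome y.
n_id <- 40
long <- data.frame(ID = rep(seq_len(n_id), each = 3),
                   time = rep(0:2, times = n_id))
long$y <- 10 + rep(rnorm(n_id), each = 3) + 0.5 * long$time +
  rnorm(nrow(long), sd = 0.5)
long$y[sample(nrow(long), 15)] <- NA  # make some outcomes missing

# Long -> wide: one row per subject, columns y_0, y_1, y_2.
wide <- reshape(long, idvar = "ID", timevar = "time",
                direction = "wide", sep = "_")

# Impute the wide data; ID is left out so it is not used as a predictor.
y_cols <- c("y_0", "y_1", "y_2")
imp <- mice(wide[, y_cols], m = 5, seed = 42, printFlag = FALSE)

# Step 2a: extract each completed data set, then reshape it back to long.
to_long <- function(i) {
  w <- cbind(ID = wide$ID, complete(imp, action = i))
  l <- reshape(w, direction = "long", idvar = "ID", varying = y_cols,
               v.names = "y", timevar = "time", times = 0:2)
  l[order(l$ID, l$time), ]
}
long_list <- lapply(seq_len(imp$m), to_long)

# Step 2b: fit the random-intercept growth model to each long data set.
fit_fun <- function(dat) {
  summary(lme(y ~ time, random = ~ 1 | ID, data = dat, method = "ML"))$tTable
}
fit_list <- lapply(long_list, fit_fun)

# Step 3: pool the slope of time across imputations.
Qs <- sapply(fit_list, function(x) x["time", "Value"])
Us <- sapply(fit_list, function(x) x["time", "Std.Error"]^2)
pooled <- pool.scalar(Q = Qs, U = Us)
pooled$qbar     # pooled slope
sqrt(pooled$t)  # pooled standard error
```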