Growth curve modeling with MICE-imputed data

Posted 2025-01-27 18:55:00

I used MICE to impute data, and now I am trying to do growth curve modeling. I'm at the stage of assessing whether there is a need for multilevel modelling. Here is my code:

ICept <-gls(edeqGLOBAL_mean ~ 1, data=Imputed, method = "ML", na.action=na.exclude)
RICept <-lme(edeqGLOBAL_mean ~ 1, data=Imputed, random=~1|ID, method = "ML", na.actioin=na.exclude, control=c(optim="optim"))

and this is the error message I am getting

Error in as.data.frame.default(data, optional = TRUE) : cannot coerce class ‘"mids"’ to a data.frame

Any help as to what to do?

只怪假的太真实 2025-02-03 18:55:00

First of all, you need to understand what multiple imputation means: it creates several imputations for each missing value. The mids object is therefore essentially a list of data frames that have slightly different imputations for the missing values. The variance between these imputations represents your uncertainty about the missing data.
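To make this concrete, here is a minimal sketch that inspects a mids object. It assumes the mice package is installed and uses `nhanes`, a small example dataset shipped with mice, as a stand-in for your own data; the object names are illustrative only.

```r
library(mice)

# `nhanes` is an example dataset from mice (columns age, bmi, hyp, chl),
# used here only as a stand-in for the asker's data.
imp <- mice(nhanes, m = 5, seed = 123, printFlag = FALSE)

class(imp)   # a `mids` object, not a data.frame
imp$m        # number of imputed datasets

d1 <- complete(imp, 1)   # first completed dataset, an ordinary data.frame
d2 <- complete(imp, 2)   # second completed dataset

# Observed values are the same in every completed dataset; only the cells
# that were missing in the original data can differ between imputations.
was_missing <- is.na(nhanes$bmi)
identical(d1$bmi[!was_missing], d2$bmi[!was_missing])
```

This is also why gls() and lme() fail on the mids object directly: they expect a single data.frame such as `d1`.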

Because a mids object is not just a data.frame, you cannot use it the same way. Analyzing multiply imputed data involves two steps: first, apply the analysis to each imputed dataset; second, aggregate the results (i.e. model coefficients etc.) according to Rubin's rules, so that you get an overall estimate as well as standard errors that include the variance between imputations.
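As a rough illustration of what Rubin's rules do for a single estimate, the following base R sketch pools one coefficient by hand. The numbers in `est` and `se` are made up for illustration and do not come from any real analysis.

```r
# Rubin's rules for one scalar estimate, written out in base R.
# `est` and `se` are invented values standing in for the estimate and its
# standard error obtained from each of m = 5 imputed datasets.
est <- c(2.10, 2.25, 1.98, 2.17, 2.05)
se  <- c(0.30, 0.32, 0.29, 0.31, 0.30)

m     <- length(est)
qbar  <- mean(est)                # pooled point estimate
ubar  <- mean(se^2)               # within-imputation variance
b     <- var(est)                 # between-imputation variance
total <- ubar + (1 + 1 / m) * b   # total variance of the pooled estimate
pooled_se <- sqrt(total)

c(estimate = qbar, se = pooled_se)
```

The pooled standard error is larger than the average within-imputation standard error whenever the estimates differ across imputations, which is how the uncertainty about the missing values enters the result.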

For several statistical functions (e.g. lm, glm, anova), the mice package provides an easy implementation of these two steps. A simple linear regression, for instance, can be run on a mids object like this:

lm1_mira <- with(mydata_mids, lm(y ~ x1 + x2)) #with.mids() creates a `mira` object
pool(lm1_mira)

For nlme::lme() and gls(), however, these methods are not readily implemented, so you will have to do a bit of programming instead. Specifically, your code should involve the following steps:

1. Create a function that runs your analysis and returns the relevant coefficients/estimates and their standard errors. For gls and lme fits, the table of estimates and standard errors is stored in the tTable element of the summary, so this may look something like this:
ICept_fun <- function(dat) summary(gls(edeqGLOBAL_mean ~ 1, method="ML", data=dat))$tTable
2. Apply the function to each imputed dataset.
a. Extract the datasets from the mids object:
Imputed_list <- lapply(seq_len(Imputed$m), function(i) complete(Imputed, action=i))
b. Apply the function:
ICept_list <- lapply(Imputed_list, ICept_fun)
3. Pool the results using the pool.scalar() function from mice. It is meant for pooling any normally distributed (!) statistic that you calculated from a multiply imputed dataset. This also means that you may have to transform some of your statistics of interest before you can apply pool.scalar() and back-transform them afterwards. For example, correlations should be transformed to Fisher's Z. Please see this useful vignette if you are not sure about your statistics of interest.
The pool.scalar() function wants vectors of estimates (argument Q) and respective variances (argument U), so you need to reshape the list of results a bit. This may or may not look like this, depending on your function ICept_fun:
ICept_Qs <- sapply(ICept_list, function(x) x["(Intercept)", "Value"])
ICept_Us <- sapply(ICept_list, function(x) x["(Intercept)", "Std.Error"]^2) # squared SE as the variance estimate
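The three steps can be put together roughly as follows. This is a sketch under assumptions, not a definitive implementation: it uses the `nhanes` example data from mice and its `bmi` column in place of your dataset and outcome, the lowercase object names are illustrative, and it reads the coefficient table from the `tTable` element of the gls summary.

```r
library(mice)
library(nlme)

# Stand-in for the asker's `Imputed` object: impute the nhanes example data.
imp <- mice(nhanes, m = 5, seed = 123, printFlag = FALSE)

# Step 1: analysis function returning the coefficient table
# (estimate and standard error) of an intercept-only gls model.
icept_fun <- function(dat) {
  fit <- gls(bmi ~ 1, data = dat, method = "ML")
  summary(fit)$tTable
}

# Step 2: extract the completed datasets and fit the model to each one.
imp_list   <- lapply(seq_len(imp$m), function(i) complete(imp, action = i))
icept_list <- lapply(imp_list, icept_fun)

# Step 3: collect estimates and variances as numeric vectors, then pool.
Q <- sapply(icept_list, function(x) x["(Intercept)", "Value"])
U <- sapply(icept_list, function(x) x["(Intercept)", "Std.Error"]^2)
pooled <- pool.scalar(Q, U)

pooled$qbar      # pooled intercept
sqrt(pooled$t)   # pooled standard error (t is the total variance)
```

The same pattern should carry over to the random-intercept model by swapping the gls() call for the corresponding lme() call inside the analysis function.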

If it turns out that you need multilevel modelling, please be aware:
Multiple imputation of multilevel data brings its own additional problems. The imputation itself should take the multilevel structure of the data into account. If you just applied mice::mice() to your (long-format) dataset, this is not correct, because it treats repeated measurements of the same person as independent rows. One alternative is to make a wide-format dataset, conduct multiple imputation, and then reshape the resulting imputed datasets back to long format. In this case, reshaping back to long format would take place between steps 2a and 2b described above. Whether this is the preferred method, I do not know. There are some good sources about this, for instance this vignette.
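A minimal sketch of the wide-format route, assuming simulated data with three waves: the data frame `wide`, its columns `y_1` to `y_3`, and the helper `to_long` are all hypothetical names invented for this example. It only illustrates the mechanics of imputing in wide format and reshaping each completed dataset, and makes no claim that this is the best approach for your data.

```r
library(mice)

set.seed(1)
n <- 100
# Hypothetical wide-format data: one row per person, one column per wave.
person <- rnorm(n)
wide <- data.frame(
  ID  = seq_len(n),
  y_1 = person + rnorm(n, sd = 0.5),
  y_2 = person + 0.3 + rnorm(n, sd = 0.5),
  y_3 = person + 0.6 + rnorm(n, sd = 0.5)
)
wide$y_2[sample(n, 15)] <- NA   # introduce some missing values
wide$y_3[sample(n, 20)] <- NA

# Impute the outcome columns only, so that ID is not used as a predictor.
imp <- mice(wide[, c("y_1", "y_2", "y_3")], m = 5, seed = 123, printFlag = FALSE)

# Between steps 2a and 2b: complete one dataset and reshape it to long format.
to_long <- function(i) {
  dat <- cbind(ID = wide$ID, complete(imp, action = i))
  reshape(dat, direction = "long",
          varying = c("y_1", "y_2", "y_3"), v.names = "y",
          timevar = "wave", times = 1:3, idvar = "ID")
}
long_list <- lapply(seq_len(imp$m), to_long)
```

Each element of `long_list` is then a long-format data.frame with one row per person and wave, which can be passed to the analysis function in step 2b.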