How to plot the output of an nls model in ggplot2

Posted on 2025-01-10 20:44:34

I have some data where I would like to fit a nonlinear model to each subset of the data using nls, then superimpose the fitted models onto a graph of the data points using ggplot2. Specifically the model is of the form

y~V*x/(K+x)

which you may recognize as Michaelis-Menten. One way to do this is using geom_smooth, but then I have no way to retrieve the coefficients of the fitted model. Alternatively, I could fit the data using nls and then plot lines fitted using geom_smooth, but how would I know that the curves geom_smooth plotted are the same as those given by my nls fit? I can't pass the coefficients from my nls fit to geom_smooth and tell it to use them. If I could get geom_smooth to use only a subset of the data, I could specify the starting parameters and that would work, but every time I've tried that I get the following error:

Aesthetics must be either length 1 or the same as the data (8): x, y, colour

Here's some sample made-up data I have been using:

Concentration <- c(500.0,250.0,100.0,62.5,50.0,25.0,12.5,5.0,
                   500.0,250.0,100.0,62.5,50.0,25.0,12.5,5.0)

drug <- c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2)

rate <- c(1.889220,1.426500,0.864720,0.662210,0.564340,0.343140,0.181120,0.077170,
          3.995055,3.011800,1.824505,1.397237,1.190078,0.723637,0.381865,0.162771)

file<-data.frame(Concentration,drug,rate)

where Concentration will be x in my plot and rate will be y; drug will be the color variable. If I write the following I get that error:

plot <- ggplot(file,aes(x=file[,1],y=file[,3],color=Compound))+geom_point()

plot<-plot+geom_smooth(data=subset(file,file[,2]==drugNames[i]),method.args=list(formula=y~Vmax*x/(Km+x),start=list(Vmax=coef(models[[i]])[1],Km=coef(models[[i]])[2])),se=FALSE,size=0.5)

where models[[]] is a list of model parameters returned by nls.

Any ideas on how I can subset a data frame in geom_smooth so I can individually plot curves using starting parameters from my nls fit?

Answer by 我做我的改变, 2025-01-17 20:44:34

The ideal solution would plot the results of nls() using ggplot, but here's a "quick and dirty" solution based on a couple of observations.

First, you can be sure that if you use the same formula for nls() and geom_smooth(method = "nls"), you will get the same coefficients. That's because the latter is calling the former.

Second, using your example data, nls() converges to the same values of Vmax and Km (named Vm and K in the code below; the values differ for each drug) for every start value tried here. In other words, there's no need to build models using start values in the range for each individual drug. Any of the following give the same result for drug 1 (and similarly for drug 2), where df1 is the example data defined further down:

library(dplyr)
# use maximum as start
df1 %>% 
  filter(drug == 1) %>% 
  nls(rate ~ Vm * Concentration/(K + Concentration), 
      data = ., 
      start = list(K = max(.$Concentration), Vm = max(.$rate)))

# use minimum as start
df1 %>% 
  filter(drug == 1) %>% 
  nls(rate ~ Vm * Concentration/(K + Concentration), 
      data = ., 
      start = list(K = min(.$Concentration), Vm = min(.$rate)))

# use arbitrary values as start
df1 %>% 
  filter(drug == 1) %>% 
  nls(rate ~ Vm * Concentration/(K + Concentration), 
      data = ., 
      start = list(K = 50, Vm = 2))
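
The claim that the start values do not matter for this data can also be checked numerically. Below is a minimal sketch in base R, assuming (as reported above) that nls() converges from all three starts; the helper fit_mm and the objects drug1, starts, estimates and max_rel_spread are illustrative names, not part of the original answer.

```r
# Example data from the question, restricted to drug 1.
drug1 <- data.frame(
  Concentration = c(500, 250, 100, 62.5, 50, 25, 12.5, 5),
  rate = c(1.88922, 1.4265, 0.86472, 0.66221,
           0.56434, 0.34314, 0.18112, 0.07717)
)

# Hypothetical helper: fit the Michaelis-Menten model from a given start
# and return the coefficients in a fixed order.
fit_mm <- function(start) {
  fit <- nls(rate ~ Vm * Concentration / (K + Concentration),
             data = drug1, start = start)
  coef(fit)[c("Vm", "K")]
}

# The same three starts used in the answer.
starts <- list(
  maximum   = list(K = max(drug1$Concentration), Vm = max(drug1$rate)),
  minimum   = list(K = min(drug1$Concentration), Vm = min(drug1$rate)),
  arbitrary = list(K = 50, Vm = 2)
)

# One row per start, one column per parameter.
estimates <- t(sapply(starts, fit_mm))
print(estimates)

# Largest relative spread of any parameter across the three starts.
max_rel_spread <- max(apply(estimates, 2,
                            function(v) diff(range(v)) / abs(mean(v))))
print(max_rel_spread)
```

If the three rows of estimates agree to within the convergence tolerance, a single shared start is enough for this data set. That need not hold for other data, where a poor start can make nls() fail or land somewhere else.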

So the quickest way to plot the curves is simply to map the drug to a ggplot aesthetic, such as color. This constructs a separate nls curve for each drug from the same start values. If you need the coefficients, you can then go back to nls(), knowing that its models should match the plotted curves.

Using your example data file (but don't call it file, I used df1):

library(ggplot2)
df1 <- structure(list(Concentration = c(500, 250, 100, 62.5, 50, 25, 12.5, 5, 
                                        500, 250, 100, 62.5, 50, 25, 12.5, 5), 
                      drug = c(1, 1, 1, 1, 1, 1, 1, 1, 
                               2, 2, 2, 2, 2, 2, 2, 2), 
                      rate = c(1.88922, 1.4265, 0.86472, 0.66221, 0.56434, 0.34314, 
                               0.18112, 0.07717, 3.995055, 3.0118, 1.824505, 1.397237, 
                               1.190078, 0.723637, 0.381865, 0.162771)),
                      .Names = c("Concentration", "drug", "rate"), 
                      row.names = c(NA, -16L), 
                      class = "data.frame")

# could use e.g. Km = min(df1$Concentration) for start
# but here we use arbitrary values
ggplot(df1, aes(Concentration, rate)) + 
  geom_point() + 
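  # formula is passed via geom_smooth's own formula argument;
  # se = FALSE is needed because nls predictions carry no standard errors
  geom_smooth(aes(color = factor(drug)),
              method = "nls", 
              formula = y ~ Vmax * x / (Km + x),
              method.args = list(start = list(Km = 50, Vmax = 2)), 
              se = FALSE)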
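
The answer opens by saying the ideal solution would plot the results of nls() directly. Below is a minimal sketch of that approach. It assumes base R's split() and lapply() for the per-drug fits and predict() on a concentration grid; the names models, coefs, grid and curves are illustrative. Because the plotted lines come from the same model objects as the coefficients, the two cannot disagree.

```r
# Example data from the question.
df1 <- data.frame(
  Concentration = rep(c(500, 250, 100, 62.5, 50, 25, 12.5, 5), times = 2),
  drug = rep(c(1, 2), each = 8),
  rate = c(1.88922, 1.4265, 0.86472, 0.66221, 0.56434, 0.34314,
           0.18112, 0.07717, 3.995055, 3.0118, 1.824505, 1.397237,
           1.190078, 0.723637, 0.381865, 0.162771)
)

# One nls fit per drug, using the same arbitrary start values as above.
models <- lapply(split(df1, df1$drug), function(d) {
  nls(rate ~ Vmax * Concentration / (Km + Concentration),
      data = d, start = list(Km = 50, Vmax = 2))
})

# Coefficient table: one row per drug, columns Km and Vmax.
coefs <- t(sapply(models, coef))
print(coefs)

# Evaluate each fitted curve on a fine concentration grid.
grid <- seq(min(df1$Concentration), max(df1$Concentration), length.out = 200)
curves <- do.call(rbind, lapply(names(models), function(id) {
  data.frame(
    Concentration = grid,
    drug = id,
    rate = predict(models[[id]], newdata = data.frame(Concentration = grid))
  )
}))

# Overlay the predicted curves on the points (skipped if ggplot2 is absent).
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df1, aes(Concentration, rate, color = factor(drug))) +
    geom_point() +
    geom_line(data = curves)
  print(p)
}
```

The trade-off is a few more lines than geom_smooth(method = "nls"). In return, each drug could be given its own start values inside the lapply() call if a shared start ever failed to converge.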
