Trying to display original and fitted data (nls + dnorm) with ggplot2's geom_smooth()
I am exploring some data, so the first thing I wanted to do was try to fit a normal (Gaussian) distribution to it. This is my first time trying this in R, so I'm taking it one step at a time. First I pre-binned my data:
library(ggplot2)
myhist = data.frame(size = 10:27, counts = c(1L, 3L, 5L, 6L, 9L, 14L, 13L, 23L, 31L, 40L, 42L, 22L, 14L, 7L, 4L, 2L, 2L, 1L) )
qplot(x=size, y=counts, data=myhist)
Since I want counts, I need to add a normalization factor (N) to scale up the density:
fit = nls(counts ~ N * dnorm(size, m, s), data=myhist, start=c(m=20, s=5, N=sum(myhist$counts)) )
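As a quick sanity check on that model (a sketch, not part of the original post), `coef()` returns the fitted `m`, `s`, and `N`. Because the bins are one unit wide, one would expect `N` to land in the neighborhood of the total count. The names `est` and `total` are introduced here for illustration only.

```r
# Pre-binned data from the post
myhist <- data.frame(
  size   = 10:27,
  counts = c(1L, 3L, 5L, 6L, 9L, 14L, 13L, 23L, 31L,
             40L, 42L, 22L, 14L, 7L, 4L, 2L, 2L, 1L)
)

# Same model as in the post: a normal density scaled by N
fit <- nls(counts ~ N * dnorm(size, m, s), data = myhist,
           start = list(m = 20, s = 5, N = sum(myhist$counts)))

est   <- coef(fit)            # named vector with entries m, s, N
total <- sum(myhist$counts)   # total number of observations

print(est)
print(total)
```

Comparing `est[["N"]]` with `total` shows how far the least-squares scale factor sits from the raw number of observations.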
Then I create the fitted data for display and everything works great:
x = seq(10,30,0.2)
fitted = data.frame(size = x, counts=predict(fit, data.frame(size=x)) )
ggplot(data=myhist, aes(x=size, y=counts)) + geom_point() + geom_line(data=fitted)
I got excited when I found this thread which talks about using geom_smooth() to do it all in one step, but I can't get it to work:
Here's what I try... and what I get:
ggplot(data=myhist, aes(x=size, y=counts)) + geom_point() + geom_smooth(method="nls", formula = counts ~ N * dnorm(size, m, s), se=F, start=list(m=20, s=5, N=300, size=10))
Error in method(formula, data = data, weights = weight, ...) :
parameters without starting value in 'data': counts
The error seems to indicate that it's trying to fit for the observed variable, counts, but that doesn't make any sense, and it predictably freaks out if I specify a "starting" value for counts too:
fitting parameters ‘m’, ‘s’, ‘N’, ‘size’, ‘counts’ without any variables
Error in eval(expr, envir, enclos) : object 'counts' not found
Any idea what I'm doing wrong? It's not the end of the world, of course, but fewer steps are always better, and you guys always come up with the most elegant solutions to these common tasks.
Thanks in advance!
Jeffrey
Comments (1)
The first error indicates that ggplot2 cannot find the variable 'counts', which is used in the formula, in the data.
Stats are computed after the aesthetic mapping, that is, after size -> x and counts -> y.
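To see why the names matter, the situation can be reproduced outside ggplot2. This is only a sketch, assuming the smoother hands `nls()` a data frame whose columns are named `x` and `y`; the names `mapped`, `start_vals`, `bad`, and `good` are made up for the illustration.

```r
# What the stat sees after mapping: columns x and y, not size and counts
mapped <- data.frame(
  x = 10:27,
  y = c(1L, 3L, 5L, 6L, 9L, 14L, 13L, 23L, 31L,
        40L, 42L, 22L, 14L, 7L, 4L, 2L, 2L, 1L)
)
start_vals <- list(m = 20, s = 5, N = sum(mapped$y))

# Formula written with the original column names: nls() cannot find them
# in the data, so it treats them as parameters and stops with an error
bad <- tryCatch(
  nls(counts ~ N * dnorm(size, m, s), data = mapped, start = start_vals),
  error = function(e) e
)
print(conditionMessage(bad))

# Formula written with x and y: the variables are found and the fit runs
good <- nls(y ~ N * dnorm(x, m, s), data = mapped, start = start_vals)
print(coef(good))
```

Any name in the formula that is neither a column of the data nor listed in `start` is reported as a parameter without a starting value, which is the message quoted in the question.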
Here is an example of using nls in geom_smooth:
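A minimal sketch of such a call, assuming a current ggplot2 in which arguments for the fitting function are passed through `method.args` (older releases accepted `start` directly in `geom_smooth()`). `se = FALSE` is set because `predict()` on an nls fit does not supply standard errors, and the starting value for `N` is an assumption borrowed from the question's working `nls()` call.

```r
library(ggplot2)

# Pre-binned data from the question
myhist <- data.frame(
  size   = 10:27,
  counts = c(1L, 3L, 5L, 6L, 9L, 14L, 13L, 23L, 31L,
             40L, 42L, 22L, 14L, 7L, 4L, 2L, 2L, 1L)
)

p <- ggplot(myhist, aes(x = size, y = counts)) +
  geom_point() +
  geom_smooth(
    method      = "nls",
    formula     = y ~ N * dnorm(x, m, s),   # x and y, not size and counts
    method.args = list(start = list(m = 20, s = 5, N = sum(myhist$counts))),
    se          = FALSE
  )

print(p)
```

The `size = 10` entry from the question's `start` list is dropped here: `size` is data (now `x`), not a parameter to estimate.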
The point is to use x and y, instead of size and counts, in the formula specification.