Fitting a density curve to a histogram in R
Is there a function in R that fits a curve to a histogram?
Say you have the following histogram:
hist(c(rep(65, times=5), rep(25, times=5), rep(35, times=10), rep(45, times=4)))
It looks roughly normal, but it is skewed. I want to fit a skewed normal curve that wraps around this histogram.
This question is rather basic, but I can't seem to find an answer for R on the internet.
Dirk has explained how to plot the density function over the histogram. But sometimes you might want to make the stronger assumption of a skew-normal distribution and plot that instead of the kernel density. You can estimate the parameters of the distribution and plot it using the sn package.
This probably works better on data that is closer to skew-normal.
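A minimal sketch of the skew-normal fit, assuming the sn package is installed. The simulated data and its parameters are made up for illustration, and `selm()` with `coef(..., param.type = "DP")` is one way to get the estimates, not necessarily the call used in the original answer.

```r
library(sn)  # assumed to be installed: install.packages("sn")

set.seed(42)
# Simulated skew-normal data (hypothetical parameters, for illustration only)
x <- rsn(500, xi = 30, omega = 15, alpha = 4)

# Fit an intercept-only skew-normal model by maximum likelihood
fit <- selm(x ~ 1, family = "SN")
dp <- coef(fit, param.type = "DP")  # direct parameters: xi, omega, alpha

# Histogram on the density scale with the fitted skew-normal density on top
hist(x, freq = FALSE, breaks = 30, col = "grey")
grid <- seq(min(x), max(x), length.out = 200)
lines(grid, dsn(grid, dp = dp), col = "blue", lwd = 2)
```

Unlike a kernel density, the fitted curve is fully described by the three numbers in `dp`, which is the "stronger assumption" mentioned above.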
I had the same problem, but Dirk's solution didn't seem to work.
I was getting a warning message every time.
I read through
?hist
and found freq, a logical that defaults to TRUE when the breaks are equidistant.
The code that worked for me uses that argument.
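The snippet for this answer is not shown, so the following is only a sketch of why `freq` matters, assuming the fix was to pass `freq = FALSE`. With counts on the y axis the density line is almost flat near zero; on the density scale the two line up.

```r
X <- c(rep(65, times = 5), rep(25, times = 5), rep(35, times = 10), rep(45, times = 4))

op <- par(mfrow = c(1, 2))  # two panels side by side

# Left: default freq = TRUE, y axis in counts; the density line hugs the x axis
h_counts <- hist(X, main = "freq = TRUE")
lines(density(X), col = "red")

# Right: freq = FALSE, y axis in density; histogram and curve share one scale
h_dens <- hist(X, freq = FALSE, main = "freq = FALSE")
lines(density(X), col = "blue", lwd = 2)

par(op)  # restore the previous plotting layout
```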
This is kernel density estimation; follow the link for a good illustration of the concept and its parameters.
The shape of the curve depends mostly on two things: 1) the kernel (usually Epanechnikov or Gaussian), which determines the y value at each x by weighting all the data points; it is symmetric and usually a non-negative function that integrates to one; 2) the bandwidth: the larger it is, the smoother the curve, and the smaller it is, the wigglier the curve.
Different requirements call for different packages; see the document "Density estimation in R". For multivariate data, you can turn to multivariate kernel density estimation.
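To make the two knobs concrete, here is a small base-R sketch using `density()` on the data from the question. The `adjust` values 0.5 and 2 are arbitrary choices for illustration.

```r
X <- c(rep(65, times = 5), rep(25, times = 5), rep(35, times = 10), rep(45, times = 4))

# Same default bandwidth, two different kernels
d_gauss <- density(X, kernel = "gaussian")
d_epan  <- density(X, kernel = "epanechnikov")

# Same kernel, bandwidth halved and doubled via 'adjust'
d_narrow <- density(X, adjust = 0.5)  # wigglier
d_wide   <- density(X, adjust = 2)    # smoother

h <- hist(X, plot = FALSE)
hist(X, freq = FALSE, col = "grey",
     ylim = c(0, max(h$density, d_narrow$y)))
lines(d_gauss,  col = "blue")
lines(d_epan,   col = "darkgreen")
lines(d_narrow, col = "red", lty = "dashed")
lines(d_wide,   col = "red", lty = "dotted")
```

On this data the choice of kernel changes the curve far less than the change in bandwidth does.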
Some comments asked for the density estimate to be scaled to the peak of the histogram so that the y axis stays in counts rather than density. To achieve this I wrote a small function that automatically pulls the maximum bin height and scales the y values of the density accordingly.
Created on 2021-12-19 by the reprex package (v2.0.1)
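The scaling described above can be sketched as follows. The function name `hist_with_scaled_density` is hypothetical and this is not the author's original code, only one way to implement the same idea in base R.

```r
# Hypothetical helper: draw a count histogram and overlay a kernel density
# rescaled so that its peak matches the tallest bin.
hist_with_scaled_density <- function(x, ...) {
  h <- hist(x, ...)                      # y axis stays in counts
  d <- density(x)
  d$y <- d$y * max(h$counts) / max(d$y)  # stretch the curve to the max bin height
  lines(d, col = "blue", lwd = 2)
  invisible(list(hist = h, density = d))
}

X <- c(rep(65, times = 5), rep(25, times = 5), rep(35, times = 10), rep(45, times = 4))
res <- hist_with_scaled_density(X, col = "grey")
```

Matching peaks is a visual convenience. Multiplying the density by `length(x)` times the bin width instead would match areas rather than peaks.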
If I understand your question correctly, then you probably want a density estimate along with the histogram:
Edit a long while later:
Here is a slightly more dressed-up version:
along with the graph it produces: https://i.sstatic.net/lHCqw.png
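A sketch of both versions in base R, assuming the data from the question. The colours, line widths and the `adjust = 2` second curve are illustrative choices, not confirmed details of the original answer.

```r
X <- c(rep(65, times = 5), rep(25, times = 5), rep(35, times = 10), rep(45, times = 4))

d <- density(X)                      # kernel density estimate with default settings
d_smooth <- density(X, adjust = 2)   # same estimate with twice the bandwidth

# Plain version: histogram on the density scale plus the two estimates
hist(X, freq = FALSE)
lines(d)
lines(d_smooth, lty = "dotted")

# Dressed-up version: filled bars, coloured and thicker lines
hist(X, freq = FALSE, col = "grey")
lines(d, col = "blue", lwd = 2)
lines(d_smooth, lty = "dotted", col = "darkgreen", lwd = 2)
```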
This kind of thing is easy with ggplot2,
or, to mimic the result of Dirk's solution, you can do it this way.
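A minimal ggplot2 sketch, assuming ggplot2 3.3.0 or later for `after_stat()`. The `binwidth` and `boundary` values are chosen here to reproduce the bins that base `hist()` picks for this data; they are not taken from the original answer.

```r
library(ggplot2)  # assumed to be installed

X <- c(rep(65, times = 5), rep(25, times = 5), rep(35, times = 10), rep(45, times = 4))
df <- data.frame(x = X)

p <- ggplot(df, aes(x = x)) +
  # Put the histogram on the density scale so the curve is comparable
  geom_histogram(aes(y = after_stat(density)),
                 binwidth = 10, boundary = 20,
                 fill = "grey", colour = "black") +
  # Kernel density estimate drawn over the bars
  geom_density(colour = "blue", linewidth = 1)

print(p)
```

The `linewidth` argument needs ggplot2 3.4.0 or later; on older versions `size = 1` plays the same role.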
Here's the way I do it:
A bonus exercise is to do this with the ggplot2 package ...
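The code for this answer is not shown. One common base-R approach, offered only as an assumption about what such an answer might contain, is to overlay a normal density whose mean and standard deviation are estimated from the sample. The variable `foo` and the simulated data are hypothetical.

```r
set.seed(1)
foo <- rnorm(100, mean = 1, sd = 2)  # simulated sample, for illustration only

# Histogram on the density scale
hist(foo, freq = FALSE, col = "grey")

# Normal curve with parameters estimated from the sample
curve(dnorm(x, mean = mean(foo), sd = sd(foo)),
      add = TRUE, col = "blue", lwd = 2)
```

This fits a symmetric normal curve, so for clearly skewed data a kernel density or a skew-normal fit will follow the histogram more closely.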