Overlaying histograms with ggplot2 in R
I am new to R and am trying to plot 3 histograms onto the same graph.
Everything worked fine, but my problem is that you don't see where 2 histograms overlap - they look rather cut off.
When I make density plots, it looks perfect: each curve is surrounded by a black frame line, and colours look different where curves overlap.
Can someone tell me if something similar can be achieved with the histograms in the 1st picture? This is the code I'm using:
lowf0 <- read.csv(....)
mediumf0 <- read.csv(....)
highf0 <- read.csv(....)
lowf0$utt <- 'low f0'
mediumf0$utt <- 'medium f0'
highf0$utt <- 'high f0'
histogram <- rbind(lowf0, mediumf0, highf0)
ggplot(histogram, aes(f0, fill = utt)) + geom_histogram(alpha = 0.2)
Comments (3)
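Using @joran's sample data, note that geom_histogram() defaults to position = "stack", so the groups are stacked on top of each other rather than overlaid; position = "identity" draws them overlapping instead. See "Position adjustment" in the geom_histogram documentation.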
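A minimal sketch of that adjustment. The data frame `dat` and its columns `xx` and `yy` are made up here for illustration (two shifted normal samples) and are not necessarily the sample data the answer refers to:

```r
library(ggplot2)

# Made-up sample data (an assumption): two groups with shifted means.
set.seed(42)
dat <- data.frame(
  xx = c(rnorm(500, mean = 0), rnorm(500, mean = 2)),
  yy = rep(c("a", "b"), each = 500)
)

# position = "identity" draws every group's bars from the baseline
# instead of stacking them, so alpha blending reveals the overlap.
# colour = "black" outlines each bar.
p <- ggplot(dat, aes(x = xx, fill = yy)) +
  geom_histogram(alpha = 0.4, position = "identity",
                 bins = 30, colour = "black")
print(p)
```

With the default position = "stack", the second group's bars would start on top of the first group's bars, which is why the overlap is not visible in the question's plot.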
Your current code:
ggplot(histogram, aes(f0, fill = utt)) + geom_histogram(alpha = 0.2)
is telling ggplot to construct one histogram using all the values in f0 and then color the bars of this single histogram according to the variable utt.
What you want instead is to create three separate histograms, with alpha blending so that they are visible through each other. So you probably want to use three separate calls to geom_histogram, where each one gets its own data frame and fill.
Here's a concrete example with some output:
which produces something like this:
Edited to fix typos; you wanted fill, not colour.
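A minimal sketch of the three-layer approach described in this answer. The three data frames reuse the names from the question, but their contents are simulated here (the means, standard deviations, and sample sizes are assumptions), since the original CSV files are not shown:

```r
library(ggplot2)

# Simulated stand-ins for the question's data frames (assumed values).
set.seed(1)
lowf0    <- data.frame(f0 = rnorm(300, mean = 120, sd = 15))
mediumf0 <- data.frame(f0 = rnorm(300, mean = 160, sd = 15))
highf0   <- data.frame(f0 = rnorm(300, mean = 200, sd = 15))

# One geom_histogram() call per data frame. Each layer has its own
# data and a constant fill, so the bars are never stacked and the
# transparent fills blend where the distributions overlap.
p <- ggplot(mapping = aes(x = f0)) +
  geom_histogram(data = lowf0,    fill = "red",   alpha = 0.2, binwidth = 5) +
  geom_histogram(data = mediumf0, fill = "blue",  alpha = 0.2, binwidth = 5) +
  geom_histogram(data = highf0,   fill = "green", alpha = 0.2, binwidth = 5)
print(p)
```

Because the fills are set as constants rather than mapped inside aes(), this version draws no legend; mapping fill to a grouping column and using position = "identity" is an alternative when a legend is needed.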
While only a few lines are required to plot multiple/overlapping histograms in ggplot2, the results aren't always satisfactory. There needs to be proper use of borders and coloring to ensure the eye can differentiate between histograms.
The following functions balance border colors, opacities, and superimposed density plots to enable the viewer to differentiate among distributions.
Single histogram:
Multiple histograms:
Usage:
Simply pass your data frame into the above functions along with desired arguments:
The extra parameter in plot_multi_histogram is the name of the column containing the category labels.
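As a rough sketch of what such functions might look like: the bodies below are an illustration written for this purpose, not the answerer's original code. The name plot_histogram, the bins argument, the colour values, and the sample data are all assumptions; only plot_multi_histogram and its label-column parameter come from the text above.

```r
library(ggplot2)

# Hypothetical single-distribution helper: bordered bars, a density
# curve on top, and a dashed line at the overall mean.
plot_histogram <- function(df, feature, bins = 30) {
  ggplot(df, aes(x = .data[[feature]])) +
    geom_histogram(aes(y = after_stat(density)), bins = bins,
                   alpha = 0.7, fill = "#33AADE", colour = "black") +
    geom_density(alpha = 0.3, fill = "red") +
    geom_vline(xintercept = mean(df[[feature]]), linetype = "dashed") +
    labs(x = feature, y = "Density")
}

# Hypothetical multi-distribution helper: label_column names the
# column holding the category labels, which drives the fill colour.
plot_multi_histogram <- function(df, feature, label_column, bins = 30) {
  ggplot(df, aes(x = .data[[feature]], fill = .data[[label_column]])) +
    geom_histogram(aes(y = after_stat(density)), bins = bins,
                   position = "identity", alpha = 0.6, colour = "black") +
    geom_density(alpha = 0.4) +
    geom_vline(xintercept = mean(df[[feature]]), linetype = "dashed") +
    labs(x = feature, y = "Density", fill = label_column)
}

# Made-up data: three categories with different means.
set.seed(7)
df <- data.frame(
  value    = c(rnorm(500, 0), rnorm(500, 3), rnorm(500, 6)),
  category = rep(c("a", "b", "c"), each = 500)
)

p_single <- plot_histogram(df, "value")
p_multi  <- plot_multi_histogram(df, "value", "category")
print(p_multi)
```

Plotting densities rather than raw counts keeps the histogram bars and the density curves on the same vertical scale, and position = "identity" keeps the categories overlaid instead of stacked.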
We can see this more dramatically by creating a dataframe with many different distribution means:
Passing data frame in as before (and widening chart using options):
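A minimal sketch of building such a data frame with base R. The six means, the sample size, and the column names value and category are assumptions chosen for illustration. The options() call assumes the chart is rendered in a Jupyter notebook through the IRkernel/repr packages, where repr.plot.width and repr.plot.height control the figure size; elsewhere the options are simply ignored.

```r
set.seed(123)

# Hypothetical category means; the names become the category labels.
means <- c(a = 0, b = 2, c = 3, d = 5, e = 3, f = 1)
n <- 1000  # observations per category (assumed)

# One normal sample per mean, stacked into a long data frame with a
# label column identifying which distribution each value came from.
many_distros <- data.frame(
  value    = unlist(lapply(means, function(m) rnorm(n, mean = m)),
                    use.names = FALSE),
  category = rep(names(means), each = n)
)

# Widen the rendered chart (effective in Jupyter/IRkernel only).
options(repr.plot.width = 20, repr.plot.height = 8)

head(many_distros)
```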
To add a separate vertical line for each distribution:
The only change over the previous plot_multi_histogram function is the addition of means to the parameters, and changing the geom_vline line to accept multiple values.
Usage:
Result:
Since I set the means explicitly in many_distros, I can simply pass them in. Alternatively, you can calculate them inside the function and use them that way.
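A sketch covering both options, again as an illustration rather than the answerer's original code: means can be passed in explicitly, and when it is left out the function computes one mean per category itself. The NULL default, the bins argument, and the sample data are assumptions.

```r
library(ggplot2)

# Hypothetical variant of plot_multi_histogram with a `means` argument.
plot_multi_histogram <- function(df, feature, label_column,
                                 means = NULL, bins = 30) {
  # Assumption: with no means supplied, compute one per category.
  if (is.null(means)) {
    means <- tapply(df[[feature]], df[[label_column]], mean)
  }
  ggplot(df, aes(x = .data[[feature]], fill = .data[[label_column]])) +
    geom_histogram(aes(y = after_stat(density)), bins = bins,
                   position = "identity", alpha = 0.6, colour = "black") +
    geom_density(alpha = 0.4) +
    # A vector of xintercept values draws one dashed line per mean.
    geom_vline(xintercept = as.numeric(means), linetype = "dashed") +
    labs(x = feature, y = "Density", fill = label_column)
}

# Made-up data with known means.
set.seed(99)
df <- data.frame(
  value    = c(rnorm(500, 0), rnorm(500, 3), rnorm(500, 6)),
  category = rep(c("a", "b", "c"), each = 500)
)

p_explicit <- plot_multi_histogram(df, "value", "category",
                                   means = c(0, 3, 6))
p_computed <- plot_multi_histogram(df, "value", "category")
print(p_explicit)
```

Passing the means explicitly marks the nominal centre of each distribution, while the computed version marks the sample means, which differ slightly for random data.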