Plotting two histograms on a single chart
I created a histogram from data in a file without any problem. Now I want to superimpose data from another file on the same histogram, so I do something like this:
n,bins,patchs = ax.hist(mydata1,100)
n,bins,patchs = ax.hist(mydata2,100)
The problem is that for each interval only the bar with the highest value appears, and the other is hidden. How can I plot both histograms at the same time with different colors?
Here is a working example:
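A minimal sketch of such an example, assuming random Gaussian samples stand in for the two data files and that semi-transparent bars (alpha) on shared bin edges are an acceptable way to keep both histograms visible:

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumption: random samples stand in for the data read from the two files.
rng = np.random.default_rng(0)
mydata1 = rng.normal(0.0, 1.0, 1000)
mydata2 = rng.normal(1.0, 1.5, 1000)

# Shared bin edges covering both data sets, so the bars line up (100 bins).
lo = min(mydata1.min(), mydata2.min())
hi = max(mydata1.max(), mydata2.max())
bins = np.linspace(lo, hi, 101)

fig, ax = plt.subplots()
# alpha < 1 makes the bars translucent, so the shorter bar stays visible.
n1, _, _ = ax.hist(mydata1, bins, alpha=0.5, color="tab:blue", label="mydata1")
n2, _, _ = ax.hist(mydata2, bins, alpha=0.5, color="tab:orange", label="mydata2")
ax.legend(loc="upper right")
plt.show()
```

The colors and the alpha value are arbitrary choices; any pair of distinguishable colors works.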
The accepted answer gives the code for a histogram with overlapping bars, but in case you want the bars side by side (as I did), try the variation below:
Reference: http://matplotlib.org/examples/statistics/histogram_demo_multihist.html
EDIT [2018/03/16]: Updated to allow plotting of arrays of different sizes, as suggested by @stochastic_zeitgeist
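The side-by-side variation can be sketched as follows. This is an illustration rather than the answer's original code: the sample data and bin range are assumptions, and the key point is that hist accepts a list of arrays, which may differ in length, and draws one bar per data set in every bin.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumption: two random samples of different sizes as placeholder data.
rng = np.random.default_rng(1)
x = rng.normal(1, 2, 5000)
y = rng.normal(-1, 3, 2000)

bins = np.linspace(-10, 10, 30)  # 30 edges, i.e. 29 bins

fig, ax = plt.subplots()
# Passing a list of arrays places the bars of each data set next to each other.
counts, edges, _ = ax.hist([x, y], bins, label=["x", "y"])
ax.legend(loc="upper right")
plt.show()
```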
If your sample sizes differ, it may be difficult to compare the distributions on a single y-axis. For example:
In this case, you can plot the two data sets on different axes. To do so, get the histogram data using matplotlib, clear the axis, and then re-plot the data on two separate axes (shifting the bin edges so that the bars don't overlap):
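A sketch of the two-axis idea, under some assumptions: the data is random placeholder data, and the counts are computed with np.histogram rather than by drawing with hist and clearing the axis, which yields the same numbers without the intermediate plot. The second data set is drawn on a twin y-axis with its bars shifted by one bar width.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumption: placeholder samples with very different sizes.
rng = np.random.default_rng(2)
y1 = rng.normal(-2, 2, 1000)
y2 = rng.normal(2, 2, 5000)

# Common bin edges spanning both data sets (20 bins).
edges = np.linspace(min(y1.min(), y2.min()), max(y1.max(), y2.max()), 21)
n1, _ = np.histogram(y1, edges)
n2, _ = np.histogram(y2, edges)

# Each bar takes 40% of a bin; the second set is shifted right by one bar width.
width = 0.4 * (edges[1] - edges[0])

fig, ax1 = plt.subplots()
ax1.bar(edges[:-1], n1, width=width, align="edge", color="tab:blue")
ax1.set_ylabel("count (y1)", color="tab:blue")

ax2 = ax1.twinx()  # second y-axis sharing the same x-axis
ax2.bar(edges[:-1] + width, n2, width=width, align="edge", color="tab:orange")
ax2.set_ylabel("count (y2)", color="tab:orange")
plt.show()
```

Coloring each y-axis label like its bars helps the reader tell which scale belongs to which data set.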
You should use bins from the values returned by hist:
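Reusing the returned bin edges might look like this (a sketch, assuming random stand-in data for mydata1 and mydata2; the alpha setting is an addition so both histograms remain visible):

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumption: random samples stand in for the two data files.
rng = np.random.default_rng(3)
mydata1 = rng.normal(0.0, 1.0, 1000)
mydata2 = rng.normal(0.5, 1.0, 1000)

fig, ax = plt.subplots()
# The first call chooses 100 bins and returns their edges.
n1, bins, _ = ax.hist(mydata1, 100, alpha=0.5)
# The second call reuses those edges, so both histograms share the same bins.
n2, bins2, _ = ax.hist(mydata2, bins, alpha=0.5)
plt.show()
```

Note that values of mydata2 falling outside the range of the first data set's bins are not counted.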
As a complement to Gustavo Bezerra's answer:
If you want each histogram to be normalized (normed for mpl<=2.1 and density for mpl>=3.1), you cannot just use normed/density=True; you need to set the weights for each value instead:
As a comparison, the exact same x and y vectors with default weights and density=True:
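The weights approach can be sketched as below, with assumed placeholder data. Each sample gets a weight of 1/N for its own data set, so the bar heights of each histogram are fractions of that data set and sum to 1. By contrast, density=True scales each histogram so that its area is 1.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumption: placeholder samples of different sizes.
rng = np.random.default_rng(4)
x = rng.normal(1, 2, 5000)
y = rng.normal(-1, 3, 2000)

# One weight per sample: 1 / (size of the sample's own data set).
x_w = np.full(x.shape, 1.0 / x.size)
y_w = np.full(y.shape, 1.0 / y.size)

fig, ax = plt.subplots()
# An integer bin count makes hist pick a range that covers both data sets.
n, edges, _ = ax.hist([x, y], bins=30, weights=[x_w, y_w], label=["x", "y"])
ax.set_ylabel("fraction of samples")
ax.legend(loc="upper right")
plt.show()
```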
Plotting two (or more) overlapping histograms can lead to a rather cluttered plot. I find that using step histograms (also known as hollow histograms) improves readability quite a bit. The only downside is that in matplotlib the default legend for a step histogram is not properly formatted, so it can be edited as in the following example:
As you can see, the result looks quite clean. This is especially useful when overlapping more than two histograms. Depending on how the variables are distributed, this can work for up to around 5 overlapping distributions. More than that would require another type of plot, such as one of those presented here.
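A sketch of step histograms with an adjusted legend, assuming three made-up distributions named a, b and c. By default the legend entries for histtype="step" are drawn as hollow boxes; here they are swapped for plain lines of the same color, which is one possible way to edit the legend.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

# Assumption: three placeholder distributions, given as (mean, std).
rng = np.random.default_rng(5)
params = {"a": (0, 1), "b": (2, 1.5), "c": (-2, 0.8)}

fig, ax = plt.subplots()
for name, (mu, sigma) in params.items():
    ax.hist(rng.normal(mu, sigma, 2000), bins=40,
            histtype="step", linewidth=1.5, label=name)

# Replace the default box-shaped legend handles with lines of the same color.
handles, labels = ax.get_legend_handles_labels()
line_handles = [Line2D([], [], color=h.get_edgecolor()) for h in handles]
ax.legend(handles=line_handles, labels=labels)
plt.show()
```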
Here is a simple method to plot two histograms, with their bars side by side, on the same plot when the data sets have different sizes:
Another option, quite similar to joaquin's answer:
Gives the following output:
There is one caveat when you want to plot the histograms from a 2-D numpy array: you need to swap the two axes.
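To illustrate the caveat: hist treats each column of a 2-D array as one data set. If the array stores one data set per row, as is assumed in this sketch, transposing it first gives the intended result.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumption: two placeholder data sets stored as rows, shape (2, 500).
rng = np.random.default_rng(6)
data = np.vstack([rng.normal(0, 1, 500), rng.normal(2, 1, 500)])

fig, ax = plt.subplots()
# data.T has shape (500, 2): two columns, hence two histograms.
# Without the transpose, hist would see 500 data sets of 2 values each.
counts, edges, _ = ax.hist(data.T, bins=20, label=["row 0", "row 1"])
ax.legend()
plt.show()
```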
Just in case you have pandas (import pandas as pd) or are OK with using it:
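A possible pandas version, sketched with assumed column names a and b and random placeholder data. DataFrame.plot.hist draws one histogram per column on shared bins, and alpha keeps overlapping bars visible.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Assumption: two equally sized placeholder columns named "a" and "b".
rng = np.random.default_rng(7)
df = pd.DataFrame({
    "a": rng.normal(0, 1, 1000),
    "b": rng.normal(2, 1, 1000),
})

# One histogram per column, drawn on the same axes with a legend.
ax = df.plot.hist(bins=30, alpha=0.5)
plt.show()
```

Columns of a DataFrame must have the same length, so data sets of different sizes would need padding with NaN or separate plotting calls.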
This question has been answered before, but I wanted to add another quick and easy workaround that might help other visitors to this question.
Some helpful examples are here for kde vs histogram comparison.
Inspired by Solomon's answer, but to stick with the question, which is about histograms, a clean solution is:
Make sure to plot the taller one first; otherwise you would need to set plt.ylim(0,0.45) so that the taller histogram is not chopped off.