在单个图表上绘制两个直方图

发布于 2024-11-27 02:18:18 字数 225 浏览 2 评论 0原文

我使用文件中的数据创建了直方图,没有问题。现在我想将另一个文件中的数据叠加在同一个直方图中,所以我做了类似的事情

n,bins,patchs = ax.hist(mydata1,100)
n,bins,patchs = ax.hist(mydata2,100)

,但问题是对于每个间隔,只有最高值的条形出现,而另一个则隐藏。我想知道如何同时用不同的颜色绘制两个直方图。

I created a histogram plot using data from a file and no problem. Now I wanted to superpose data from another file in the same histogram, so I do something like this

n,bins,patchs = ax.hist(mydata1,100)
n,bins,patchs = ax.hist(mydata2,100)

but the problem is that for each interval, only the bar with the highest value appears, and the other is hidden. I wonder how could I plot both histograms at the same time with different colors.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(12

深陷 2024-12-04 02:18:18

这里有一个工作示例:

import random
import numpy
from matplotlib import pyplot

x = [random.gauss(3,1) for _ in range(400)]
y = [random.gauss(4,2) for _ in range(400)]

bins = numpy.linspace(-10, 10, 100)

pyplot.hist(x, bins, alpha=0.5, label='x')
pyplot.hist(y, bins, alpha=0.5, label='y')
pyplot.legend(loc='upper right')
pyplot.show()

在此处输入图像描述

Here you have a working example:

import random
import numpy
from matplotlib import pyplot

x = [random.gauss(3,1) for _ in range(400)]
y = [random.gauss(4,2) for _ in range(400)]

bins = numpy.linspace(-10, 10, 100)

pyplot.hist(x, bins, alpha=0.5, label='x')
pyplot.hist(y, bins, alpha=0.5, label='y')
pyplot.legend(loc='upper right')
pyplot.show()

enter image description here

苹果你个爱泡泡 2024-12-04 02:18:18

接受的答案给出了具有重叠条形的直方图的代码,但如果您希望每个条形并排(就像我所做的那样),请尝试以下变体:

import numpy as np
import matplotlib.pyplot as plt
plt.style.use('seaborn-deep')

x = np.random.normal(1, 2, 5000)
y = np.random.normal(-1, 3, 2000)
bins = np.linspace(-10, 10, 30)

plt.hist([x, y], bins, label=['x', 'y'])
plt.legend(loc='upper right')
plt.show()

在此处输入图像描述

参考:http://matplotlib.org/examples/statistics/histogram_demo_multihist.html

编辑 [2018/03/ 16]:更新为允许绘制不同大小的数组,如 @stochastic_zeitgeist 的建议

The accepted answers gives the code for a histogram with overlapping bars, but in case you want each bar to be side-by-side (as I did), try the variation below:

import numpy as np
import matplotlib.pyplot as plt
plt.style.use('seaborn-deep')

x = np.random.normal(1, 2, 5000)
y = np.random.normal(-1, 3, 2000)
bins = np.linspace(-10, 10, 30)

plt.hist([x, y], bins, label=['x', 'y'])
plt.legend(loc='upper right')
plt.show()

enter image description here

Reference: http://matplotlib.org/examples/statistics/histogram_demo_multihist.html

EDIT [2018/03/16]: Updated to allow plotting of arrays of different sizes, as suggested by @stochastic_zeitgeist

弱骨蛰伏 2024-12-04 02:18:18

如果样本大小不同,则可能很难将分布与单个 y 轴进行比较。例如:

import numpy as np
import matplotlib.pyplot as plt

#makes the data
y1 = np.random.normal(-2, 2, 1000)
y2 = np.random.normal(2, 2, 5000)
colors = ['b','g']

#plots the histogram
fig, ax1 = plt.subplots()
ax1.hist([y1,y2],color=colors)
ax1.set_xlim(-10,10)
ax1.set_ylabel("Count")
plt.tight_layout()
plt.show()

 hist_single_ax

在这种情况下,您可以在不同的轴上绘制两个数据集。为此,您可以使用 matplotlib 获取直方图数据,清除轴,然后在两个单独的轴上重新绘制它(移动 bin 边缘,使它们不重叠):

#sets up the axis and gets histogram data
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
ax1.hist([y1, y2], color=colors)
n, bins, patches = ax1.hist([y1,y2])
ax1.cla() #clear the axis

#plots the histogram data
width = (bins[1] - bins[0]) * 0.4
bins_shifted = bins + width
ax1.bar(bins[:-1], n[0], width, align='edge', color=colors[0])
ax2.bar(bins_shifted[:-1], n[1], width, align='edge', color=colors[1])

#finishes the plot
ax1.set_ylabel("Count", color=colors[0])
ax2.set_ylabel("Count", color=colors[1])
ax1.tick_params('y', colors=colors[0])
ax2.tick_params('y', colors=colors[1])
plt.tight_layout()
plt.show()

hist_twin_ax

In the case you have different sample sizes, it may be difficult to compare the distributions with a single y-axis. For example:

import numpy as np
import matplotlib.pyplot as plt

#makes the data
y1 = np.random.normal(-2, 2, 1000)
y2 = np.random.normal(2, 2, 5000)
colors = ['b','g']

#plots the histogram
fig, ax1 = plt.subplots()
ax1.hist([y1,y2],color=colors)
ax1.set_xlim(-10,10)
ax1.set_ylabel("Count")
plt.tight_layout()
plt.show()

hist_single_ax

In this case, you can plot your two data sets on different axes. To do so, you can get your histogram data using matplotlib, clear the axis, and then re-plot it on two separate axes (shifting the bin edges so that they don't overlap):

#sets up the axis and gets histogram data
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
ax1.hist([y1, y2], color=colors)
n, bins, patches = ax1.hist([y1,y2])
ax1.cla() #clear the axis

#plots the histogram data
width = (bins[1] - bins[0]) * 0.4
bins_shifted = bins + width
ax1.bar(bins[:-1], n[0], width, align='edge', color=colors[0])
ax2.bar(bins_shifted[:-1], n[1], width, align='edge', color=colors[1])

#finishes the plot
ax1.set_ylabel("Count", color=colors[0])
ax2.set_ylabel("Count", color=colors[1])
ax1.tick_params('y', colors=colors[0])
ax2.tick_params('y', colors=colors[1])
plt.tight_layout()
plt.show()

hist_twin_ax

滥情稳全场 2024-12-04 02:18:18

您应该使用 hist 返回的值中的 bins

import numpy as np
import matplotlib.pyplot as plt

foo = np.random.normal(loc=1, size=100) # a normal distribution
bar = np.random.normal(loc=-1, size=10000) # a normal distribution

_, bins, _ = plt.hist(foo, bins=50, range=[-6, 6], normed=True)
_ = plt.hist(bar, bins=bins, alpha=0.5, normed=True)

两个具有相同分箱的 matplotlib 直方图

You should use bins from the values returned by hist:

import numpy as np
import matplotlib.pyplot as plt

foo = np.random.normal(loc=1, size=100) # a normal distribution
bar = np.random.normal(loc=-1, size=10000) # a normal distribution

_, bins, _ = plt.hist(foo, bins=50, range=[-6, 6], normed=True)
_ = plt.hist(bar, bins=bins, alpha=0.5, normed=True)

Two matplotlib histograms with same binning

输什么也不输骨气 2024-12-04 02:18:18

作为Gustavo Bezerra的回答的补充:

如果您希望每个直方图都标准化normed 对于 mpl<=2.1,密度 对于 mpl>=3.1)您不能只需使用 normed/密度=True,您需要为每个值设置权重:

import numpy as np
import matplotlib.pyplot as plt

x = np.random.normal(1, 2, 5000)
y = np.random.normal(-1, 3, 2000)
x_w = np.empty(x.shape)
x_w.fill(1/x.shape[0])
y_w = np.empty(y.shape)
y_w.fill(1/y.shape[0])
bins = np.linspace(-10, 10, 30)

plt.hist([x, y], bins, weights=[x_w, y_w], label=['x', 'y'])
plt.legend(loc='upper right')
plt.show()

在此处输入图像描述

作为比较,完全相同的 x 和 <具有默认权重的 code>y 向量和密度=True

在此处输入图像描述

As a completion to Gustavo Bezerra's answer:

If you want each histogram to be normalized (normed for mpl<=2.1 and density for mpl>=3.1) you cannot just use normed/density=True, you need to set the weights for each value instead:

import numpy as np
import matplotlib.pyplot as plt

x = np.random.normal(1, 2, 5000)
y = np.random.normal(-1, 3, 2000)
x_w = np.empty(x.shape)
x_w.fill(1/x.shape[0])
y_w = np.empty(y.shape)
y_w.fill(1/y.shape[0])
bins = np.linspace(-10, 10, 30)

plt.hist([x, y], bins, weights=[x_w, y_w], label=['x', 'y'])
plt.legend(loc='upper right')
plt.show()

enter image description here

As a comparison, the exact same x and y vectors with default weights and density=True:

enter image description here

征﹌骨岁月お 2024-12-04 02:18:18

绘制两个重叠的直方图(或更多)可能会导致绘图相当混乱。我发现使用 阶梯直方图 (又名空心直方图)大大提高了可读性。唯一的缺点是,在 matplotlib 中,步进直方图的默认图例格式不正确,因此可以像以下示例一样进行编辑:

import numpy as np                   # v 1.19.2
import matplotlib.pyplot as plt      # v 3.3.2
from matplotlib.lines import Line2D

rng = np.random.default_rng(seed=123)

# Create two normally distributed random variables of different sizes
# and with different shapes
data1 = rng.normal(loc=30, scale=10, size=500)
data2 = rng.normal(loc=50, scale=10, size=1000)

# Create figure with 'step' type of histogram to improve plot readability
fig, ax = plt.subplots(figsize=(9,5))
ax.hist([data1, data2], bins=15, histtype='step', linewidth=2,
        alpha=0.7, label=['data1','data2'])

# Edit legend to get lines as legend keys instead of the default polygons
# and sort the legend entries in alphanumeric order
handles, labels = ax.get_legend_handles_labels()
leg_entries = {}
for h, label in zip(handles, labels):
    leg_entries[label] = Line2D([0], [0], color=h.get_facecolor()[:-1],
                                alpha=h.get_alpha(), lw=h.get_linewidth())
labels_sorted, lines = zip(*sorted(leg_entries.items()))
ax.legend(lines, labels_sorted, frameon=False)

# Remove spines
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

# Add annotations
plt.ylabel('Frequency', labelpad=15)
plt.title('Matplotlib step histogram', fontsize=14, pad=20)
plt.show()

step_hist

如您所见,结果看起来非常干净。当重叠两个以上直方图时,这尤其有用。根据变量的分布方式,这最多适用于大约 5 个重叠分布。不仅如此,还需要使用另一种类型的绘图,例如 所提供的绘图之一在这里

Plotting two overlapping histograms (or more) can lead to a rather cluttered plot. I find that using step histograms (aka hollow histograms) improves the readability quite a bit. The only downside is that in matplotlib the default legend for a step histogram is not properly formatted, so it can be edited like in the following example:

import numpy as np                   # v 1.19.2
import matplotlib.pyplot as plt      # v 3.3.2
from matplotlib.lines import Line2D

rng = np.random.default_rng(seed=123)

# Create two normally distributed random variables of different sizes
# and with different shapes
data1 = rng.normal(loc=30, scale=10, size=500)
data2 = rng.normal(loc=50, scale=10, size=1000)

# Create figure with 'step' type of histogram to improve plot readability
fig, ax = plt.subplots(figsize=(9,5))
ax.hist([data1, data2], bins=15, histtype='step', linewidth=2,
        alpha=0.7, label=['data1','data2'])

# Edit legend to get lines as legend keys instead of the default polygons
# and sort the legend entries in alphanumeric order
handles, labels = ax.get_legend_handles_labels()
leg_entries = {}
for h, label in zip(handles, labels):
    leg_entries[label] = Line2D([0], [0], color=h.get_facecolor()[:-1],
                                alpha=h.get_alpha(), lw=h.get_linewidth())
labels_sorted, lines = zip(*sorted(leg_entries.items()))
ax.legend(lines, labels_sorted, frameon=False)

# Remove spines
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

# Add annotations
plt.ylabel('Frequency', labelpad=15)
plt.title('Matplotlib step histogram', fontsize=14, pad=20)
plt.show()

step_hist

As you can see, the result looks quite clean. This is especially useful when overlapping even more than two histograms. Depending on how the variables are distributed, this can work for up to around 5 overlapping distributions. More than that would require the use of another type of plot, such as one of those presented here.

夏花。依旧 2024-12-04 02:18:18

这是一种简单的方法,当数据大小不同时,可以在同一个图上绘制两个并排的直方图:

def plotHistogram(p, o):
    """
    p and o are iterables with the values you want to 
    plot the histogram of
    """
    plt.hist([p, o], color=['g','r'], alpha=0.8, bins=50)
    plt.show()

Here is a simple method to plot two histograms, with their bars side-by-side, on the same plot when the data has different sizes:

def plotHistogram(p, o):
    """
    p and o are iterables with the values you want to 
    plot the histogram of
    """
    plt.hist([p, o], color=['g','r'], alpha=0.8, bins=50)
    plt.show()
梦屿孤独相伴 2024-12-04 02:18:18

还有一个与 joaquin 答案非常相似的选项:

import random
from matplotlib import pyplot

#random data
x = [random.gauss(3,1) for _ in range(400)]
y = [random.gauss(4,2) for _ in range(400)]

#plot both histograms(range from -10 to 10), bins set to 100
pyplot.hist([x,y], bins= 100, range=[-10,10], alpha=0.5, label=['x', 'y'])
#plot legend
pyplot.legend(loc='upper right')
#show it
pyplot.show()

给出以下输出:

在此处输入图像描述

Also an option which is quite similar to joaquin answer:

import random
from matplotlib import pyplot

#random data
x = [random.gauss(3,1) for _ in range(400)]
y = [random.gauss(4,2) for _ in range(400)]

#plot both histograms(range from -10 to 10), bins set to 100
pyplot.hist([x,y], bins= 100, range=[-10,10], alpha=0.5, label=['x', 'y'])
#plot legend
pyplot.legend(loc='upper right')
#show it
pyplot.show()

Gives the following output:

enter image description here

那支青花 2024-12-04 02:18:18

当您想要从二维 numpy 数组绘制直方图时,有一个警告。您需要交换 2 个轴。

import numpy as np
import matplotlib.pyplot as plt

data = np.random.normal(size=(2, 300))
# swapped_data.shape == (300, 2)
swapped_data = np.swapaxes(x, axis1=0, axis2=1)
plt.hist(swapped_data, bins=30, label=['x', 'y'])
plt.legend()
plt.show()

输入图片此处描述

There is one caveat when you want to plot the histogram from a 2-d numpy array. You need to swap the 2 axes.

import numpy as np
import matplotlib.pyplot as plt

data = np.random.normal(size=(2, 300))
# swapped_data.shape == (300, 2)
swapped_data = np.swapaxes(x, axis1=0, axis2=1)
plt.hist(swapped_data, bins=30, label=['x', 'y'])
plt.legend()
plt.show()

enter image description here

云裳 2024-12-04 02:18:18

以防万一您有 pandas(import pandas as pd)或可以使用它:

test = pd.DataFrame([[random.gauss(3,1) for _ in range(400)], 
                     [random.gauss(4,2) for _ in range(400)]])
plt.hist(test.values.T)
plt.show()

Just in case you have pandas (import pandas as pd) or are ok with using it:

test = pd.DataFrame([[random.gauss(3,1) for _ in range(400)], 
                     [random.gauss(4,2) for _ in range(400)]])
plt.hist(test.values.T)
plt.show()
来世叙缘 2024-12-04 02:18:18

这个问题之前已经被回答过,但想添加另一个快速/简单的解决方法,可以帮助其他访问者解决这个问题。

import seasborn as sns 
sns.kdeplot(mydata1)
sns.kdeplot(mydata2)

此处有一些有用的示例,用于 kde 与直方图的比较。

This question has been answered before, but wanted to add another quick/easy workaround that might help other visitors to this question.

import seasborn as sns 
sns.kdeplot(mydata1)
sns.kdeplot(mydata2)

Some helpful examples are here for kde vs histogram comparison.

时光清浅 2024-12-04 02:18:18

受到所罗门答案的启发,但为了坚持与直方图相关的问题,一个干净的解决方案是:

sns.distplot(bar)
sns.distplot(foo)
plt.show()

确保首先绘制较高的,否则您需要设置 plt.ylim(0,0.45) 以便较高的直方图没有被截断。

Inspired by Solomon's answer, but to stick with the question, which is related to histogram, a clean solution is:

sns.distplot(bar)
sns.distplot(foo)
plt.show()

Make sure to plot the taller one first, otherwise you would need to set plt.ylim(0,0.45) so that the taller histogram is not chopped off.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文