Bin size in Matplotlib (histogram)
I'm using matplotlib to make a histogram.
Is there any way to manually set the size of the bins as opposed to the number of bins?
Actually, it's quite easy: instead of the number of bins you can give a list with the bin boundaries. They can be unequally distributed, too.
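A minimal sketch of passing explicit bin boundaries, assuming a small made-up list called data; the values and the variable names are illustrative only. The last bin is deliberately wider than the others to show that unequal widths are accepted.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

data = [3, 7, 12, 18, 25, 33, 41, 48, 75, 99]  # made-up sample values

# Six bins: five of width 10, then one wide bin from 50 to 100.
bin_edges = [0, 10, 20, 30, 40, 50, 100]

# hist returns the per-bin counts, the edges it used, and the bar artists.
counts, edges, _ = plt.hist(data, bins=bin_edges)
```

In an interactive session, drop the matplotlib.use line and call plt.show() to see the figure.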
If you just want them equally distributed, you can simply use range:
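One way this could look for integer data, as a sketch: binwidth is a hypothetical name for the desired bin size, and the data values are made up. The stop value is pushed one bin width past the maximum so that the last edge always reaches the largest data point.

```python
data = [3, 7, 12, 18, 25, 33, 41, 48]  # made-up integer sample
binwidth = 10                          # hypothetical desired bin size

# range() excludes its stop value, so extend it by one bin width
# to guarantee the final edge is at or above max(data).
bin_edges = list(range(min(data), max(data) + binwidth, binwidth))
```

The resulting list can be passed directly as the bins argument of plt.hist.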
Added to original answer:
The above line works for data filled with integers only. As macrocosme points out, for floats you need a float-aware alternative to range.
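For float data, NumPy's np.arange is the usual float-capable counterpart to range; that this is the call the answer had in mind is an assumption. The sketch below uses made-up data and a hypothetical binwidth, and checks the result with np.histogram, which bins data the same way plt.hist does.

```python
import numpy as np

data = [0.3, 1.1, 1.7, 2.4, 3.9]  # made-up float sample
binwidth = 0.5                    # hypothetical desired bin size

# np.arange accepts float start/stop/step; the stop is extended by one
# bin width so the last edge lies at or beyond max(data).
bin_edges = np.arange(min(data), max(data) + binwidth, binwidth)

# Count the values per bin to confirm every point is covered.
counts, _ = np.histogram(data, bins=bin_edges)
```

Because of floating-point rounding, np.arange can occasionally produce one edge more or fewer than expected for some step sizes, so it is worth checking the last edge when exact coverage matters.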
For N bins, the bin edges are specified by a list of N+1 values, where the first N give the lower bin edges and the last one gives the upper edge of the last bin.
Note that linspace produces an array from min_edge to max_edge broken into N+1 values, i.e. N bins.
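A short sketch of the N+1 edges idea using np.linspace. The names min_edge and max_edge come from the note above; the concrete numbers are made up.

```python
import numpy as np

min_edge, max_edge = 0.0, 10.0  # made-up range
N = 5                           # number of bins wanted

# N bins need N + 1 edges: N lower edges plus the final upper edge.
bin_edges = np.linspace(min_edge, max_edge, N + 1)
```

Here every bin has width (max_edge - min_edge) / N, and bin_edges can be passed as the bins argument of plt.hist.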
I use quantiles to make the bins uniform and fitted to the sample.
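A possible reading of the quantile approach, sketched with np.quantile on synthetic data: the bin edges are placed at evenly spaced quantiles, so each bin holds roughly the same number of points. The sample, the seed, and n_bins are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.exponential(scale=2.0, size=1000)  # synthetic skewed data

n_bins = 10  # hypothetical number of bins

# Edges at the 0%, 10%, ..., 100% quantiles of the sample.
bin_edges = np.quantile(sample, np.linspace(0, 1, n_bins + 1))

# Each bin should now contain about len(sample) / n_bins points.
counts, _ = np.histogram(sample, bins=bin_edges)
```

The bins end up narrow where the data is dense and wide where it is sparse, so plotting with density=True in plt.hist is usually more readable than raw counts.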
I guess the easy way would be to calculate the minimum and maximum of the data you have, then calculate L = max - min. Then you divide L by the desired bin width (I'm assuming this is what you mean by bin size) and use the ceiling of this value as the number of bins.
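The calculation above can be sketched as a small helper. The function name bin_count and the sample values are hypothetical.

```python
import math

def bin_count(data, bin_width):
    """Number of bins needed to cover the data at roughly bin_width each."""
    span = max(data) - min(data)  # L = max - min
    # Ceiling so the bins cover the whole span; at least one bin.
    return max(1, math.ceil(span / bin_width))

n_bins = bin_count([0, 3, 25], 10)  # span 25 / width 10 -> ceil(2.5)
```

Note that when bins is an integer, matplotlib splits the range from min to max into that many equal bins, so the actual width is L / n_bins, which can be slightly smaller than the width asked for.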
I had the same issue as OP (I think!), but I couldn't get it to work in the way that Lastalda specified. I don't know if I have interpreted the question properly, but I have found another solution (it probably is a really bad way of doing it though).
This was the way that I did it:
plt.hist([1,11,21,31,41], bins=[0,10,20,30,40,50], weights=[10,1,40,33,6]);
This draws five bars, one per bin.
So the first parameter basically 'initialises' the bins: each value is chosen to fall inside one of the intervals set by the bins parameter. To see this, compare the array in the first parameter ([1,11,21,31,41]) with the 'bins' array in the second parameter ([0,10,20,30,40,50]); each value lands in exactly one bin.
Then I'm using the 'weights' parameter to define the height of each bar. This is the array used for the weights parameter: [10,1,40,33,6].
So the 0 to 10 bin is given the value 10, the 10 to 20 bin is given the value 1, the 20 to 30 bin is given the value 40, etc.
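To check what the weights trick produces without drawing anything, the same arguments can be fed to np.histogram, which plt.hist uses for its binning. The variable names are illustrative.

```python
import numpy as np

values = [1, 11, 21, 31, 41]          # one placeholder value per bin
bin_edges = [0, 10, 20, 30, 40, 50]   # five bins of width 10
weights = [10, 1, 40, 33, 6]          # desired bar heights

# Each value contributes its weight (instead of 1) to the bin it falls in.
heights, _ = np.histogram(values, bins=bin_edges, weights=weights)
```

Since each bin receives exactly one value, the bar heights equal the weights.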
I like things to happen automatically and for bins to fall on "nice" values. The following seems to work quite well.
The resulting bin edges fall on whole multiples of the bin size.
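The answer's own code is not shown, so the following is only one way to get "nice" edges: round the data minimum down and the maximum up to multiples of the bin size. The helper name nice_bin_edges and the sample values are hypothetical.

```python
import numpy as np

def nice_bin_edges(data, bin_size):
    """Bin edges on multiples of bin_size that cover all of data."""
    lo = np.floor(np.min(data) / bin_size) * bin_size  # round min down
    hi = np.ceil(np.max(data) / bin_size) * bin_size   # round max up
    if hi == lo:                                       # all values equal
        hi = lo + bin_size
    n_bins = int(round((hi - lo) / bin_size))
    # linspace avoids the accumulated rounding error of repeated addition.
    return np.linspace(lo, hi, n_bins + 1)

edges = nice_bin_edges([3.2, 17.9, 44.1], 10)
```

The returned array can be passed as the bins argument of plt.hist.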
If you also care about the visual side, you can add edgecolor='white', linewidth=2 so that the bars are visibly separated.
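A brief sketch of those two keyword arguments in use, with made-up data and bin edges:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

data = [1, 2, 2, 3, 3, 3, 7, 8, 8, 12]  # made-up sample values

fig, ax = plt.subplots()
# A white outline of width 2 leaves a visible gap between adjacent bars.
counts, edges, patches = ax.hist(
    data, bins=[0, 5, 10, 15], edgecolor="white", linewidth=2
)
```

The outline only changes how the bars are drawn; the bin edges and counts stay the same.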
I use a heat map in the form of a hist2d plot. Additionally, I set cmin=0.5 so that bins with no counts are left blank, and cmap to choose the colors; the r (the _r suffix on a colormap name) gives the reverse of the given colormap.
Describe statistics.
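A sketch of a hist2d call along the lines described, with explicit bin edges on both axes. The synthetic data, the bin width of 0.5, and the choice of viridis_r as the colormap are assumptions, not taken from the answer.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=500)  # synthetic data
y = rng.normal(size=500)

# Bin edges every 0.5 from -4 to 4 on both axes (16 bins each).
x_edges = np.arange(-4, 4.5, 0.5)
y_edges = np.arange(-4, 4.5, 0.5)

fig, ax = plt.subplots()
# cmin=0.5 masks bins whose count is below 0.5, i.e. empty bins;
# the "_r" suffix selects the reversed version of the colormap.
counts, xe, ye, image = ax.hist2d(
    x, y, bins=[x_edges, y_edges], cmin=0.5, cmap="viridis_r"
)
fig.colorbar(image, ax=ax)
```

Masked bins come back as NaN in counts and are left undrawn in the plot.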
For a histogram with integer x-values I ended up using bin edges offset by 0.5.
The offset of 0.5 centers the bins on the x-axis values. The plt.xticks call adds a tick for every integer.
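The answer's code is not shown, so this is a sketch of the idea under assumptions: unit-width bins whose edges sit at half-integers, plus one tick per integer. The data is made up, and the Axes method set_xticks is used here in place of plt.xticks.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

data = [1, 2, 2, 3, 5]  # made-up integer sample

# Edges at 0.5, 1.5, ..., 5.5 so each integer sits in the middle of its bin.
# The stop is max + 1.5 because np.arange excludes the stop value.
bin_edges = np.arange(min(data) - 0.5, max(data) + 1.5, 1.0)

fig, ax = plt.subplots()
counts, edges, _ = ax.hist(data, bins=bin_edges)

# One tick per integer value, including the maximum.
ax.set_xticks(range(min(data), max(data) + 1))
```

Each bar is then centered on its integer, with the tick directly beneath it.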