Fitting a gamma distribution with (python) Scipy

Posted on 2024-09-02 11:50:38

Can anyone help me fit a gamma distribution in Python? I have some data (X and Y coordinates) and I want to find the gamma parameters that fit this distribution. The Scipy docs show that a fit method exists, but I don't know how to use it. First, what format must the "data" argument be in? And how do I provide the second argument (the parameters), since those are exactly what I'm looking for?

Comments (5)

预谋 2024-09-09 11:50:38

Generate some gamma data:

import scipy.stats as stats    
alpha = 5
loc = 100.5
beta = 22
data = stats.gamma.rvs(alpha, loc=loc, scale=beta, size=10000)    
print(data)
# [ 202.36035683  297.23906376  249.53831795 ...,  271.85204096  180.75026301
#   364.60240242]

Here we fit the data to the gamma distribution:

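# fit returns the estimates in the order (shape, loc, scale)
fit_alpha, fit_loc, fit_beta = stats.gamma.fit(data)
print(fit_alpha, fit_loc, fit_beta)
# 5.0833692504230008 100.08697963283467 21.739518937816108

# true parameters used to generate the data, for comparison
print(alpha, loc, beta)
# 5 100.5 22
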
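One way to check such a fit is a Kolmogorov-Smirnov test of the data against the fitted distribution. The following is only a sketch: the seed and the choice of `stats.kstest` are assumptions added for illustration, not part of the answer above, and the p-value is optimistic because the parameters were estimated from the same data.

```python
import numpy as np
from scipy import stats

# Reproducible sample with the same illustrative parameters as above.
rng = np.random.default_rng(0)  # seed chosen arbitrarily
data = stats.gamma.rvs(5, loc=100.5, scale=22, size=10000, random_state=rng)

# fit returns (shape, loc, scale), in that order.
fit_alpha, fit_loc, fit_beta = stats.gamma.fit(data)

# Compare the empirical distribution with the fitted gamma CDF.
ks_stat, p_value = stats.kstest(data, "gamma", args=(fit_alpha, fit_loc, fit_beta))
print(ks_stat, p_value)
```

A small KS statistic suggests that the fitted curve tracks the empirical distribution closely.
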
李不 2024-09-09 11:50:38

I was unsatisfied with the ss.gamma.rvs function because it can generate negative numbers (this happens when the location parameter loc is negative), something the gamma distribution is not supposed to have. So I fitted the sample through expected value = mean(data) and variance = var(data) (see Wikipedia for details) and wrote a function that yields random samples of a gamma distribution without scipy (which, as a side note, I found hard to install properly):

import random
import numpy

data = [6176, 11046, 670, 6146, 7945, 6864, 767, 7623, 7212, 9040, 3213, 6302, 10044, 10195, 9386, 7230, 4602, 6282, 8619, 7903, 6318, 13294, 6990, 5515, 9157]

# Fit gamma distribution through mean and variance (method of moments)
mean_of_distribution = numpy.mean(data)
variance_of_distribution = numpy.var(data)

def gamma_random_sample(mean, variance, size):
    """Yield `size` random numbers from a gamma distribution defined by mean and variance"""
    g_alpha = mean * mean / variance  # shape
    g_beta = mean / variance          # rate (inverse of scale)
    for _ in range(size):
        # random.gammavariate takes (shape, scale), hence 1 / g_beta
        yield random.gammavariate(g_alpha, 1 / g_beta)

# truncate to integers to get an integer sample
grs = [int(i) for i in gamma_random_sample(mean_of_distribution, variance_of_distribution, len(data))]

print("Original data: ", sorted(data))
print("Random sample: ", sorted(grs))

# Original data: [670, 767, 3213, 4602, 5515, 6146, 6176, 6282, 6302, 6318, 6864, 6990, 7212, 7230, 7623, 7903, 7945, 8619, 9040, 9157, 9386, 10044, 10195, 11046, 13294]
# Random sample:  [1646, 2237, 3178, 3227, 3649, 4049, 4171, 5071, 5118, 5139, 5456, 6139, 6468, 6726, 6944, 7050, 7135, 7588, 7597, 7971, 10269, 10563, 12283, 12339, 13066]
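
The same moment matching can also be sketched with NumPy's own generator, which draws the whole sample in one call. This is an illustrative variant rather than the code above: the seed, the sample size, and the use of `numpy.random.default_rng` are assumptions.

```python
import numpy as np

data = [6176, 11046, 670, 6146, 7945, 6864, 767, 7623, 7212, 9040, 3213, 6302,
        10044, 10195, 9386, 7230, 4602, 6282, 8619, 7903, 6318, 13294, 6990,
        5515, 9157]

mean = np.mean(data)
variance = np.var(data)

# Method of moments: mean = shape * scale, variance = shape * scale**2
shape = mean ** 2 / variance
scale = variance / mean

rng = np.random.default_rng(42)  # seed chosen arbitrarily
sample = rng.gamma(shape, scale, size=100_000)

# The sample moments should be close to the moments of the data.
print(mean, sample.mean())
print(variance, sample.var())
```

With no location shift, every draw is non-negative, which is the property the answer was after.
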
韶华倾负 2024-09-09 11:50:38

If you want a long example including a discussion about estimating or fixing the support of the distribution, then you can find it in https://github.com/scipy/scipy/issues/1359 and the linked mailing list message.

Preliminary support to fix parameters, such as location, during fit has been added to the trunk version of scipy.
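
In released versions of scipy, parameters are fixed during fit with the `floc`, `fscale` and `f0` keywords. A minimal sketch, assuming the data are known to start at zero (the sample parameters and seed are made up for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)  # seed chosen arbitrarily
data = stats.gamma.rvs(3.0, loc=0, scale=2.0, size=5000, random_state=rng)

# floc=0 pins the location at zero, so only shape and scale are estimated.
fit_alpha, fit_loc, fit_beta = stats.gamma.fit(data, floc=0)
print(fit_alpha, fit_loc, fit_beta)
```

Fixing the location this way keeps the fitted distribution supported on non-negative values only.
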

ペ泪落弦音 2024-09-09 11:50:38

OpenTURNS has a simple way to do this with the GammaFactory class.

First, let's generate a sample:

import openturns as ot
gammaDistribution = ot.Gamma()
sample = gammaDistribution.getSample(100)

Then fit a Gamma to it:

distribution = ot.GammaFactory().build(sample)

Then we can draw the PDF of the Gamma:

import openturns.viewer as otv
otv.View(distribution.drawPDF())

which produces:

[Figure: A gamma distribution]

More details on this topic at: http://openturns.github.io/openturns/latest/user_manual/_generated/openturns.GammaFactory.html

潜移默化 2024-09-09 11:50:38

1: the "data" variable can be a Python list or tuple, or a numpy.ndarray, which can be obtained with:

data=numpy.array(data)

where the second data in the line above should be a list or tuple containing your data.

2: the "parameter" variable is a first guess that you can optionally provide to the fitting function as a starting point for the fitting process, so it can be omitted.

3: a note on @mondano's answer. Using moments (mean and variance) to work out the gamma parameters is reasonably good for large shape parameters (alpha>10), but can yield poor results for small values of alpha (see Statistical Methods in the Atmospheric Sciences by Wilks, and Thom, H. C. S., 1958: A note on the gamma distribution. Mon. Wea. Rev., 86, 117–122).

Using maximum likelihood estimators, such as the one implemented in the scipy module, is regarded as a better choice in such cases.
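
The three points above can be sketched together as follows. The sample, the seed and the starting values are assumptions made for illustration; the data are a plain list with one value per observation, the shape guess is passed positionally, and the loc and scale guesses by keyword.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)  # seed chosen arbitrarily
# A plain Python list of observations with a small shape parameter.
data = list(rng.gamma(2.0, 3.0, size=2000))

# Maximum likelihood fit, with optional starting guesses for the optimizer.
a_hat, loc_hat, scale_hat = stats.gamma.fit(data, 2.0, loc=0.0, scale=3.0)

# Method-of-moments estimates (location taken as zero), for comparison.
mean, variance = np.mean(data), np.var(data)
a_mom = mean ** 2 / variance
scale_mom = variance / mean

print("MLE:    ", a_hat, loc_hat, scale_hat)
print("Moments:", a_mom, scale_mom)
```

How far the two sets of estimates differ depends on the sample, so it is worth printing both before choosing one.
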
