Precision of cdf in scipy.stats
I'm using the chi2 distribution as a theoretical problem for a simulation system.
For a given interval, I need to estimate this distribution as a PMF defined as the integral of the PDF inside that interval. This value should be near the value of the PDF at the center of the interval, but can be slightly different, depending on the shape of the PDF.
Here is what I do:
import numpy
from scipy.stats import chi2
dist = chi2(10)
nbins = 120
F = dist.cdf(numpy.arange(nbins+1))
pmf = F[1:] - F[:-1] # surface inside the interval
pmf /= pmf.sum() # Normalisation
The problem is that chi2.cdf(100, 10) (and anything above 100) gives exactly 1.0, so the smallest nonzero difference I'm able to get is around 1.11e-16. But chi2.pdf(100, 10) isn't exactly 0 (it's about 2.5e-17).
My question is: how can I get my pmf estimate with greater precision (maybe down to 1e-25)? Why is the cdf function less precise than the pdf function?
Comments (2)
The cdf is, within floating point precision, equal to one, but sf (the survival function, 1 - cdf) is close to zero, so tiny differences, around 1e-20, are not covered up by the big 1. (See the JABS reference.)
I don't know how far the accurate range of the sf, i.e. scipy.special.chdtrc(df, x), goes.
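To make the sf idea concrete, here is a rough sketch (my own illustration, reusing the question's `dist` and `nbins`; the names `edges`, `S`, `pmf_sf` and `gap_below_one` are just placeholders). It assumes the bin edges are the integers 0..120 as in the question, and it only relies on `chi2.sf` and `numpy.nextafter`:

```python
import numpy as np
from scipy.stats import chi2

dist = chi2(10)
nbins = 120
edges = np.arange(nbins + 1)

# Distance from 1.0 to the next smaller double (2**-53, about 1.11e-16).
# A cdf value closer to 1 than this cannot be told apart from 1.0.
gap_below_one = 1.0 - np.nextafter(1.0, 0.0)

# Survival function sf = 1 - cdf, evaluated directly, so the upper tail
# is stored as a small number instead of being rounded against 1.
S = dist.sf(edges)
pmf_sf = S[:-1] - S[1:]        # same bin mass as F[1:] - F[:-1]

# cdf-based version from the question, for comparison
F = dist.cdf(edges)
pmf_cdf = F[1:] - F[:-1]

print(gap_below_one)
print(pmf_cdf[100], pmf_sf[100])   # bin [100, 101]
```

The sf difference is only the better choice in the upper tail. Near zero the roles are swapped (sf is close to 1 there), so for the lowest bins the cdf difference should be the more accurate one; picking cdf below the median and sf above it would be one way to combine them.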
Usually whenever I have a precision problem the first tool I reach for is mpmath. 90% of the time it Just Works(tm), quickly enough. In this case, we can write:
giving (using your F):
It should be straightforward to use quad to get any normalization you need.
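One way the mpmath approach could look (a sketch under my own assumptions, not necessarily the code this answer had in mind): the chi2 cdf with k degrees of freedom is the regularized lower incomplete gamma function P(k/2, x/2), which `mpmath.gammainc` can evaluate at arbitrary precision. The helper name `chi2_cdf` and the choice of 50 digits are illustrative.

```python
import mpmath

mpmath.mp.dps = 50  # work with 50 significant decimal digits


def chi2_cdf(x, k):
    """chi2 cdf as the regularized lower incomplete gamma P(k/2, x/2)."""
    x, k = mpmath.mpf(x), mpmath.mpf(k)
    if x == 0:
        return mpmath.mpf(0)  # empty integration range
    return mpmath.gammainc(k / 2, 0, x / 2, regularized=True)


k = 10
nbins = 120

# Same construction as in the question, but every value is an mpf,
# so differences of numbers very close to 1 keep their digits.
F = [chi2_cdf(i, k) for i in range(nbins + 1)]
pmf = [F[i + 1] - F[i] for i in range(nbins)]

# Normalisation over the 120 bins
total = mpmath.fsum(pmf)
pmf = [p / total for p in pmf]

print(mpmath.nstr(pmf[100], 10))  # bin [100, 101]
print(mpmath.nstr(pmf[-1], 10))   # bin [119, 120]
```

With 50 digits the last bin (on the order of 1e-21) is still resolved to many significant figures. Raise `mp.dps` if you need to go further into the tail, and convert with `float(p)` at the end if the rest of the pipeline expects doubles.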