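How can I calculate the probability for a point given a normal distribution in Perl?
Is there a package in Perl that lets you compute the height of the probability distribution at a given point? For example, this can be done in R as follows: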
> dnorm(0, mean=4,sd=10)
> 0.03682701
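That is, the probability of the point x=0 falling into a normal distribution with mean=4 and sd=10 is 0.0368. I looked at Statistics::Distributions, but it does not provide a function that does this.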
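dnorm(0, mean=4, sd=10) does not give you the probability of such a point occurring. It gives the value of the probability density function at x=0 (see Wikipedia on the probability density function); for a continuous distribution, the probability of any single point is zero.
The probability you mention is the cumulative one, which R computes with pnorm(0, mean=4, sd=10): 0.3446,
or a 34.46% chance of getting a value equal to or smaller than 0 from a N(4, 10) distribution.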
As for your Perl question: If you know how to do it in R, but need it from Perl, maybe you need to write a Perl extension based on R's libRmath (provided in Debian by the package r-mathlib) to get those functions to Perl? This does not require the R interpreter.
Otherwise, you could try the GNU GSL or the Cephes libraries for access to these special functions.
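For reference, the two quantities being distinguished here can be written out. This is the standard textbook form, with μ = 4 and σ = 10 taken from the question; the symbols f and F are introduced only for illustration:

```latex
\[
\begin{aligned}
f(x) &= \frac{1}{\sigma\sqrt{2\pi}}
        \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),
     & f(0) &\approx 0.0368 \\
F(x) &= P(X \le x)
      = \frac{1}{2}\,\operatorname{erfc}\!\left(-\frac{x-\mu}{\sigma\sqrt{2}}\right),
     & F(0) &\approx 0.3446
\end{aligned}
\]
```

The first value is what dnorm returns (a density, not a probability); the second is the 34.46% quoted above.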
Why not something along these lines (I am writing in R, but it could be done in Perl with Statistics::Distributions):
[edit:1] I should mention that this approximation can get pretty horrible in the far tails. It might or might not matter, depending on your application.
[edit:2] @foolishbrat Turned the code into a function. The result should always be positive. Perhaps you are forgetting that in the Perl module you mention, the function returns the upper probability 1-F, whereas R returns F?
[edit: 3] fixed a copy and paste error.
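The approximation referred to here is presumably a difference of CDF values over a small interval. A minimal Perl sketch of that idea follows; it is a reading of the suggestion, not the answer's own code. It assumes Perl 5.22 or later for POSIX::erfc (instead of Statistics::Distributions), and the names pnorm, dnorm_approx and the step size $h are hypothetical.

```perl
use strict;
use warnings;
use POSIX ();    # POSIX::erfc needs Perl 5.22 or later

# Lower-tail CDF F(x) of a normal distribution (note: F, not 1-F).
sub pnorm {
    my ($x, $mean, $sd) = @_;
    return 0.5 * POSIX::erfc(-($x - $mean) / ($sd * sqrt(2)));
}

# Central difference of the CDF as an approximation to the density.
# $h is an assumed step size; it trades truncation error for rounding error.
sub dnorm_approx {
    my ($x, $mean, $sd, $h) = @_;
    $h //= 1e-4;
    return (pnorm($x + $h, $mean, $sd) - pnorm($x - $h, $mean, $sd)) / (2 * $h);
}

my $approx = dnorm_approx(0, 4, 10);
printf "%.6f\n", $approx;    # prints 0.036827
```

Because this subtracts two nearly equal numbers, precision degrades where the CDF is close to 0 or 1, which is one way the far tails can go wrong.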
If you really want the density function, why not use it directly:
It gives 1.65649768474891, about the same as dnorm in R.
I don't think Jouni is quite right. This seems to give a reasonable version of the PDF (extract the middle of the loop if you just want a specific x-y point):
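A minimal sketch of such a PDF loop, using only core Perl. The function name dnorm (borrowed from R) and the chosen x values are assumptions for illustration; the mean and sd are the question's.

```perl
use strict;
use warnings;

# Density of a normal distribution with the given mean and sd, evaluated at x.
sub dnorm {
    my ($x, $mean, $sd) = @_;
    my $pi = 4 * atan2(1, 1);
    my $z  = ($x - $mean) / $sd;
    return exp(-0.5 * $z * $z) / ($sd * sqrt(2 * $pi));
}

# Tabulate a few points of the curve; the loop body alone gives one (x, y) pair.
for my $x (-4, -2, 0, 2, 4) {
    printf "%3d  %.8f\n", $x, dnorm($x, 4, 10);
}
```

The row for x = 0 reads 0.03682701, matching the dnorm value in the question.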
As others have pointed out, you probably want the cumulative distribution function. This can be obtained via the error function (shifted by the mean and scaled by the standard deviation of your normal distribution), which exists in the standard math library and is made accessible in Perl by Math::Libm.
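A short sketch of the error-function route. To stay within core modules it assumes POSIX::erfc (available from Perl 5.22) rather than Math::Libm; the erfc exported by Math::Libm could be swapped in. The function name pnorm is borrowed from R and is not part of either module.

```perl
use strict;
use warnings;
use POSIX ();    # POSIX::erfc needs Perl 5.22 or later

# CDF of a normal distribution: shift by the mean, scale by the
# standard deviation (and sqrt(2)), then apply the complementary error function.
sub pnorm {
    my ($x, $mean, $sd) = @_;
    my $z = ($x - $mean) / ($sd * sqrt(2));
    return 0.5 * POSIX::erfc(-$z);
}

printf "P(X <= 0) = %.5f\n", pnorm(0, 4, 10);    # prints P(X <= 0) = 0.34458
```

Using erfc rather than 1 + erf avoids some cancellation in the lower tail.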
Using Perl's Statistics::Distributions, you can achieve this with:
This results in "Height of probability distribution at point 0 = 0.34458"
Here's how you can do the same thing you're doing with R in Perl using the Math::SymbolicX::Statistics::Distributions module from CPAN:
The normal_distribution() function from that module is a generator for functions. $norm will be a Math::Symbolic (::Operator) object that you can modify, for example with implement, which, in the above example, replaces the two parameter variables with constants.
Note, however, as Dirk pointed out, that you probably want the cumulative function of the normal distribution, or more generally the integral over a certain range.
Unfortunately, Math::Symbolic can't do integration symbolically. Therefore, you'd have to resort to numerical integration with the likes of Math::Integral::Romberg. (Alternatively, search CPAN for an implementation of the error function.) This may be slow, but it's still easy to do. Add this to the above snippet:
This should give you the ~0.344578258389676 from Dirk's answer.
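The numerical-integration step can also be sketched without any CPAN modules. The version below swaps Math::Integral::Romberg for a hand-written composite Simpson's rule and a plain density function in place of the Math::Symbolic object. The names dnorm and simpson, the lower limit of mean - 10*sd standing in for minus infinity, and the 2000 subintervals are all assumptions.

```perl
use strict;
use warnings;

# Density of a normal distribution with the given mean and sd, evaluated at x.
sub dnorm {
    my ($x, $mean, $sd) = @_;
    my $pi = 4 * atan2(1, 1);
    my $z  = ($x - $mean) / $sd;
    return exp(-0.5 * $z * $z) / ($sd * sqrt(2 * $pi));
}

# Composite Simpson's rule for $f on [$lo, $hi] with $n (even) subintervals.
sub simpson {
    my ($f, $lo, $hi, $n) = @_;
    my $h   = ($hi - $lo) / $n;
    my $sum = $f->($lo) + $f->($hi);
    for my $i (1 .. $n - 1) {
        $sum += ($i % 2 ? 4 : 2) * $f->($lo + $i * $h);
    }
    return $sum * $h / 3;
}

my ($mean, $sd) = (4, 10);
# Integrate the density from far in the lower tail up to 0.
my $area = simpson(sub { dnorm($_[0], $mean, $sd) }, $mean - 10 * $sd, 0, 2000);
printf "%.6f\n", $area;    # prints 0.344578
```

The probability mass below mean - 10*sd is negligible at this precision, so truncating the lower limit does not affect the printed digits.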