Perl statistics r probability

How can I calculate the probability of a point given a normal distribution in Perl?

Posted on 2024-08-04 06:41:25

Is there a Perl package that computes the height of a probability distribution at a given point? For example, in R this can be done as follows:

> dnorm(0, mean=4, sd=10)
[1] 0.03682701

That is, the probability that the point x=0 falls into a normal distribution with mean=4 and sd=10 is 0.0368.
I looked at Statistics::Distributions, but it does not provide a function for this.

趁年轻赶紧闹 2024-08-11 06:41:25

dnorm(0, mean=4, sd=10) does not give you the probability of such a point occurring. To quote Wikipedia on the probability density function:

"In probability theory, a probability density function (pdf), often referred to as a probability distribution function, or density, of a random variable is a function that describes the density of probability at each point in the sample space. The probability of a random variable falling within a given set is given by the integral of its density over the set."

The probability you mention is

R> pnorm(0, 4, 10)
[1] 0.3446

that is, a 34.46% chance of getting a value equal to or smaller than 0 from a N(4, 10) distribution (mean 4, sd 10).

As for your Perl question: if you know how to do it in R but need it from Perl, you could write a Perl extension based on R's libRmath (provided in Debian by the package r-mathlib) to make those functions available in Perl. This does not require the R interpreter.

Otherwise, you could try the GNU GSL or the Cephes libraries for access to these special functions.

止于盛夏 2024-08-11 06:41:25

Why not something along these lines (I am writing in R, but it could be done in Perl with Statistics::Distributions):

dn <- function(x = 0,      # value
               mean = 0,   # mean
               sd = 1,     # standard deviation
               sc = 10000  # precision scale: the step size is 1/sc
               ) {
  # forward difference of the CDF, which approximates the density at x
  res <- (pnorm(x + 1/sc, mean = mean, sd = sd) - pnorm(x, mean = mean, sd = sd)) * sc
  res
}
> dn(0, 4, 10, 10000)
0.03682709
> dn(2.02, 2, .24)
1.656498

[edit 1] I should mention that this approximation can get pretty bad in the far tails. It might or might not matter, depending on your application.

[edit 2] @foolishbrat: I turned the code into a function. The result should always be positive. Perhaps you are forgetting that the function in the Perl module you mention returns the upper probability 1-F, whereas R returns F?

[edit 3] Fixed a copy-and-paste error.
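
For readers who want the same idea in Perl without a CPAN module, here is a minimal sketch. It assumes Perl 5.22 or later, where the core POSIX module exports erfc; the sub names pnorm, dn_forward and dn_central are made up for illustration and are not part of any library. It also shows a central difference, which costs the same two CDF calls but usually has a smaller truncation error than the forward difference used in the R function above.

```perl
use strict;
use warnings;
use POSIX qw(erfc);   # assumption: Perl 5.22+, where POSIX provides erfc

# Normal CDF: F(x) = 0.5 * erfc(-(x - mean) / (sd * sqrt(2)))
sub pnorm {
    my ($x, $mean, $sd) = @_;
    return 0.5 * erfc(-($x - $mean) / ($sd * sqrt(2)));
}

# Forward difference of the CDF with step $h (same scheme as the R code)
sub dn_forward {
    my ($x, $mean, $sd, $h) = @_;
    return (pnorm($x + $h, $mean, $sd) - pnorm($x, $mean, $sd)) / $h;
}

# Central difference of the CDF with step $h
sub dn_central {
    my ($x, $mean, $sd, $h) = @_;
    return (pnorm($x + $h, $mean, $sd) - pnorm($x - $h, $mean, $sd)) / (2 * $h);
}

printf "forward: %.8f\n", dn_forward(0, 4, 10, 1e-4);
printf "central: %.8f\n", dn_central(0, 4, 10, 1e-4);
```

Both values should agree with R's dnorm(0, mean=4, sd=10) to at least six decimal places. If the density itself is what you need, evaluating the closed-form formula directly is simpler and more accurate than differencing the CDF.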

从﹋此江山别 2024-08-11 06:41:25

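If you really want the density function, why not compute it directly:

use strict; use warnings;

my $pi   = 4 * atan2(1, 1);
my $x    = 2.02;
my $mean = 2;
my $sd   = 0.24;
# normal density: exp(-(x - mean)^2 / (2 sd^2)) / (sd * sqrt(2 pi))
print 1 / ($sd * sqrt(2 * $pi)) * exp(-($x - $mean)**2 / (2 * $sd**2)), "\n";

It gives approximately 1.656498, the same as dnorm in R.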

复古式 2024-08-11 06:41:25

I don't think Jouni is quite right. This seems to give a reasonable version of the PDF (extract the body of the loop if you just want a specific x-y point):

#!/usr/bin/perl

use strict;
use warnings;
use Getopt::Std;
use POSIX qw(ceil floor);

# Usage
# Outputs the normal density function given a mean and sd
# -s standard deviation
# -m mean
# -n normalization factor (multiply result by this amount), optional

my %para = ();
getopts('s:m:n:', \%para);
if (!exists $para{'s'} || !exists $para{'m'}) {
    die "mean and standard deviation required\n";
}

my $norm = exists $para{'n'} ? $para{'n'} : 1.0;

my $sd   = $para{'s'};
my $mean = $para{'m'};

# tabulate from 5 sd below the mean to 5 sd above it
my $start = floor($mean - ($sd * 5));
my $end   = ceil($mean + ($sd * 5));

my $pi  = 4 * atan2(1, 1);
my $var = $sd**2;

# step by an integer counter so rounding error in 0.1 does not accumulate
my $steps = ($end - $start) * 10;
for my $i (0 .. $steps - 1) {
    my $x = $start + $i * 0.1;
    my $e = exp(-1 * (($x - $mean)**2) / (2 * $var));
    my $d = sqrt($var) * sqrt(2 * $pi);
    my $y = 1.0 / $d * $e * $norm;
    printf("%5.5f %5.5f\n", $x, $y);
}

蓝戈者 2024-08-11 06:41:25

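As others have pointed out, you probably want the cumulative distribution function. It can be obtained from the error function (shift by the mean and scale by the standard deviation of your normal distribution), which exists in the standard math library and is accessible in Perl through Math::Libm.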
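
The relationship can be sketched as follows. This is only an illustration under a stated assumption: it takes erf from the core POSIX module, which provides it on Perl 5.22 and later, rather than from Math::Libm, and the sub name pnorm is chosen here to mirror R and is not a library function.

```perl
use strict;
use warnings;
use POSIX qw(erf);   # assumption: Perl 5.22+; older Perls need a module that provides erf

# Normal CDF via the error function:
#   F(x) = 0.5 * (1 + erf(z / sqrt(2))),  with  z = (x - mean) / sd
sub pnorm {
    my ($x, $mean, $sd) = @_;
    my $z = ($x - $mean) / $sd;   # shift by the mean, scale by the sd
    return 0.5 * (1 + erf($z / sqrt(2)));
}

# Probability of a value at or below 0 for mean 4, sd 10
printf "P(X <= 0)       = %.4f\n", pnorm(0, 4, 10);

# Probability of an interval is a difference of two CDF values
printf "P(-6 < X <= 14) = %.4f\n", pnorm(14, 4, 10) - pnorm(-6, 4, 10);
```

The first line should reproduce the 0.3446 that pnorm(0, 4, 10) returns in R. For values far in the left tail, the complementary function erfc tends to lose less precision than 1 + erf.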

樱娆 2024-08-11 06:41:25

Using Perl's Statistics::Distributions, you can get the cumulative probability at a point (the value R's pnorm returns, not the density height from dnorm) with:

#!/usr/bin/perl

use strict; use warnings;
use Statistics::Distributions qw(uprob);

my $x       = 0;
my $mean    = 4;
my $stdev   = 10;

# uprob returns the upper-tail probability 1-F, so subtract it from 1 to get F
print "Cumulative probability P(X <= $x) = "
    . (1-uprob(($x-$mean)/$stdev))."\n";

This results in "Cumulative probability P(X <= 0) = 0.34458"

无名指的心愿 2024-08-11 06:41:25

Here's how you can do the same thing you're doing with R in Perl using the Math::SymbolicX::Statistics::Distributions module from CPAN:

use strict; use warnings;

use Math::SymbolicX::Statistics::Distributions qw/normal_distribution/;

my $norm = normal_distribution(qw/mean sd/);
print $norm->value(mean => 4, sd => 10, x => 0), "\n";

# curry it with the parameter values
$norm->implement(mean => 4, sd => 10);
print $norm->value(x => 0),"\n"; # prints the same as above

The normal_distribution() function from that module is a generator for functions. $norm will be a Math::Symbolic (::Operator) object that you can modify, for example with implement, which in the above example replaces the two parameter variables with constants.

Note, however, that as Dirk pointed out, you probably want the cumulative function of the normal distribution, or more generally the integral over a certain range.

Unfortunately, Math::Symbolic can't do integration symbolically. Therefore, you have to resort to numerical integration with the likes of Math::Integral::Romberg. (Alternatively, search CPAN for an implementation of the error function.) This may be slow, but it's still easy to do. Add this to the above snippet:

use Math::Integral::Romberg 'integral';

my ($int_sub) = $norm->to_sub(); # compile to a faster Perl sub
print $int_sub->(0),"\n";  # same number as above

print "p=" . integral($int_sub, -100., 0) . "\n";
# -100 is an arbitrary lower bound, far enough into the left tail to stand in for -infinity

This should give you the ~0.344578258389676 from Dirk's answer.
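
If neither CPAN module is available, the same numerical integration can be sketched in plain Perl. The version below swaps in composite Simpson's rule instead of Romberg integration, and the sub names dnorm and simpson are hypothetical helpers written for this example, not functions from Math::Symbolic or Math::Integral::Romberg.

```perl
use strict;
use warnings;

# Normal density with mean $mean and standard deviation $sd
sub dnorm {
    my ($x, $mean, $sd) = @_;
    my $pi = 4 * atan2(1, 1);
    return exp(-($x - $mean)**2 / (2 * $sd**2)) / ($sd * sqrt(2 * $pi));
}

# Composite Simpson's rule on [$lo, $hi] with $n subintervals ($n must be even)
sub simpson {
    my ($f, $lo, $hi, $n) = @_;
    die "n must be a positive even integer\n" if $n < 2 || $n % 2;
    my $h   = ($hi - $lo) / $n;
    my $sum = $f->($lo) + $f->($hi);
    for my $i (1 .. $n - 1) {
        # interior points alternate between weight 4 (odd) and weight 2 (even)
        $sum += ($i % 2 ? 4 : 2) * $f->($lo + $i * $h);
    }
    return $sum * $h / 3;
}

# Integrate the density from -100 (standing in for -infinity) up to 0
my $p = simpson(sub { dnorm($_[0], 4, 10) }, -100, 0, 2000);
printf "p = %.6f\n", $p;
```

With these settings the result should match the value quoted above to about six decimal places. The lower bound and the number of subintervals are arbitrary choices that happen to be ample for mean 4 and sd 10; other parameters may need different values.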
