Determining the probability mass function of a random variable

Posted on 2024-09-29 22:35:43

If we have a discrete random variable x and the data pertaining to it in X(n), how can we determine the probability mass function pmf(X) in MATLAB?

Comments (6)

南烟 2024-10-06 22:35:43

You can do this in at least eight different ways (some of them were already mentioned in the other solutions).

Say we have a sample from a discrete random variable:

X = randi([-9 9], [100 1]);

Consider these equivalent solutions (note that I don't assume anything about the range of possible values, just that they are integers):

[V,~,labels] = grp2idx(X);
mx = max(V);

%# TABULATE (internally uses HIST)
t = tabulate(V);
pmf1 = t(:, 3) ./ 100;

%# HIST (internally uses HISTC)
pmf2 = hist(V, mx)' ./ numel(V);                      %#'

%# HISTC
pmf3 = histc(V, 1:mx) ./ numel(V);

%# ACCUMARRAY
pmf4 = accumarray(V, 1) ./ numel(V);

%# SORT/FIND/DIFF
pmf5 = diff( find( [diff([0;sort(V)]) ; 1] ) ) ./ numel(V);

%# SORT/UNIQUE/DIFF
[~,idx] = unique( sort(V), 'last' );   %# 'last' is required since R2013a, where UNIQUE returns first occurrences by default
pmf6 = diff([0;idx]) ./ numel(V);

%# ARRAYFUN
pmf7 = arrayfun(@(x) sum(V==x), 1:mx)' ./ numel(V);   %#'

%# BSXFUN
pmf8 = sum( bsxfun(@eq, V, 1:mx) )' ./ numel(V);      %#'

Note that GRP2IDX is used to map the values to indices starting at 1, which correspond to the entries of each pmf vector (the mapping back to the original values is given by labels). The result of the above, with pmf being any of pmf1 through pmf8, is:

>> [labels pmf]
ans =
           -9         0.03
           -8         0.07
           -7         0.04
           -6         0.07
           -5         0.03
           -4         0.06
           -3         0.05
           -2         0.05
           -1         0.06
            0         0.05
            1         0.04
            2         0.07
            3         0.03
            4         0.09
            5         0.08
            6         0.02
            7         0.03
            8         0.08
            9         0.05
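
GRP2IDX and TABULATE ship with the Statistics Toolbox. When that toolbox is not available, the same value-to-index mapping can be obtained from the third output of UNIQUE. The following is a minimal sketch under that assumption; the sample X is a fixed illustrative vector, not the random sample used above.

```matlab
% Fixed illustrative sample (an assumption, chosen so the result is reproducible)
X = [-2; 0; 0; 3; 3; 3; -2; 0; 3; 3];

% labels: sorted distinct values; V: index into labels for each element of X
[labels, ~, V] = unique(X);

% Count how often each index occurs and normalise by the sample size
pmf = accumarray(V(:), 1) ./ numel(X);

% Show each distinct value next to its estimated probability
disp([labels pmf])
```

Because the indices come from UNIQUE, this also works for negative or non-integer values, and pmf(k) is the relative frequency of labels(k).
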
人生戏 2024-10-06 22:35:43

The following excerpt from the MATLAB documentation shows how to plot a histogram. For a discrete probability function, the frequency distribution might be identical with the histogram.

x = -4:0.1:4;
y = randn(10000,1);
n = hist(y,x);
pmf = n/sum(n);
plot(x, pmf, 'o');

Count the elements that fall into each bin, then divide the counts by their total to get the pmf. As a check, add up all the entries of the pmf: the result must be one.

Hope I'm right with my statements. It's a long time since ...
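
For data that are already discrete, the bins can be centred on the possible values so that each bin collects exactly one value. A minimal sketch, assuming integer-valued data; the sample y and the name centers are illustrative and not taken from the documentation excerpt:

```matlab
% Illustrative integer-valued sample (an assumption)
y = [2 3 3 5 2 3 7 5 3 2];

% One bin centre per integer between the smallest and largest observation
centers = min(y):max(y);

n = hist(y, centers);      % counts per bin; bin k holds the value centers(k)
pmf = n / sum(n);          % relative frequencies

% Sanity check: the probabilities must add up to one
assert(abs(sum(pmf) - 1) < 1e-12);

% Show each value next to its estimated probability
disp([centers(:) pmf(:)])
```

Integers in the range that never occur, such as 4 and 6 here, simply get probability zero.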

故人爱我别走 2024-10-06 22:35:43

How about this function?

function Y = pmf(X)
A = tabulate(X);       % columns of A: value, count, percent
Y = A(:,3)' / 100;     % convert percent to probability, returned as a row vector
end

Is this correct in your opinion?

好久不见√ 2024-10-06 22:35:43

Maybe try making just a function handle so you don't need to store another array:

pmf = @(x) arrayfun(@(y) nnz(DATA==y)/length(DATA),x);
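
A short usage sketch for the handle above, assuming DATA is a numeric vector that already exists when the handle is created. The sample values and the query points are illustrative:

```matlab
% Illustrative sample; DATA must be defined before the handle is created,
% because the anonymous function captures its value at creation time
DATA = [1 1 1 1 2 2 2 3 3 4];

pmf = @(x) arrayfun(@(y) nnz(DATA == y) / length(DATA), x);

% Evaluate at several points at once; values never observed get probability 0
p = pmf([1 2 5]);
disp(p)
```

The trade-off is that every call rescans DATA, so each queried value costs time proportional to the sample size.
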
辞别 2024-10-06 22:35:43

To add yet another option (since there are a number of functions available to do what you want), you could easily compute the pmf using the function ACCUMARRAY if your discrete values are integers greater than 0:

pmf = accumarray(X(:),1)./numel(X);

Here's an example:

>> X = [1 1 1 1 2 2 2 3 3 4];          %# A sample distribution of values
>> pmf = accumarray(X(:),1)./numel(X)  %# Compute the probability mass function

pmf =

    0.4000      %# 1 occurs 40% of the time
    0.3000      %# 2 occurs 30% of the time
    0.2000      %# 3 occurs 20% of the time
    0.1000      %# 4 occurs 10% of the time
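
ACCUMARRAY needs positive integer subscripts, which is why the values must be integers greater than 0. For integers that may be zero or negative, one possible workaround is to shift them so that the smallest value maps to subscript 1. A minimal sketch of that idea; the sample and the names offset and vals are illustrative:

```matlab
% Illustrative sample containing zero and negative integers
X = [-1 0 0 2 -1 0 2 2 2 0];

% Shift so that min(X) maps to subscript 1
offset = 1 - min(X);
pmf = accumarray(X(:) + offset, 1) ./ numel(X);

% vals(k) is the integer whose probability is stored in pmf(k)
vals = (min(X):max(X))';

disp([vals pmf])
```
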
嘿哥们儿 2024-10-06 22:35:43

If I understood correctly, what you need to do is estimate the pdf, except that the values are discrete rather than continuous.

Calculate the occurrences of different values in X(n) and divide by n. To illustrate what I am saying, please allow me to give an example. Assume that you have 10 observations:

X = [1 1 2 3 1 9 12 3 1 2]

then your pmf would look like this:

pmf(X) = [0.4 0.2 0.2 0 0 0 0 0 0.1 0 0 0.1]

edit: this is in principle a frequency histogram, as @zellus has also pointed out
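
The vector in the example can be reproduced by counting, for each integer from 1 to max(X), how many observations equal it, and dividing by the number of observations. A minimal sketch of that arithmetic; the names candidates and counts are illustrative:

```matlab
X = [1 1 2 3 1 9 12 3 1 2];

% Every integer from 1 up to the largest observation is a candidate value
candidates = 1:max(X);

% Compare each observation (rows) with each candidate (columns),
% then count the matches in every column
counts = sum(bsxfun(@eq, X(:), candidates), 1);

% Relative frequencies: occurrences divided by the number of observations
pmf = counts / numel(X);
disp(pmf)
```

Entry k of pmf is the estimated probability of the value k, so the zeros mark integers that never appear in X.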
