Combining two normal random variables

Posted on 2024-10-07 19:03:44

Suppose I have the following two random variables:

X, where mean = 6 and stdev = 3.5
Y, where mean = -42 and stdev = 5

I would like to create a new random variable Z based on the first two, knowing that X happens 90% of the time and Y happens 10% of the time.

It is easy to calculate the mean of Z: 0.9 * 6 + 0.1 * -42 = 1.2

But is it possible to generate random values for Z in a single function?
Of course, I could do something along these lines:

if (randIntBetween(1,10) > 1)
    GenerateRandomNormalValue(6, 3.5);
else
    GenerateRandomNormalValue(-42, 5);

But I would really like to have a single function that would act as a probability density function for such a random variable (Z), which is not necessarily normal.

Sorry for the crappy pseudo-code.

Thanks for your help!

Edit: here is one concrete question:

Let's say we add up 5 consecutive values drawn from Z. What would be the probability of ending up with a number higher than 10?

难忘№最初的完美 2024-10-14 19:03:44

"But I would really like to have a single function that would act as a probability density function for such a random variable (Z) that is not necessarily normal."

Okay, if you want the density, here it is:

rho = 0.9 * density_of_x + 0.1 * density_of_y
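
To make the formula concrete, here is a minimal Python sketch of that density, assuming X and Y are both normal with the parameters given in the question. The names `normal_pdf` and `density_of_z` are made up for illustration.

```python
import math

def normal_pdf(x, mean, stdev):
    """Density of a normal distribution with the given mean and stdev."""
    z = (x - mean) / stdev
    return math.exp(-0.5 * z * z) / (stdev * math.sqrt(2.0 * math.pi))

def density_of_z(x):
    """Mixture density: weight 0.9 on X ~ N(6, 3.5), weight 0.1 on Y ~ N(-42, 5)."""
    return 0.9 * normal_pdf(x, 6.0, 3.5) + 0.1 * normal_pdf(x, -42.0, 5.0)

# Rough sanity check: a Riemann sum over [-100, 100] should be close to 1
step = 0.01
total = sum(density_of_z(-100.0 + i * step) for i in range(20001)) * step
print(round(total, 4))
```

The density has two bumps, a large one around 6 and a small one around -42, which is why Z is not normal.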

But you cannot sample from this density directly. You would have to 1) compute its CDF (cumbersome, but not infeasible) and 2) invert it (you will need a numerical solver for this). Alternatively, you can do rejection sampling (or variants, e.g. importance sampling). This is costly and cumbersome to get right.
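
For illustration, the CDF-and-invert route could look roughly like the sketch below. It assumes normal components with the question's parameters and uses plain bisection as the numerical solver; the function names and the bracket [-200, 200] are arbitrary choices for this example, not something the question prescribes.

```python
import math
import random

def normal_cdf(x, mean, stdev):
    """CDF of a normal distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (stdev * math.sqrt(2.0))))

def cdf_of_z(x):
    """Mixture CDF: same weights as the density, applied to the component CDFs."""
    return 0.9 * normal_cdf(x, 6.0, 3.5) + 0.1 * normal_cdf(x, -42.0, 5.0)

def inverse_cdf_of_z(u, lo=-200.0, hi=200.0, iterations=80):
    """Solve cdf_of_z(x) = u by bisection; assumes the root lies in [lo, hi]."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cdf_of_z(mid) < u:
            lo = mid   # root is to the right of mid
        else:
            hi = mid   # root is at mid or to the left
    return 0.5 * (lo + hi)

def sample_z_by_inversion(rng=random):
    """Inverse-transform sampling: push a uniform draw through the inverse CDF."""
    return inverse_cdf_of_z(rng.random())
```

Each sample costs dozens of CDF evaluations here, which gives a feel for why this is the expensive option.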

So you should go for the "if" statement (i.e. call the generator 3 times), unless you have a very strong reason not to (using quasi-random sequences, for instance).
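
A runnable version of the "if" approach might look like this in Python. It is only a sketch: `sample_z` is a made-up name, and the seed and sample size are arbitrary.

```python
import random

def sample_z(rng=random):
    """Draw one value of Z: pick X with probability 0.9, otherwise Y."""
    if rng.random() < 0.9:
        return rng.gauss(6.0, 3.5)    # X ~ N(6, 3.5)
    return rng.gauss(-42.0, 5.0)      # Y ~ N(-42, 5)

rng = random.Random(12345)  # fixed seed so the run is repeatable
draws = [sample_z(rng) for _ in range(200_000)]
sample_mean = sum(draws) / len(draws)
print(sample_mean)  # should land near 0.9 * 6 + 0.1 * -42 = 1.2
```

Passing the generator in as `rng` keeps the function easy to seed or to swap for another source of randomness.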

一江春梦 2024-10-14 19:03:44

If a normal random variable is denoted x = (mean, stdev), then the following algebra applies (the sum rule assumes x1 and x2 are independent):

number * x = ( number*mean, number*stdev )

x1 + x2 = ( mean1+mean2, sqrt(stdev1^2+stdev2^2) )

so for the case of X = (mx,sx), Y= (my,sy) the linear combination is

Z = w1*X + w2*Y = (w1*mx,w1*sx) + (w2*my,w2*sy) = 
    ( w1*mx+w2*my, sqrt( (w1*sx)^2+(w2*sy)^2 ) ) =
    ( 1.2, 3.19 )

Link: Normal Distribution (look for the Miscellaneous section, item 1).

PS. Sorry for the weird notation. The new standard deviation is calculated by something similar to the Pythagorean theorem: it is the square root of the sum of squares.
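
A quick numeric check of the algebra above, as a minimal Python sketch. Note that it treats Z as the weighted sum w1*X + w2*Y of independent normals. The second half is there for comparison only: it computes the mean and stdev under the assumption that Z instead picks X with probability 0.9 and Y with probability 0.1, using the standard mixture-moment formula. Variable names mirror the notation above.

```python
import math

w1, mx, sx = 0.9, 6.0, 3.5     # weight, mean, stdev of X
w2, my, sy = 0.1, -42.0, 5.0   # weight, mean, stdev of Y

# Weighted sum w1*X + w2*Y of independent normals (the algebra above)
sum_mean = w1 * mx + w2 * my
sum_stdev = math.sqrt((w1 * sx) ** 2 + (w2 * sy) ** 2)

# Mixture: X with probability w1, Y with probability w2 (assumed reading)
mix_mean = w1 * mx + w2 * my
mix_var = w1 * (sx**2 + mx**2) + w2 * (sy**2 + my**2) - mix_mean**2
mix_stdev = math.sqrt(mix_var)

print(round(sum_mean, 2), round(sum_stdev, 2))  # 1.2 3.19
print(round(mix_mean, 2), round(mix_stdev, 2))  # 1.2 14.86
```

The means agree, but the spreads differ a lot, so it matters which of the two constructions Z is meant to be.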

鲜血染红嫁衣 2024-10-14 19:03:44

This is the form of the distribution:

ListPlot[BinCounts[Table[If[RandomReal[] < .9,
    RandomReal[NormalDistribution[6, 3.5]], 
    RandomReal[NormalDistribution[-42, 5]]], {1000000}], {-60, 20, .1}], 
    PlotRange -> Full, DataRange -> {-60, 20}]

[Plot: histogram of the sampled values]

It is NOT Normal, as you are not adding Normal variables, but just choosing one or the other with certain probability.

Edit

This is the curve for the sum of five variables with this distribution:

[Plot: distribution of the sum of five values]

The upper and lower peaks represent taking one of the distributions alone, and the middle peak accounts for the mixing.
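
The question about the sum of five values can also be worked out without simulation. The Python sketch below assumes the five draws are independent and conditions on k, the number of draws that came from Y. Given k, the sum is normal, so the answer is a binomially weighted sum of normal tail probabilities. The function names are made up for this example.

```python
import math

def normal_sf(x, mean, stdev):
    """P(N(mean, stdev) > x), via the complementary error function."""
    return 0.5 * math.erfc((x - mean) / (stdev * math.sqrt(2.0)))

def prob_sum_exceeds(threshold, n=5, p_x=0.9):
    """P(sum of n independent draws of Z > threshold).

    Conditions on k, the number of draws taken from Y; given k, the sum
    adds n-k independent X values and k independent Y values, so it is normal.
    """
    total = 0.0
    for k in range(n + 1):
        weight = math.comb(n, k) * p_x ** (n - k) * (1.0 - p_x) ** k
        mean = (n - k) * 6.0 + k * -42.0
        stdev = math.sqrt((n - k) * 3.5**2 + k * 5.0**2)
        total += weight * normal_sf(threshold, mean, stdev)
    return total

print(prob_sum_exceeds(10.0))
```

Under these assumptions the probability comes out near 0.59, almost all of it from the case where none of the five draws is a Y.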