Generating Poisson variables in C++

Posted 2024-11-01 08:27:27

I implemented this function to generate a Poisson random variable:

typedef long unsigned int luint;
luint poisson(luint lambda) {
    double L = exp(-double(lambda));
    luint k = 0;
    double p = 1;
    do {
        k++;
        p *= mrand.rand();
    } while( p > L);
    return (k-1);
}

where mrand is the MersenneTwister random number generator. I find that, as I increase lambda, the resulting distribution goes wrong, with a mean that saturates at around 750. Is this due to numerical approximation, or did I make a mistake?

没企图 2024-11-08 08:27:27

If you go the "existing library" route, your compiler may already support the C++11 <random> header. Here is how you use it:

#include <random>
#include <ctime>
#include <iostream>

std::mt19937 mrand(std::time(0));  // seed however you want

typedef long unsigned int luint;

luint poisson(luint lambda)
{
    std::poisson_distribution<luint> d(lambda);
    return d(mrand);
}

int main()
{
    std::cout << poisson(750) << '\n';
    std::poisson_distribution<luint> d(750);
    std::cout << d(mrand) << '\n';
    std::cout << d(mrand) << '\n';
}

I've used it two ways above:

1. I tried to imitate your existing interface.
2. If you create a std::poisson_distribution with a mean, it is more efficient to use that distribution over and over for the same mean (as done in main()).

Here is sample output for me:

751
730
779

甜是你 2024-11-08 08:27:27

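exp(-750) is a very small number, right at the limit of the smallest positive value a double can represent (in IEEE double precision it underflows to zero), so your problem is numerical. In any case, your complexity is linear in lambda, so the algorithm is not very efficient for large lambda. Unless you have a good reason to write this yourself, using an existing library implementation probably makes sense, because these numerical algorithms tend to be very sensitive to exactly the kind of precision problem you are running into.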
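
To make the underflow concrete, here is a small sketch, assuming IEEE 754 doubles; the helper names are hypothetical and not part of the question's code. Once exp(-lambda) evaluates to 0, the test p > L in the question only fails when p itself underflows to 0, which should take roughly 745 draws on average, consistent with the mean saturating near 750.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Returns true when exp(-lambda) is no longer a positive double, i.e. the
// threshold L in the question's loop has collapsed to zero.
// (Hypothetical helper, for illustration only.)
bool threshold_underflows(double lambda) {
    assert(lambda >= 0.0);
    return std::exp(-lambda) == 0.0;
}

// Smallest positive (subnormal) double, roughly 4.9e-324 on IEEE 754 systems.
double smallest_positive_double() {
    return std::numeric_limits<double>::denorm_min();
}
```

With these helpers, threshold_underflows(700.0) should be false while threshold_underflows(750.0) should be true.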

篱下浅笙歌 2024-11-08 08:27:27

Since you only use L in the expression (p>L), you're essentially testing for (log(p) > -lambda). On its own that's not a very helpful transformation: you no longer need exp(-750), but p itself will underflow instead.

Now, p is just Π(mrand.rand()), so log(p) is log(Π(mrand.rand())), which is Σ(log(mrand.rand())). That gives you the necessary transformation:

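luint k = 0;
double logp = 0;
do {
    k++;
    logp += log(mrand.rand());  // sum of logs replaces the product of uniforms
} while (logp > -double(lambda));  // convert first: negating an unsigned lambda would wrap around
return k - 1;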

double has only 11 bits of exponent but a 52-bit mantissa, so working with the sum of logs is a massive increase in numerical stability. The price paid is that you need a log on every iteration, instead of a single exp up front.
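
As a minimal self-contained sketch of the same idea, the version below assumes std::mt19937 from <random> in place of the question's mrand; the function name poisson_log is made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <random>

typedef long unsigned int luint;

// Poisson sampler with the running product replaced by a running sum of logs.
// `gen` stands in for the question's mrand (an assumption of this sketch).
luint poisson_log(double lambda, std::mt19937& gen) {
    assert(lambda >= 0.0);
    std::uniform_real_distribution<double> unif(0.0, 1.0);  // draws in [0, 1)
    luint k = 0;
    double logp = 0.0;
    do {
        ++k;
        // If a draw is exactly 0, log gives -infinity and the loop just ends.
        logp += std::log(unif(gen));
    } while (logp > -lambda);
    return k - 1;
}
```

The running time is still linear in lambda, so this addresses only the precision problem, not the efficiency one.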

缱绻入梦 2024-11-08 08:27:27

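From another question I asked earlier, it seems you can also approximate poisson(750) as poisson(375) + poisson(375). In fact, the sum of two independent Poisson variables is itself Poisson with the summed mean, so the split is exact in distribution.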
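
The splitting idea could be sketched as follows. This assumes std::mt19937 as the generator; the function names and the chunk size of 500 are arbitrary choices for this example, picked so that exp(-500) is still comfortably representable as a double.

```cpp
#include <cassert>
#include <cmath>
#include <random>

typedef long unsigned int luint;

// The question's multiplication method, restricted to means small enough
// that exp(-lambda) does not underflow.
luint poisson_small(double lambda, std::mt19937& gen) {
    assert(lambda >= 0.0 && lambda <= 500.0);
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    const double L = std::exp(-lambda);
    luint k = 0;
    double p = 1.0;
    do {
        ++k;
        p *= unif(gen);
    } while (p > L);
    return k - 1;
}

// Splits lambda into chunks of at most kChunk (a hypothetical cut-off for
// this sketch) and sums the independent draws.
luint poisson_split(double lambda, std::mt19937& gen) {
    assert(lambda >= 0.0);
    const double kChunk = 500.0;
    luint total = 0;
    while (lambda > kChunk) {
        total += poisson_small(kChunk, gen);
        lambda -= kChunk;
    }
    return total + poisson_small(lambda, gen);
}
```

The total work is still proportional to lambda; the split only keeps each threshold away from the underflow range.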

如果没结果 2024-11-08 08:27:27

In situations like these, you don't need to invoke the random number generator more than once. All you need is a table of cumulative probabilities:

c[k] = P(X <= k)  // the probability that X <= k (k = 0, 1, ...)

Then generate a random number 0 <= r < 1, and take the first integer X such that c[X] > r. You can find this X with a binary search.

To generate this table, we need the individual probabilities

p[k] = lambda^k / (k! e^lambda)  // the probability that X = k

If lambda is large, this becomes wildly inaccurate, as you have found. But we can use a trick here: start at (or near) the largest value, with k = floor(lambda), and pretend for the moment that p[k] is equal to 1. Then calculate p[i] for i > k using the recurrence relation

p[i+1] = (p[i]*lambda) / (i+1)

and for i < k using

p[i-1] = (p[i]*i)/lambda

This ensures that the largest probabilities have the greatest possible precision.

Now just calculate c[i] using c[i+1] = c[i] + p[i+1], up to the point where c[i+1] is the same as c[i]. Then you can normalise the array by dividing by this limiting value c[i]; or you can leave the array as it is, and use a random number 0 <= r < c[i].

See: http://en.wikipedia.org/wiki/Inverse_transform_sampling
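
The table construction and lookup described above could be sketched like this. It takes the unnormalised variant (the table is left as it is and r is drawn from [0, c.back())) and assumes std::mt19937 as the generator; the struct and function names are invented for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Unnormalised cumulative table: c[i] is proportional to P(X <= i).
struct PoissonTable {
    std::vector<double> c;
};

PoissonTable make_poisson_table(double lambda) {
    assert(lambda > 0.0);
    const std::size_t mode = static_cast<std::size_t>(std::floor(lambda));
    std::vector<double> p(mode + 1, 0.0);
    p[mode] = 1.0;  // pretend the largest probability is 1
    // Downward recurrence: p[i-1] = p[i] * i / lambda
    for (std::size_t i = mode; i > 0; --i)
        p[i - 1] = p[i] * static_cast<double>(i) / lambda;

    PoissonTable t;
    double running = 0.0;
    for (std::size_t i = 0; i <= mode; ++i) {
        running += p[i];
        t.c.push_back(running);
    }
    // Upward recurrence: p[i+1] = p[i] * lambda / (i + 1), continued until
    // adding another term no longer changes the cumulative sum.
    double term = 1.0;
    for (std::size_t i = mode; ; ++i) {
        term = term * lambda / static_cast<double>(i + 1);
        const double next = running + term;
        if (next == running) break;
        running = next;
        t.c.push_back(running);
    }
    return t;
}

// One uniform draw plus a binary search for the first X with c[X] > r.
std::size_t sample_poisson(const PoissonTable& t, std::mt19937& gen) {
    std::uniform_real_distribution<double> unif(0.0, t.c.back());
    const double r = unif(gen);
    auto it = std::upper_bound(t.c.begin(), t.c.end(), r);
    if (it == t.c.end()) --it;  // guard in case rounding yields r == c.back()
    return static_cast<std::size_t>(it - t.c.begin());
}
```

Building the table costs time and memory proportional to lambda, but it is done once per mean; each sample afterwards needs a single random number and a logarithmic-time search.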
