Generating random numbers according to a distribution

Posted on 2024-09-15 00:49:06

I want to generate random numbers according to some distributions. How can I do this?

香草可樂 2024-09-22 00:49:06

The standard random number generator you've got (rand() in C after a simple transformation, equivalents in many languages) is a fairly good approximation to a uniform distribution over the range [0,1]. If that's what you need, you're done. It's also trivial to convert that to a random number generated over a somewhat larger integer range.
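As a rough illustration of that last step, here is a minimal sketch in Python rather than C (my choice for brevity, not something from the answer). It assumes `random.random()` as the uniform source, which returns a float in [0.0, 1.0); the helper name `uniform_int` is made up for this example.

```python
import random

def uniform_int(lo, hi, rng=random):
    """Return an integer in [lo, hi] (inclusive) from one uniform draw."""
    u = rng.random()                     # uniform float in [0.0, 1.0)
    return lo + int(u * (hi - lo + 1))   # scale, then truncate to an integer

rolls = [uniform_int(1, 6) for _ in range(10)]
print(rolls)
```

In real code the standard library's `random.randint(lo, hi)` does the same job; the sketch only shows the scaling idea.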

Conversion of a Uniform distribution to a Normal distribution has already been covered on SO, as has going to the Exponential distribution.
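For reference, one common way to do the uniform-to-normal conversion is the Box-Muller transform; the linked answers may well use a different method. A minimal Python sketch (the function name `box_muller` is just for illustration):

```python
import math
import random

def box_muller(u1, u2):
    """Map two independent uniforms (u1 in (0, 1], u2 in [0, 1)) to two
    independent standard normal values."""
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)

# 1 - random.random() lies in (0, 1], which keeps log() away from zero.
z0, z1 = box_muller(1.0 - random.random(), random.random())
print(z0, z1)
```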

[EDIT]: For the triangular distribution, converting a uniform variable is relatively simple (in something C-like):

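#include <math.h>
#include <stdlib.h>

/* a = lower limit, b = upper limit, c = mode, with a <= c <= b and a < b */
double triangular(double a, double b, double c) {
   double U = rand() / (double) RAND_MAX;   /* uniform on [0,1] */
   double F = (c - a) / (b - a);            /* value of the CDF at the mode */
   if (U <= F)
      return a + sqrt(U * (b - a) * (c - a));
   else
      return b - sqrt((1 - U) * (b - a) * (b - c));
}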

That's just converting the formula given on the Wikipedia page. If you want others, that's the place to start looking; in general, you use the uniform variable to pick a point on the vertical axis of the cumulative distribution function (CDF) of the distribution you want (assuming it's continuous), and invert the CDF to get the random value with the desired distribution.
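To make the CDF-inversion recipe concrete, here is a small Python sketch for the exponential distribution, whose CDF F(x) = 1 - exp(-rate * x) inverts in closed form. The function name and the `rate` parameter are my own choices for the example.

```python
import math
import random

def exponential_from_uniform(u, rate):
    """Invert the exponential CDF F(x) = 1 - exp(-rate * x) at u in [0, 1)."""
    return -math.log(1.0 - u) / rate

samples = [exponential_from_uniform(random.random(), rate=2.0)
           for _ in range(100_000)]
print(sum(samples) / len(samples))   # should come out close to 1 / rate
```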

娇柔作态 2024-09-22 00:49:06

The right way to do this is to decompose the distribution into n-1 binary distributions. That is, if you have a distribution like this:

A: 0.05
B: 0.10
C: 0.10
D: 0.20
E: 0.55

You transform it into 4 binary distributions:

1. A/E: 0.20/0.80
2. B/E: 0.40/0.60
3. C/E: 0.40/0.60
4. D/E: 0.80/0.20

Select uniformly from the n-1 distributions, and then select the first or second symbol based on the probability of each in the binary distribution.

Code for this is here
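Since the link did not survive, here is a hedged Python sketch of one way to build such a table; it is not the linked code. It assumes the probabilities sum to 1 and that there are at least two symbols, and it repeatedly pairs the smallest remaining symbol with the largest one. All function names are made up for the example.

```python
import random

def build_binary_table(probs):
    """Split {symbol: probability} into n-1 two-symbol distributions.

    Each entry is (first, second, p_first); every entry is meant to be
    chosen with equal probability 1/(n-1).
    """
    remaining = dict(probs)
    share = 1.0 / (len(remaining) - 1)   # probability mass carried by each entry
    table = []
    for _ in range(len(probs) - 1):
        small = min(remaining, key=remaining.get)
        p_small = remaining.pop(small)
        large = max(remaining, key=remaining.get)
        table.append((small, large, p_small / share))
        remaining[large] -= share - p_small   # the large symbol fills the rest
    return table

def sample(table, rng=random):
    """Pick one entry uniformly, then one of its two symbols."""
    first, second, p_first = rng.choice(table)
    return first if rng.random() < p_first else second

def implied_probabilities(table):
    """Recompute each symbol's overall probability from the table."""
    out = {}
    for first, second, p_first in table:
        out[first] = out.get(first, 0.0) + p_first / len(table)
        out[second] = out.get(second, 0.0) + (1.0 - p_first) / len(table)
    return out

dist = {"A": 0.05, "B": 0.10, "C": 0.10, "D": 0.20, "E": 0.55}
table = build_binary_table(dist)
print(table)
print(implied_probabilities(table))
```

The pairs produced may differ from the ones listed above when several symbols tie, but the probabilities recomputed from the table should match the original distribution up to rounding.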

千纸鹤 2024-09-22 00:49:06

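It really depends on the distribution. The most general approach is the following. Let P(X) be the probability that a random number generated according to your distribution is less than X.

Start by generating a uniform random X between 0 and 1. Then find Y such that P(Y) = X and output Y. You can find such a Y using binary search (since P(X) is an increasing function of X).

This is not very efficient, but it works for any distribution where P(X) can be computed efficiently.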
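A minimal Python sketch of this binary-search approach, assuming the caller supplies an increasing CDF and a bracket [lo, hi] that contains the answer. The exponential CDF is used only as a stand-in, and the names are illustrative.

```python
import math
import random

def sample_by_bisection(cdf, lo, hi, u, tol=1e-9):
    """Find y in [lo, hi] with cdf(y) ~= u, assuming cdf is increasing
    and cdf(lo) <= u <= cdf(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < u:
            lo = mid      # target lies to the right of mid
        else:
            hi = mid      # target lies at or to the left of mid
    return 0.5 * (lo + hi)

def exp_cdf(x):
    return 1.0 - math.exp(-x)

y = sample_by_bisection(exp_cdf, 0.0, 50.0, random.random())
print(y)
```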

爱人如己 2024-09-22 00:49:06

You can look up inverse transform sampling, rejection sampling, and Devroye's book "Non-Uniform Random Variate Generation" (Springer-Verlag, 1986).
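For a feel of what rejection sampling looks like, here is a minimal Python sketch of its simplest form: a uniform proposal on a bounded interval, with a known upper bound on the density. The density 2x on [0, 1] and all names are assumptions picked for the example, not anything from the book.

```python
import random

def rejection_sample(pdf, lo, hi, pdf_max, rng=random):
    """Draw one value from a density on [lo, hi] that never exceeds pdf_max."""
    while True:
        x = lo + (hi - lo) * rng.random()    # candidate from a uniform proposal
        if rng.random() * pdf_max < pdf(x):
            return x                         # accepted with probability pdf(x) / pdf_max

def ramp(x):
    return 2.0 * x     # density 2x on [0, 1], maximum value 2

draws = [rejection_sample(ramp, 0.0, 1.0, 2.0) for _ in range(20_000)]
print(sum(draws) / len(draws))   # should come out close to 2/3
```

The tighter `pdf_max` is, the fewer candidates get thrown away; a loose bound still gives correct samples, just more slowly.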

傲性难收 2024-09-22 00:49:06

In the past few years, nice new tools have been added to SciPy to address this kind of problem in Python. You can easily generate samples from custom continuous or discrete univariate distributions by just providing some information about the distribution, such as the density / pdf.

Overview of the different methods:
https://docs.scipy.org/doc/scipy/reference/stats.sampling.html

Tutorial:
https://docs.scipy.org/doc/scipy/tutorial/stats/sampling.html

If you are using R, very similar functionality is available in Runuran (https://CRAN.R-project.org/package=Runuran).

C library UNURAN: https://statmath.wu.ac.at/unuran/doc.html

Here is an example in Python:

import numpy as np
from scipy.stats.sampling import NumericalInversePolynomial
from matplotlib import pyplot as plt
from scipy.integrate import quad


class MyDist:
    def __init__(self, a):
        self.a = a

    def support(self):
        # distribution restricted to 0, 5, can be changed as needed
        return (0, 5)

    def pdf(self, x):
        # this is not a proper pdf, the normalizing
        # constant is missing (does not integrate to one)
        return x * (x + np.sin(5*x) + 2) * np.exp(-x**self.a)


dist = MyDist(0.5)
gen = NumericalInversePolynomial(dist)

# compute the missing normalizing constant to plot the pdf
const_pdf = quad(dist.pdf, *dist.support())[0]

r = gen.rvs(size=50000)
x = np.linspace(r.min(), r.max(), 500)

# show histogram together with the pdf
plt.plot(x, dist.pdf(x) / const_pdf)
plt.hist(r, density=True, bins=100)
plt.show()