Generate random numbers according to distributions

Posted on 2024-09-15 00:49:06

I want to generate random numbers according to some distributions. How can I do this?

Comments (7)

香草可樂 2024-09-22 00:49:06

The standard random number generator you've got (rand() in C after a simple transformation, equivalents in many languages) is a fairly good approximation to a uniform distribution over the range [0,1]. If that's what you need, you're done. It's also trivial to convert that to a random number generated over a somewhat larger integer range.

Conversion of a Uniform distribution to a Normal distribution has already been covered on SO, as has going to the Exponential distribution.

[EDIT]: For the triangular distribution, converting a uniform variable is relatively simple (in something C-like):

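#include <math.h>    /* sqrt */
#include <stdlib.h>  /* rand, RAND_MAX */

/* Triangular distribution: a = lower limit, b = upper limit, c = mode (a <= c <= b, a < b). */
double triangular(double a, double b, double c) {
   double U = rand() / (double) RAND_MAX;   /* uniform in [0, 1] */
   double F = (c - a) / (b - a);            /* value of the CDF at the mode c */
   if (U <= F)
      return a + sqrt(U * (b - a) * (c - a));
   else
      return b - sqrt((1 - U) * (b - a) * (b - c));
}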

That's just converting the formula given on the Wikipedia page. If you want others, that's the place to start looking. In general, you use the uniform variable to pick a point on the vertical axis of the cumulative distribution function (CDF) of the distribution you want (assuming it's continuous), and invert the CDF to get a random value with the desired distribution.
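As another illustration of inverting a CDF, here is a minimal Python sketch for the exponential distribution, whose CDF F(x) = 1 - exp(-rate * x) can be inverted in closed form. The function name `sample_exponential` and the `rate` parameter are illustrative choices, not taken from the answer.

```python
import math
import random


def sample_exponential(rate, rng=random):
    """Draw one Exponential(rate) value by inverting its CDF."""
    u = rng.random()  # uniform in [0, 1)
    # CDF: F(x) = 1 - exp(-rate * x); solving F(x) = u for x gives the line below.
    return -math.log(1.0 - u) / rate


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [sample_exponential(2.0, rng) for _ in range(100_000)]
    # The sample mean should be close to 1 / rate = 0.5.
    print(sum(samples) / len(samples))
```

Using `1.0 - u` rather than `u` keeps the argument of the logarithm strictly positive, since `random()` can return 0.0 but never 1.0.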

娇柔作态 2024-09-22 00:49:06

For a discrete distribution over n symbols, the right way to do this is to decompose the distribution into n-1 binary distributions. That is, if you have a distribution like this:

A: 0.05
B: 0.10
C: 0.10
D: 0.20
E: 0.55

You transform it into 4 binary distributions:

1. A/E: 0.20/0.80
2. B/E: 0.40/0.60
3. C/E: 0.40/0.60
4. D/E: 0.80/0.20

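Select uniformly from the n-1 binary distributions, and then select the first or second symbol according to its probability in the chosen binary distribution. For example, A is produced with probability 1/4 × 0.20 = 0.05, and E with probability 1/4 × (0.80 + 0.60 + 0.60 + 0.20) = 0.55, matching the original distribution.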

Code for this is here
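Since the linked code is not reproduced here, the following is a minimal Python sketch of the sampling step only, using the four binary distributions from the example above. It does not show how to construct the decomposition, and the names `BINARY_TABLE` and `sample_symbol` are hypothetical.

```python
import random
from collections import Counter

# The four binary distributions from the example:
# (first symbol, second symbol, probability of the first symbol).
BINARY_TABLE = [
    ("A", "E", 0.20),
    ("B", "E", 0.40),
    ("C", "E", 0.40),
    ("D", "E", 0.80),
]


def sample_symbol(table, rng=random):
    """Pick one binary distribution uniformly, then one of its two symbols."""
    first, second, p_first = table[rng.randrange(len(table))]
    return first if rng.random() < p_first else second


if __name__ == "__main__":
    rng = random.Random(0)
    n = 200_000
    counts = Counter(sample_symbol(BINARY_TABLE, rng) for _ in range(n))
    # Empirical frequencies should be close to 0.05, 0.10, 0.10, 0.20, 0.55.
    for symbol in "ABCDE":
        print(symbol, round(counts[symbol] / n, 3))
```

Each draw costs one table lookup and one comparison, regardless of how many symbols the distribution has.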

千纸鹤 2024-09-22 00:49:06

It actually depends on the distribution. The most general way is the following. Let P(X) be the probability that a random number generated according to your distribution is less than X.

Start by generating a uniform random number U between zero and one. Then find Y such that P(Y) = U and output Y. You can find such a Y using binary search, since P is a non-decreasing function.

This is not very efficient, but it works for any distribution where P(X) can be computed efficiently.
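The binary search described above might look like the following Python sketch. It assumes the distribution lives on a known bounded interval [lo, hi]; the helper names `sample_by_bisection` and `example_cdf`, and the tolerance value, are illustrative assumptions.

```python
import random


def sample_by_bisection(cdf, lo, hi, rng=random, tol=1e-9):
    """Return y in [lo, hi] such that cdf(y) is approximately a uniform draw u."""
    u = rng.random()
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < u:
            lo = mid  # the target lies to the right of mid
        else:
            hi = mid  # the target lies at or to the left of mid
    return 0.5 * (lo + hi)


def example_cdf(x):
    """CDF of the density 2x on [0, 1] (an arbitrary example distribution)."""
    return x * x


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [sample_by_bisection(example_cdf, 0.0, 1.0, rng) for _ in range(20_000)]
    # The mean of the density 2x on [0, 1] is 2/3.
    print(sum(samples) / len(samples))
```

Each sample needs roughly log2((hi - lo) / tol) evaluations of the CDF, which is where the inefficiency mentioned above comes from.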

爱人如己 2024-09-22 00:49:06

You can look up inverse transform sampling and rejection sampling, as well as Luc Devroye's book "Non-Uniform Random Variate Generation" (Springer-Verlag, 1986).
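For orientation, here is a minimal Python sketch of rejection sampling with a uniform proposal. It assumes the target density is defined on a bounded interval and has a known upper bound; the function name `rejection_sample`, its parameters, and the example density sin(x) on [0, pi] are illustrative choices, not taken from the book.

```python
import math
import random


def rejection_sample(pdf, lo, hi, pdf_max, rng=random):
    """Draw one value from a (possibly unnormalized) density bounded by pdf_max on [lo, hi]."""
    while True:
        x = rng.uniform(lo, hi)        # candidate from the uniform proposal
        y = rng.uniform(0.0, pdf_max)  # random height inside the bounding box
        if y <= pdf(x):                # accept if the point lies under the curve
            return x


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [rejection_sample(math.sin, 0.0, math.pi, 1.0, rng) for _ in range(20_000)]
    # By symmetry, the mean of the density proportional to sin(x) on [0, pi] is pi / 2.
    print(sum(samples) / len(samples))
```

The density does not need to be normalized, but a loose bound `pdf_max` wastes candidates, so the tighter the bound the better.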

傲性难收 2024-09-22 00:49:06

In the past few years, nice new tools have been added to SciPy to address this kind of problem in Python. You can easily generate samples from custom continuous or discrete univariate distributions by just providing some information about the distribution, such as the density / pdf.

Overview of the different methods:
https://docs.scipy.org/doc/scipy/reference/stats.sampling.html

Tutorial:
https://docs.scipy.org/doc/scipy/tutorial/stats/sampling.html

If you are using R, very similar functionality is available in Runuran (https://CRAN.R-project.org/package=Runuran).

C library UNURAN: https://statmath.wu.ac.at/unuran/doc.html

Here is an example in Python:

import numpy as np
from scipy.stats.sampling import NumericalInversePolynomial
from matplotlib import pyplot as plt
from scipy.integrate import quad


class MyDist:
    def __init__(self, a):
        self.a = a

    def support(self):
        # distribution restricted to 0, 5, can be changed as needed
        return (0, 5)

    def pdf(self, x):
        # this is not a proper pdf, the normalizing
        # constant is missing (does not integrate to one)
        return x * (x + np.sin(5*x) + 2) * np.exp(-x**self.a)


dist = MyDist(0.5)
gen = NumericalInversePolynomial(dist)

# compute the missing normalizing constant to plot the pdf
const_pdf = quad(dist.pdf, *dist.support())[0]

r = gen.rvs(size=50000)
x = np.linspace(r.min(), r.max(), 500)

# show histogram together with the pdf
plt.plot(x, dist.pdf(x) / const_pdf)
plt.hist(r, density=True, bins=100)
plt.show()

[Figure: histogram of the samples together with the density]

放飞的风筝 2024-09-22 00:49:06

You can convert from discrete bins to a float/double value with interpolation. Simple linear interpolation works well. If your table memory is constrained, other interpolation methods can be used. -jlp
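One possible reading of this suggestion, sketched in Python with NumPy: tabulate the cumulative probabilities at the bin edges and linearly interpolate the inverse of that table. The function name `sample_from_bins` and the example bins are assumptions for illustration, and the sketch assumes every bin has positive probability.

```python
import numpy as np


def sample_from_bins(edges, probs, size, rng=None):
    """Draw continuous samples from binned probabilities via linear interpolation."""
    rng = np.random.default_rng() if rng is None else rng
    edges = np.asarray(edges, dtype=float)  # len(probs) + 1 increasing bin edges
    probs = np.asarray(probs, dtype=float)  # positive weight of each bin
    # Cumulative probability at each bin edge, starting at 0 and ending at 1.
    cdf = np.concatenate(([0.0], np.cumsum(probs / probs.sum())))
    u = rng.random(size)
    # Linearly interpolate the inverse CDF: map u back onto the edge positions.
    return np.interp(u, cdf, edges)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = sample_from_bins([0.0, 1.0, 2.0, 4.0], [0.2, 0.3, 0.5], 100_000, rng)
    # About 20% of the samples should fall in the first bin.
    print(np.mean(x < 1.0))
```

Linear interpolation of the tabulated CDF spreads each bin's probability uniformly over the bin, so the result is a piecewise-constant density.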

尸血腥色 2024-09-22 00:49:06

It's a standard textbook matter. See here for some code, or here at Section 3.2 for some reference mathematical background (actually very quick and simple to read).
