如何使用mysql用户定义函数生成高斯分布

发布于 2024-08-24 05:02:52 字数 270 浏览 9 评论 0原文

我喜欢用MySQL做定量分析和统计。 我想创建一个以下形式的 MySQL 用户定义函数: 返回单个随机样本的sample_gaussian(mean, stdev) 从具有平均值和标准的高斯分布中采样的值 用户输入参数的偏差。 MySQL 已经有一个 函数 rand() 返回一个随机数,所以我只需要 知道一些用于约束/转换该值的伪代码 从而使其落入正确的分布。 有什么建议吗?

顺便说一句-这是我的第一个 stackoverflow 问题,所以请原谅 如果这个问题对这个网站上的用户提出了太多要求,我会问。

I like to use MySQL to do quantitative analysis and statistics.
I would like to make a MySQL user-defined function of the form:
sample_gaussian(mean, stdev) that returns a single randomized
value sampled from a gaussian distribution having mean and standard
deviation of the user-entered arguments. MySQL already has a
function rand() that returns a random number, so I just need to
know some pseudocode for constraining/transforming that value
so that it falls into the right distribution.
Any suggestions?

BTW- This is my first stackoverflow question, so please forgive
me if this question is asking too much of users on this site.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

沉鱼一梦 2024-08-31 05:02:52

为了回答我自己的问题,这里有一个 MySQL 用户定义的函数,它返回从具有给定均值和标准差的高斯分布中采样的单个随机值。

DROP FUNCTION IF EXISTS gauss;
DELIMITER //
CREATE FUNCTION gauss(mean float, stdev float) RETURNS float
BEGIN
set @x=rand(), @y=rand();
set @gaus = ((sqrt(-2*log(@x))*cos(2*pi()*@y))*stdev)+mean;
return @gaus;
END
//
DELIMITER ;

要验证这实际上返回高斯分布,您可以生成一系列这些分布,然后绘制直方图:

create temporary table temp (id int, rando float);
insert into temp (rando) select gauss(2,1); # repeat this operation 500 times
insert into temp (rando) select gauss(2,1) from any_table_with_500+_entries limit 500;
select round(temp,1), count(*) from temp group by round(temp,1) # creates a histogram

如果您在 Excel 或所选图形工具中绘制该直方图,您将看到钟形正态曲线。

In answer to my own question, here is a MySQL user-defined function that returns a single random value sampled from a Gaussian distribution with a given mean and standard deviation.

DROP FUNCTION IF EXISTS gauss;
DELIMITER //
CREATE FUNCTION gauss(mean float, stdev float) RETURNS float
BEGIN
set @x=rand(), @y=rand();
set @gaus = ((sqrt(-2*log(@x))*cos(2*pi()*@y))*stdev)+mean;
return @gaus;
END
//
DELIMITER ;

To verify that this is in fact returning a Gaussian distribution, you can generate a series of these, then plot a histogram:

create temporary table temp (id int, rando float);
insert into temp (rando) select gauss(2,1); # repeat this operation 500 times
insert into temp (rando) select gauss(2,1) from any_table_with_500+_entries limit 500;
select round(temp,1), count(*) from temp group by round(temp,1) # creates a histogram

If you plot that histogram in excel or graphing tool of choice, you'll see the bell shaped normal curve.

单调的奢华 2024-08-31 05:02:52

rand() 返回 0 到 1 之间均匀分布的随机变量(您应该验证这一点,因为我不确定 - 这就是它在 Sybase 中的工作方式)。您可以使用 rand() 生成一个或多个均值为零、标准差(和方差)为一的正态分布随机变量 r,即 r ~ N(0,1),实现上述方法之一 此处

当您从 N(0,1) 生成随机变量时,您可以将其去标准化(求解 X在公式此处中)从 N(my_mean,my_std) 中获取随机变量,即乘以 my_std,然后加上 my_mean。

rand() returns a uniformly distributed random variable between 0 and 1 (you should verify this because i am not sure - this is how it works in Sybase). You can use rand() to generate one or more normally distributed random variables r with mean zero and standard deviation (and variance) one, i.e. r ~ N(0,1), implementing one of the methods mentioned here

When you have generated a random variable from N(0,1), you can de-standardize it (solve for X in the formula here) to get a random variable from N(my_mean,my_std), that is by multiplying it by my_std and then adding my_mean.

素衣风尘叹 2024-08-31 05:02:52

Box-Muller 变换是一种使用初等函数生成标准正态随机变量的方法。它一次生成两个,这有时很浪费,但我发现它非常优雅。

The Box-Muller transform is a way to generate standard normal random variates using elementary functions. It generates two at a time, which is sometimes wasteful, but I find it very elegant.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文