Generating random numbers with a maximum, minimum, and mean in Java
I need to generate random numbers with the following properties:
Min should be 200
Max should be 20000
Average (mean) is 500
Optional: 75th percentile should be 5000
It is definitely not a uniform distribution, nor a Gaussian one. I need to give it some left skewness.
java.util.Random on its own probably won't work, because it only gives you uniform and normal (Gaussian) variates.
What you're probably looking for is an F-distribution. You can probably use the distlib library and choose the F-distribution, then use its random method to get your random number.
Say X is your target variable. Let's normalize the range by setting Y = (X-200)/(20000-200). So now you want some random variable Y that takes values in [0,1] with mean (500-200)/(20000-200) = 1/66.
You have many options; the most natural one seems to me to be a Beta distribution, Y ~ Beta(a,b) with a/(a+b) = 1/66. You have an extra degree of freedom, which you can choose to fit the last quartile requirement.
After that, you simply return X as Y*(20000-200)+200.
To generate a Beta random variable, you can use Apache Commons.
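The recipe above can be sketched without any external library. This is only an illustration under stated assumptions: the class and method names are made up, the Beta variate is built from two Gamma variates drawn with the Marsaglia-Tsang method (a swapped-in technique, not something the answer prescribes), and `a = 1` is an arbitrary choice for the free parameter, with `b = 65 * a` so that a/(a+b) = 1/66.

```java
import java.util.Random;

// Sketch only: names are illustrative and do not come from any library.
class ScaledBetaSampler {
    static final double MIN = 200.0;
    static final double MAX = 20000.0;

    // Gamma(shape, 1) variate using the Marsaglia-Tsang rejection method.
    static double sampleGamma(Random rng, double shape) {
        if (shape < 1.0) {
            // Standard boost for small shapes: Gamma(s) = Gamma(s + 1) * U^(1/s)
            return sampleGamma(rng, shape + 1.0) * Math.pow(rng.nextDouble(), 1.0 / shape);
        }
        double d = shape - 1.0 / 3.0;
        double c = 1.0 / Math.sqrt(9.0 * d);
        while (true) {
            double x;
            double v;
            do {
                x = rng.nextGaussian();
                v = 1.0 + c * x;
            } while (v <= 0.0);
            v = v * v * v;
            double u = rng.nextDouble();
            if (Math.log(u) < 0.5 * x * x + d * (1.0 - v + Math.log(v))) {
                return d * v;
            }
        }
    }

    // Beta(a, b) variate as G_a / (G_a + G_b) with independent Gamma variates.
    static double sampleBeta(Random rng, double a, double b) {
        double ga = sampleGamma(rng, a);
        double gb = sampleGamma(rng, b);
        return ga / (ga + gb);
    }

    // X = Y * (MAX - MIN) + MIN with Y ~ Beta(a, 65a), so that E[Y] = 1/66 and E[X] = 500.
    static double sample(Random rng, double a) {
        double y = sampleBeta(rng, a, 65.0 * a);
        return y * (MAX - MIN) + MIN;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        int n = 200_000;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += sample(rng, 1.0); // a = 1 is an assumed value for the free parameter
        }
        System.out.println("empirical mean: " + sum / n);
    }
}
```

Changing `a` while keeping `b = 65 * a` preserves the mean and changes the shape of the distribution, which is the extra degree of freedom mentioned above.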
This may not be the answer you're looking for, but here is the specific case with 3 uniform distributions:
(Ignore the numbers on the left, but it is to scale!)
How I got the numbers
First, the area under the curve is equal between 200-500 and 500-20000. This means that the height relationship is 300 * leftHeight == 19500 * rightHeight, making leftHeight == 65 * rightHeight.
This gives us a 1/66 chance to choose right, and a 65/66 chance to choose left.
I then made the same calculation for the 75th percentile, except the ratio was 500-5000 chance == 5000-20000 chance * 10 / 3. Again, this means we have a 10/13 chance to be in the 50-75 percentile, and a 3/13 chance to be in 75-100.
Kudos to @Stas - I am using his 'inclusive random' function.
And yes, I realise my numbers are wrong, as this method works with discrete numbers and my calculations were continuous. It would be good if someone could correct my border cases.
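One way to read the three-uniform idea is as a piecewise-uniform sampler that treats 500 as the median and 5000 as the 75th percentile, so each segment gets a fixed share of the probability. The sketch below follows that reading; the segment probabilities (0.50, 0.25, 0.25) and all names are assumptions of this example, not the answer's lost code. Under this reading the mean works out to 0.5 * 350 + 0.25 * 2750 + 0.25 * 12500 = 3987.5 rather than 500. More generally, any distribution on [200, 20000] with a quarter of its mass at or above 5000 has a mean of at least 0.75 * 200 + 0.25 * 5000 = 1400, so the optional percentile target cannot be met together with a mean of 500.

```java
import java.util.Random;

// Sketch only: percentile-based reading of the three-uniform approach.
class ThreeSegmentSampler {
    // Segment boundaries taken from the question: min, "middle", 75th percentile, max.
    static final double[] EDGES = {200.0, 500.0, 5000.0, 20000.0};
    // Assumed cumulative probabilities at the upper edge of each segment.
    static final double[] CUM_PROB = {0.50, 0.75, 1.00};

    static double sample(Random rng) {
        // First pick a segment according to its probability mass...
        double u = rng.nextDouble();
        int seg = 0;
        while (seg < CUM_PROB.length - 1 && u >= CUM_PROB[seg]) {
            seg++;
        }
        // ...then draw uniformly inside that segment.
        double lo = EDGES[seg];
        double hi = EDGES[seg + 1];
        return lo + (hi - lo) * rng.nextDouble();
    }

    public static void main(String[] args) {
        Random rng = new Random();
        for (int i = 0; i < 5; i++) {
            System.out.println(sample(rng));
        }
    }
}
```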
You can have a function f working on [0;1] such as
I guess a function of the form
could be a solution; you just have to solve the related system.
Then you compute
f(uniform_random(0,1))
and there you are!
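As one concrete instance of this idea, assume the power form f(u) = min + (max - min) * u^k, which is a choice made for this sketch and not necessarily the form the answer had in mind. It satisfies f(0) = min and f(1) = max, and its mean over a uniform u is min + (max - min)/(k + 1). Setting that equal to 500 gives k = (max - mean)/(mean - min) = 19500/300 = 65.

```java
import java.util.Random;

// Sketch only: the power-law form of f is an assumption of this example.
class PowerTransformSampler {
    static final double MIN = 200.0;
    static final double MAX = 20000.0;
    static final double MEAN = 500.0;
    // Solving MIN + (MAX - MIN) / (k + 1) = MEAN for k gives k = 65 here.
    static final double K = (MAX - MEAN) / (MEAN - MIN);

    // Maps a uniform value u in [0, 1] to the target range.
    static double f(double u) {
        return MIN + (MAX - MIN) * Math.pow(u, K);
    }

    static double sample(Random rng) {
        return f(rng.nextDouble());
    }

    public static void main(String[] args) {
        Random rng = new Random();
        int n = 400_000;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += sample(rng);
        }
        System.out.println("k = " + K + ", empirical mean: " + sum / n);
    }
}
```

With a single free parameter this form can match the mean only; fitting an additional percentile would need a form with more parameters.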
Your question is vague as there are numerous random distributions with a given minimum, maximum, and mean.
Indeed, one solution among many is to choose max with probability (mean-min)/(max-min) and min otherwise. That is, this solution generates one of only two numbers: the minimum and the maximum.
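The two-point solution just described can be sketched in a few lines. The class and method names are illustrative only.

```java
import java.util.Random;

// Sketch only: returns max with probability (mean - min) / (max - min), else min.
class TwoPointSampler {
    static double sample(Random rng, double min, double max, double mean) {
        double pMax = (mean - min) / (max - min); // 300 / 19800 = 1/66 for the question's values
        return rng.nextDouble() < pMax ? max : min;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        int n = 400_000;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += sample(rng, 200.0, 20000.0, 500.0);
        }
        System.out.println("empirical mean: " + sum / n);
    }
}
```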
The PERT distribution (or beta-PERT distribution) is designed to take a minimum, a maximum, and an estimated mode. It's a "smoothed-out" version of the triangular distribution, and generating a random variate from that distribution can be implemented as follows:
where:
startpt is the minimum,
midpt is the mode (not necessarily the average or mean),
endpt is the maximum,
shape is a number 0 or greater, but usually 4, and
BetaDist(X, Y) returns a random variate from the beta distribution with parameters X and Y.
Given a known mean (mean), midpt can be calculated by:
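The sketch below uses the standard PERT parameterization, which may differ in detail from the snippet this answer originally showed: the variate is startpt + (endpt - startpt) * BetaDist(1 + shape * (midpt - startpt)/(endpt - startpt), 1 + shape * (endpt - midpt)/(endpt - startpt)), and since the PERT mean is (startpt + shape * midpt + endpt)/(shape + 2), the mode is midpt = (mean * (shape + 2) - startpt - endpt)/shape. All Java names are illustrative, and BetaDist is implemented here from two Gamma variates (Marsaglia-Tsang), which is an assumption of this example. Note that for the question's numbers the usual shape of 4 gives midpt = -4300, which lies below the minimum; the mode only falls inside [200, 20000] for shape >= 64, so the example assumes shape = 100, giving midpt = 308.

```java
import java.util.Random;

// Sketch only: standard PERT parameterization; names are illustrative.
class PertSampler {
    // Gamma(shape, 1) variate (Marsaglia-Tsang); valid for shape >= 1,
    // which always holds for the PERT parameters computed below.
    static double sampleGamma(Random rng, double shape) {
        double d = shape - 1.0 / 3.0;
        double c = 1.0 / Math.sqrt(9.0 * d);
        while (true) {
            double x;
            double v;
            do {
                x = rng.nextGaussian();
                v = 1.0 + c * x;
            } while (v <= 0.0);
            v = v * v * v;
            double u = rng.nextDouble();
            if (Math.log(u) < 0.5 * x * x + d * (1.0 - v + Math.log(v))) {
                return d * v;
            }
        }
    }

    // Plays the role of BetaDist(X, Y) in the description above.
    static double betaDist(Random rng, double a, double b) {
        double ga = sampleGamma(rng, a);
        double gb = sampleGamma(rng, b);
        return ga / (ga + gb);
    }

    // Mode that yields the requested mean for a given shape.
    static double midptFromMean(double mean, double startpt, double endpt, double shape) {
        return (mean * (shape + 2.0) - startpt - endpt) / shape;
    }

    static double sample(Random rng, double startpt, double midpt, double endpt, double shape) {
        if (midpt < startpt || midpt > endpt) {
            throw new IllegalArgumentException("mode must lie within [startpt, endpt]");
        }
        double range = endpt - startpt;
        double a = 1.0 + shape * (midpt - startpt) / range;
        double b = 1.0 + shape * (endpt - midpt) / range;
        return startpt + range * betaDist(rng, a, b);
    }

    public static void main(String[] args) {
        double shape = 100.0; // assumed; must be at least 64 for these inputs
        double midpt = midptFromMean(500.0, 200.0, 20000.0, shape);
        Random rng = new Random();
        int n = 200_000;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += sample(rng, 200.0, midpt, 20000.0, shape);
        }
        System.out.println("midpt = " + midpt + ", empirical mean: " + sum / n);
    }
}
```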