Randomly generate a list of numbers of length n that sum to x
I'm working on a project for fun and I need an algorithm to do the following: generate a list of numbers of length n which add up to x.
I would settle for a list of integers, but ideally I would like to be left with a set of floating-point numbers.
I would be very surprised if this problem weren't heavily studied, but I'm not sure what to look for.
I've tackled similar problems in the past, but this one is decidedly different in nature. Previously, I generated the different combinations of a list of numbers that add up to x. I'm sure that I could simply brute-force this problem, but that hardly seems like the ideal solution.
Does anyone have any idea what this may be called, or how to approach it? Thanks all!
Edit: To clarify, I mean that the list should be of length n, while the numbers themselves can be of any size.
Edit 2: Sorry for my improper use of 'set'; I was using it as a catch-all term for a list or an array. I understand that it was causing confusion, my apologies.
This is how to do it in Python:
Basically you pick n random numbers, compute their sum and compute a scale factor so that the sum will be what you want it to be.
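The answer's own snippet is not preserved in this copy. A minimal sketch of the idea described above, assuming uniform draws from `random.random()` and a hypothetical function name, might look like this:

```python
import random

def random_values_with_sum(n, total):
    """Return n random floats, rescaled so that they add up to total."""
    # Draw n raw values in [0, 1).
    raw = [random.random() for _ in range(n)]
    # Scale factor that maps the raw sum onto the requested total.
    # (A raw sum of exactly 0 is possible in principle but has negligible probability.)
    scale = total / sum(raw)
    return [v * scale for v in raw]

values = random_values_with_sum(5, 10.0)
print(values, sum(values))
```

The sum can differ from `total` by a tiny floating-point rounding error, so compare with a tolerance rather than with `==`.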
Note that this approach will not produce "uniform" slices, i.e. the distribution you get will tend to be more "egalitarian" than it would be if it were picked at random among all distributions with the given sum.
To see the reason you can just picture what the algorithm does in the case of two numbers with a prescribed sum (e.g. 1):
The point P is a generic point obtained by picking two random numbers, and it is uniform inside the square [0,1]x[0,1]. The point Q is the point obtained by scaling P so that the sum is 1. As is clear from the picture, the points close to the center have a higher probability; for example, the exact center of the square is reached by projecting any point on the diagonal (0,0)-(1,1), while the point (0, 1) is reached only by projecting points from (0,0)-(0,1)... the diagonal length is sqrt(2)=1.4142... while the square side is only 1.0.
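The bias can also be checked empirically. The sketch below (function names are illustrative, not from the answer) scales pairs of uniform random numbers to sum to 1 and counts how often the first coordinate of Q lands in the middle half of the segment. If Q were uniform on the segment, that fraction would be 1/2; for the scaling method it works out to 2/3.

```python
import random

def first_coordinate_after_scaling():
    """Pick P = (a, b) uniformly in the unit square and return the x-coordinate of Q."""
    while True:
        a, b = random.random(), random.random()
        if a + b > 0:  # guard against the (practically impossible) all-zero draw
            return a / (a + b)

def fraction_in_middle(trials=20000):
    """Fraction of scaled points whose first coordinate lies in (0.25, 0.75)."""
    hits = sum(0.25 < first_coordinate_after_scaling() < 0.75 for _ in range(trials))
    return hits / trials

print(fraction_in_middle())
```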
Actually, you need to generate a partition of x into n parts (strictly speaking a composition, since the order of the parts matters). This is usually done in the following way: a partition of x into n non-negative parts can be represented by reserving n + x - 1 free places, putting n - 1 borders in arbitrary places, and putting stones in the rest. The n groups of stones separated by the borders add up to x, so the number of possible partitions is the binomial coefficient (n + x - 1 \atop n - 1).
So your algorithm could be as follows: choose an arbitrary (n - 1)-subset of an (n + x - 1)-set; it uniquely determines a partition of x into n parts.
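A minimal sketch of this borders-and-stones construction, for a non-negative integer x. It places n - 1 borders among n + x - 1 positions, which yields exactly n groups; the function name is illustrative and `random.sample` stands in for whatever subset-selection routine you prefer.

```python
import random

def random_composition(n, x):
    """Split the non-negative integer x into n non-negative integer parts."""
    positions = n + x - 1
    # Choose which positions hold the n - 1 borders; the rest hold stones.
    borders = sorted(random.sample(range(positions), n - 1))
    parts = []
    previous = -1
    # A sentinel border just past the last position closes the final group.
    for b in borders + [positions]:
        parts.append(b - previous - 1)  # stones between two consecutive borders
        previous = b
    return parts

print(random_composition(4, 10))
```

Because every (n - 1)-subset is equally likely, every ordered split of x into n non-negative parts is equally likely as well.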
In Knuth's TAOCP, section 3.4.2 discusses random sampling. See Algorithm S there.
Algorithm S: (choose n arbitrary records from total of N)
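The steps of Algorithm S are not reproduced in this copy. As a rough sketch of selection sampling in that spirit (names are illustrative, and it assumes 0 <= n <= N), each record is visited once and kept with probability (still needed) / (still remaining):

```python
import random

def selection_sample(n, N):
    """Pick n of the indices 0..N-1 in increasing order, every subset equally likely."""
    chosen = []
    for t in range(N):
        needed = n - len(chosen)
        if needed == 0:
            break
        # Keep record t with probability needed / (N - t).
        if (N - t) * random.random() < needed:
            chosen.append(t)
    return chosen

print(selection_sample(3, 12))
```

When the number of records left equals the number still needed, the test always succeeds, so the result always has exactly n entries.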
The solution for non-integers is algorithmically trivial: you just select n arbitrary numbers whose sum is not 0, divide them by their sum, and multiply by x.
If you want to sample uniformly in the region of N-1-dimensional space defined by x1 + x2 + ... + xN = x, then you're looking at a special case of sampling from a Dirichlet distribution. The sampling procedure is a little more involved than generating uniform deviates for the xi. Here's one way to do it, in Python:
If you don't care too much about the sampling properties of your results, you can just generate uniform deviates and correct their sum afterwards.
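The answer's Python snippet is not preserved in this copy. One standard way to draw from the flat Dirichlet distribution, which is uniform over the non-negative solutions of x1 + ... + xN = x, is to normalize exponential deviates. The sketch below assumes that technique and uses an illustrative function name; it is not necessarily what the answer originally showed.

```python
import random

def uniform_simplex_sample(n, total):
    """Return n non-negative floats summing to total, uniform over that simplex."""
    # Exponential(1) deviates divided by their sum follow Dirichlet(1, ..., 1).
    raw = [random.expovariate(1.0) for _ in range(n)]
    s = sum(raw)
    return [total * v / s for v in raw]

sample = uniform_simplex_sample(4, 1.0)
print(sample, sum(sample))
```

The only change from the simple scaling approach is the distribution of the raw draws: exponential instead of uniform.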
Here is a version of the above algorithm in JavaScript:
You can call it with
And then check it with
This code does a reasonable job. I think it produces a different distribution than 6502's answer, but I am not sure which is better or more natural. Certainly his code is clearer/nicer.
In Python:
a: create a list of random numbers, each a random value between 0 and 1 multiplied by the total; append 0 and the total to the list
b: sort the list and measure the distance between each pair of neighbouring elements
c: round the list elements
yields:
To the best of my knowledge, this distribution is uniform.
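The code and its output are missing from this copy. Steps a to c can be sketched roughly as follows, with illustrative names. Where exactly the rounding happens is my assumption: rounding each gap separately can shift the total, so the optional `integer` flag rounds the cut points before the gaps are measured, which keeps the sum exact when `total` is an integer.

```python
import random

def random_list_by_cuts(n, total, integer=False):
    """Cut the interval [0, total] at n - 1 random points and return the n gap lengths."""
    # Step a: n - 1 random cut points, plus the two end points 0 and total.
    cuts = [random.random() * total for _ in range(n - 1)]
    if integer:
        # Assumed reading of step c: round the cut points, not the gaps.
        cuts = [round(c) for c in cuts]
    points = sorted([0] + cuts + [total])
    # Step b: distances between neighbouring points.
    return [b - a for a, b in zip(points, points[1:])]

print(random_list_by_cuts(5, 100.0))
print(random_list_by_cuts(5, 100, integer=True))
```

For the unrounded version, the gaps between sorted uniform points are uniformly distributed over all non-negative lists with the given sum, which matches the claim above.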