Randomly generate a list of n numbers that sum to x

Posted 2024-11-30 01:35:16, 400 characters, 3 views, 0 comments

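I am working on a project for fun, and I need an algorithm that does the following: generate a list of numbers of length n that add up to x.

I would settle for a list of integers, but ideally I would like to end up with a list of floating-point numbers.

I would be very surprised if this problem had not been studied in depth, but I am not sure what to search for.

I have tackled similar problems in the past, but this one is decidedly different in nature. Previously I generated the different combinations of a list of numbers that add up to x. I am sure I could simply brute-force this problem, but that hardly seems like the ideal solution.

Does anyone know what this might be called, or how to approach it? Thanks all!

Edit: To clarify, I mean that the list should have length N, while the numbers themselves can be of any size.

Edit 2: Sorry for my improper use of "set"; I was using it as a catch-all term for a list or an array. I understand that it caused confusion, my apologies.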

丑丑阿 2024-12-07 01:35:16

This is how to do it in Python:

import random

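def random_values_with_prescribed_sum(n, total):
    # Draw n independent uniform values in [0, 1).
    x = [random.random() for _ in range(n)]
    # Scale factor that makes the values sum to `total`.
    k = total / sum(x)
    return [v * k for v in x]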

Basically, you pick n random numbers, compute their sum, and derive a scale factor so that the scaled values add up to the total you want.

Note that this approach does not produce "uniform" slices: the distribution you get tends to be more "egalitarian" than it would be if the list were picked uniformly at random among all lists with the given sum.

To see why, picture what the algorithm does in the case of two numbers with a prescribed sum (e.g. 1):

[Figure omitted: a point P in the unit square and its scaled image Q on the line x + y = 1]

The point P is a generic point obtained by picking two random numbers, and it is uniformly distributed inside the square [0,1]x[0,1]. The point Q is obtained by scaling P so that its coordinates sum to 1. As the picture makes clear, points close to the center of the segment from (1, 0) to (0, 1) have a higher probability: for example, the exact center of the square, (0.5, 0.5), is reached by projecting any point on the diagonal (0,0)-(1,1), while the point (0, 1) is reached only by projecting points on the edge (0,0)-(0,1). The diagonal has length sqrt(2) = 1.4142..., while the side of the square is only 1.0.
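
The bias can also be checked numerically for the two-number case. The sketch below is not part of the original answer: the helper names, the fixed seed, and the interval (0.25, 0.75) are illustrative assumptions. It compares the scaling approach with a uniform split of 1 into two parts and counts how often the first part lands in the middle half of its range.

```python
import random

def scaled_pair(rng):
    """Two uniform draws rescaled to sum to 1 (the scaling approach)."""
    u, v = rng.random(), rng.random()
    s = u + v
    return u / s, v / s

def uniform_pair(rng):
    """A split of 1 into two parts chosen uniformly at random."""
    u = rng.random()
    return u, 1.0 - u

def middle_fraction(sampler, trials, rng):
    """Fraction of samples whose first part lies in (0.25, 0.75)."""
    hits = sum(1 for _ in range(trials) if 0.25 < sampler(rng)[0] < 0.75)
    return hits / trials

rng = random.Random(12345)  # fixed seed so the run is repeatable
scaled = middle_fraction(scaled_pair, 20000, rng)
uniform = middle_fraction(uniform_pair, 20000, rng)
print("scaled :", scaled)
print("uniform:", uniform)
```

For the uniform split the fraction should come out close to 0.5. For the scaling approach it should be close to 2/3, because the part of the unit square between the lines v = u/3 and v = 3u has area 2/3.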

骄兵必败 2024-12-07 01:35:16

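What you actually need is to generate an (ordered) partition of x into n parts. This is usually done as follows. A partition of x into n non-negative parts can be represented like this: reserve n + x - 1 free places, put n - 1 borders in arbitrary places, and put stones in the remaining places. The n groups of stones delimited by the borders add up to x, so the number of possible partitions is the binomial coefficient \binom{n + x - 1}{n - 1}.

So your algorithm could be as follows: choose an arbitrary (n - 1)-subset of an (n + x - 1)-set; it uniquely determines a partition of x into n parts.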
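
As a rough illustration of the stones-and-borders idea, here is a minimal sketch for non-negative integers. It uses the standard form of the construction, with n - 1 borders among n + x - 1 slots. The function name `random_integer_parts` and the use of `random.sample` to pick the border positions are assumptions made for this example, not something the answer prescribes.

```python
import random

def random_integer_parts(n, x, rng=random):
    """Split the non-negative integer x into n non-negative integer parts.

    Chooses n - 1 border positions among n + x - 1 slots; the runs of
    stones between consecutive borders are the parts.
    """
    if n < 1 or x < 0:
        raise ValueError("need n >= 1 and x >= 0")
    borders = sorted(rng.sample(range(n + x - 1), n - 1))
    parts = []
    previous = -1  # virtual border just before the first slot
    for b in borders + [n + x - 1]:  # virtual border just after the last slot
        parts.append(b - previous - 1)  # stones strictly between two borders
        previous = b
    return parts

print(random_integer_parts(4, 10))
```

Because every subset of border positions is equally likely, every ordered split of x into n non-negative integers is equally likely as well.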

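Section 3.4.2 of Knuth's TAOCP discusses random sampling. See Algorithm S there.

Algorithm S (select n records at random from a total of N):

1. Set t = 0, m = 0.
2. Generate u, a random number uniformly distributed on (0, 1).
3. If (N - t)*u >= n - m, skip record t and increase t by 1; otherwise include record t in the sample and increase both m and t by 1.
4. If m < n, return to step 2; otherwise the algorithm is finished.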
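
The steps of Algorithm S translate fairly directly into Python. The sketch below is one possible reading of them, assuming records are identified by the indices 0 to N - 1; the function name `algorithm_s` and the optional `rng` parameter are illustrative choices.

```python
import random

def algorithm_s(n, N, rng=random):
    """Select n of the record indices 0..N-1, following the steps above."""
    if not 0 <= n <= N:
        raise ValueError("need 0 <= n <= N")
    sample = []
    t = 0  # records examined so far
    m = 0  # records selected so far
    while m < n:
        u = rng.random()
        if (N - t) * u >= n - m:
            t += 1            # skip record t
        else:
            sample.append(t)  # include record t in the sample
            m += 1
            t += 1
    return sample

print(algorithm_s(3, 10))
```

The selected indices come out in increasing order, so they can be used directly as sorted border positions for the stones-and-borders construction.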

The solution for non-integers is algorithmically trivial: pick any n numbers whose sum is not 0, divide them by their sum, and multiply the results by x.

触ぅ动初心 2024-12-07 01:35:16

If you want to sample uniformly from the region of (N-1)-dimensional space defined by x1 + x2 + ... + xN = x (with every xi non-negative), then you're looking at a special case of sampling from a Dirichlet distribution. The sampling procedure is a little more involved than generating uniform deviates for the xi. Here's one way to do it, in Python:

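import random
xs = [random.gammavariate(1, 1) for _ in range(N)]  # N and x (the target sum) must already be defined
s = sum(xs)  # compute the sum once instead of once per element
xs = [x * v / s for v in xs]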

If you don't care too much about the sampling properties of your results, you can just generate uniform deviates and correct their sum afterwards.

通知家属抬走 2024-12-07 01:35:16

Here is a version of the above algorithm in JavaScript:

    function getRandomArbitrary(min, max) {
        return Math.random() * (max - min) + min;
    }
    function getRandomArray(min, max, n) {
        var arr = [];
        for (var i = 0; i < n; i++) {
            arr.push(getRandomArbitrary(min, max));
        }
        return arr;
    }
    function randomValuesPrescribedSum(min, max, n, total) {
        var arr = getRandomArray(min, max, n);
        var sum = arr.reduce(function(pv, cv) { return pv + cv; }, 0);
        var k = total / sum;
        return arr.map(function(x) { return k * x; });
    }

You can call it with

var myarray = randomValuesPrescribedSum(0,1,3,3);

And then check it with

var sum = myarray.reduce(function(pv, cv) { return pv + cv;},0);

同尘 2024-12-07 01:35:16

This code does a reasonable job. I think it produces a different distribution than 6502's answer, but I am not sure which is better or more natural. Certainly his code is clearer/nicer.

import random

def parts(total_sum, num_parts):
  points = [random.random() for i in range(num_parts-1)]
  points.append(0)
  points.append(1)
  points.sort()

  ret = []
  for i in range(1, len(points)):
    ret.append((points[i] - points[i-1]) * total_sum)
  return ret

def test(total_sum, num_parts):
  ans = parts(total_sum, num_parts)
  assert abs(sum(ans) - total_sum) < 1e-7
  print(ans)

test(5.5, 3)
test(10, 1)
test(10, 5)
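
One way to probe whether the distributions really differ is a small simulation. The sketch below is an illustration, not part of the original answer: the function names, the fixed seed, and the statistic (how often the first of three parts exceeds half of the total) are assumptions chosen for the example. It compares the cut-point method with normalized gamma(1, 1) draws and with rescaled uniform draws.

```python
import random

def cut_point_parts(total, n, rng):
    """Sorted uniform cut points on [0, 1]; gaps scaled by total."""
    points = sorted([0.0, 1.0] + [rng.random() for _ in range(n - 1)])
    return [(b - a) * total for a, b in zip(points, points[1:])]

def gamma_parts(total, n, rng):
    """Normalized gamma(1, 1) draws, i.e. a flat Dirichlet sample."""
    xs = [rng.gammavariate(1, 1) for _ in range(n)]
    s = sum(xs)
    return [total * v / s for v in xs]

def scaled_uniform_parts(total, n, rng):
    """Uniform draws rescaled to the target sum."""
    xs = [rng.random() for _ in range(n)]
    s = sum(xs)
    return [total * v / s for v in xs]

def share_above_half(sampler, trials, rng):
    """How often the first of 3 parts exceeds half of the total."""
    hits = sum(1 for _ in range(trials) if sampler(1.0, 3, rng)[0] > 0.5)
    return hits / trials

rng = random.Random(2024)  # fixed seed so the run is repeatable
for name, sampler in [("cut points", cut_point_parts),
                      ("gamma", gamma_parts),
                      ("scaled uniform", scaled_uniform_parts)]:
    print(name, share_above_half(sampler, 20000, rng))
```

For a uniform distribution over all splits into three parts, the first part exceeds half of the total with probability (1 - 0.5)^2 = 0.25, and both the cut-point and gamma methods should land near that value. Rescaled uniform draws should give roughly 1/6, which matches the "egalitarian" tendency of the scaling approach.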

久伴你 2024-12-07 01:35:16

In Python:

a: create a list of parts - 1 random numbers in [0, 1), each multiplied by total; add 0 and total to the list

b: sort the list and measure the distance between consecutive elements

c: round the list elements

import random
import time

TOTAL       = 15
PARTS       = 4
PLACES      = 3

def random_sum_split(parts, total, places):

    a = [0, total] + [random.random()*total for i in range(parts-1)]
    a.sort()
    b = [(a[i] - a[i-1]) for i in range(1, (parts+1))]
    if places is None:
        return b
    else:    
        b.pop()
        c = [round(x, places) for x in b]  
        c.append(round(total-sum(c), places))
        return c

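def tick():

    # Note: `info` and `log` are not defined in this snippet; presumably they come from the author's host environment.
    if info.tick == 1: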

        start = time.time()
        alpha = random_sum_split(PARTS, TOTAL, PLACES)
        end = time.time()  

        log('alpha: %s' % alpha)
        log('total: %.7f' % sum(alpha))
        log('parts: %s' % PARTS)
        log('places: %s' % PLACES)
        log('elapsed: %.7f' % (end-start))

yields:

[2014-06-13 01:00:00] alpha: [0.154, 3.617, 6.075, 5.154]
[2014-06-13 01:00:00] total: 15.0000000
[2014-06-13 01:00:00] parts: 4
[2014-06-13 01:00:00] places: 3
[2014-06-13 01:00:00] elapsed: 0.0005839

To the best of my knowledge, this distribution is uniform.
