Algorithm to generate 1000 distinct integers in the range [0,8000]?

Posted 2024-08-27 07:19:29

Possible Duplicate:
How do you efficiently generate a list of K non-repeating integers between 0 and an upper bound N

What are some alternative methods to generate 1000 distinct random integers in the range [0,8000] as opposed to the following:

  1. naive method: generating a number and checking if it's already in the array. O(n^2)
  2. linear shuffle: generate sequence 0 to 8000, shuffle, take the first 1000. O(n)
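
For reference, the two baseline methods listed above might look like the following minimal Python sketch. The function names are made up for illustration, and the rejection version swaps the array scan for a set (an assumption, not part of the question), so each membership check is expected O(1) instead of a linear search.

```python
import random

def sample_by_rejection(k, lo, hi):
    """Draw until k distinct values in [lo, hi] have been seen."""
    seen = set()
    result = []
    while len(result) < k:
        x = random.randint(lo, hi)  # both bounds are inclusive
        if x not in seen:           # set lookup instead of scanning the array
            seen.add(x)
            result.append(x)
    return result

def sample_by_shuffle(k, lo, hi):
    """Shuffle the whole range and keep the first k values."""
    pool = list(range(lo, hi + 1))
    random.shuffle(pool)
    return pool[:k]

picked = sample_by_rejection(1000, 0, 8000)
```

In Python specifically, the standard library's `random.sample(range(8001), 1000)` also returns 1000 distinct values from [0,8000].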

Comments (5)

眼睛会笑 2024-09-03 07:19:30

Sorted list with no sort, O(n)

If you want the integers sorted, I got to this answer in another question with a lot of help. You can do it using an exponential variate and thereby avoid any sort. As a result it is O(n):

From Alok's answer and Dan Dyer's comment it turns out that using an exponential distribution for a set of deltas gives a uniform distribution of integers in sequence.

So, you just generate the running sums and scale them at the end. Adding each value's index to it (equivalent to adding 1 to every delta) ensures you never repeat a value.

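import random

def genSortedInts(mini, maxi, vals):
    """Return vals distinct, sorted ints; maxi is effectively exclusive."""
    running = 0.0
    floats = []
    # vals+1 exponential gaps: their running sums, divided by the total,
    # behave like vals sorted uniform values in (0, 1)
    for _ in range(vals + 1):
        running += random.expovariate(1.0)
        floats.append(running)
    upper = floats.pop()  # total of all gaps, used only for normalisation
    valRange = maxi - mini - (vals - 1)
    # adding the index i forces consecutive values to differ by at least 1
    return [mini + int(f / upper * valRange) + i for i, f in enumerate(floats)]

if __name__ == "__main__":
    vals = 10
    maxi = 80
    mini = 0
    print(genSortedInts(mini, maxi, vals))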

Note the use of random.expovariate(1.0), a Python exponential distribution random number generator (very useful!). Here it's called with a mean of 1.0 (arg is 1/mean), but since the script normalises against the last number in the sequence, the mean itself doesn't matter.

Output (fair dice roll) for 10 values up to 80:

[3, 5, 10, 16, 25, 37, 41, 45, 57, 70]
奶茶白久 2024-09-03 07:19:29

You can use a partial Fisher-Yates shuffle implemented using swaps. One of the nice features of this algorithm is that if you stop after k swaps, the first k numbers are a random sample of size k from the complete set.
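
A minimal Python sketch of that idea, assuming the candidates are the integers 0 to n-1 (the function name is hypothetical):

```python
import random

def partial_fisher_yates(k, n):
    """Return k distinct integers from range(n), in random order."""
    pool = list(range(n))
    for i in range(k):
        # choose a position in the not-yet-fixed tail: i <= j < n
        j = random.randrange(i, n)
        pool[i], pool[j] = pool[j], pool[i]
    # after k swaps the first k slots hold the sample
    return pool[:k]

sample = partial_fisher_yates(1000, 8001)  # 8001 candidates cover [0, 8000]
```

This still builds the full list of n candidates, so memory is O(n), but only k swaps are performed.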

執念 2024-09-03 07:19:29

You could create a list containing the numbers 0 to 8000.

Then loop 1000 times, each time generating a random index from 0 up to (but not including) the length of the list.

Remove that element from the list and add it to an output list.

By removing the element you ensure that your selections are unique.

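var random = new Random();
var inputList = Enumerable.Range(0, 8001).ToList(); // 0..8000 inclusive
var outputList = new List<int>();
while (outputList.Count < 1000)
{
    int index = random.Next(0, inputList.Count); // upper bound is exclusive
    outputList.Add(inputList[index]);
    inputList.RemoveAt(index); // shifts the tail, so each removal is O(n)
}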
苍暮颜 2024-09-03 07:19:29

This is from Knuth's The Art of Computer Programming (via Jon Bentley's Programming Pearls), implemented in Python:

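import random

# randomly select m numbers from the n candidates 0 .. n-1
def random_select(m, n):
    select = m
    result = []
    for i in range(n):
        # n - i candidates remain; keep i with probability select / (n - i)
        if random.randrange(n - i) < select:
            result.append(i)
            select -= 1
    return result

random_select(1000, 8001)  # n = 8001 covers the inclusive range [0, 8000]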

This will generate a list of random numbers in numerical order. It works by iterating over all the integers from 0 to n-1 and selecting each one with probability (number left to select) / (number of remaining candidates). It runs in O(n), so do not try it if n is very large compared to m, e.g. selecting ten numbers out of a billion. It uses no memory other than the result list (m) and a few local variables, unlike solutions that rely on shuffling a list of length n.

If you want the result in a random order then shuffle the list afterwards.

往昔成烟 2024-09-03 07:19:29

Partial Fisher-Yates, as @Mark has suggested, with a little twist: instead of materialising the whole range, store only the swaps along the way.
This way, it consumes at most O(m) memory, about as much as the result list.
It also runs in O(m) expected time, not O(n) like the other solutions that enumerate the whole range, so it should not have problems with larger ranges.
This way, you can have the best of both worlds.

using System;
using System.Collections.Generic;

/// <summary>
/// Generates unique random numbers
/// </summary>
/// <remarks>
/// Worst case memory usage is O(min((emax-imin)/2, num)).
/// The caller must ensure num is not larger than emax - imin.
/// </remarks>
/// <param name="random">Random source</param>
/// <param name="imin">Inclusive lower bound</param>
/// <param name="emax">Exclusive upper bound</param>
/// <param name="num">Number of integers to generate</param>
/// <returns>Sequence of unique random numbers</returns>
public static IEnumerable<int> UniqueRandoms(
    Random random, int imin, int emax, int num)
{
    int dictsize = num;
    long half = (emax - (long)imin + 1) / 2;
    if (half < dictsize)
        dictsize = (int)half;
    // trans[p] holds the value currently at position p, only where it differs from p
    Dictionary<int, int> trans = new Dictionary<int, int>(dictsize);
    for (int i = 0; i < num; i++)
    {
        int current = imin + i;
        // pick a position in the remaining tail [current, emax)
        int r = random.Next(current, emax);
        int right;
        if (!trans.TryGetValue(r, out right))
        {
            right = r;
        }
        int left;
        if (trans.TryGetValue(current, out left))
        {
            // position current is never visited again, so drop its entry
            trans.Remove(current);
        }
        else
        {
            left = current;
        }
        if (r > current)
        {
            // record the swap: the value from current now lives at position r
            trans[r] = left;
        }
        yield return right;
    }
}
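
For comparison with the Python answers above, the same dictionary-backed partial shuffle could be sketched in Python roughly as follows. This is an illustrative port, not the answerer's code: the name `unique_randoms` and the `ValueError` check are assumptions added here.

```python
import random

def unique_randoms(imin, emax, num):
    """Yield num distinct integers from [imin, emax) in random order."""
    if num > emax - imin:
        # assumption: reject impossible requests instead of failing later
        raise ValueError("range too small for the requested count")
    swapped = {}  # position -> value stored there, only where it differs
    for i in range(num):
        current = imin + i
        r = random.randrange(current, emax)   # position in the remaining tail
        picked = swapped.get(r, r)            # value currently at position r
        left = swapped.pop(current, current)  # value currently at position current
        if r > current:
            swapped[r] = left                 # record the swap sparsely
        yield picked

values = list(unique_randoms(0, 8001, 1000))
```

Because this is a generator, the `ValueError` is raised when the first value is requested rather than when the function is called.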