Algorithm to generate 1000 distinct integers in the range [0,8000]?
Possible Duplicate:
How do you efficiently generate a list of K non-repeating integers between 0 and an upper bound N
What are some alternative methods to generate 1000 distinct random integers in the range [0,8000] as opposed to the following:
- naive method: generating a number and checking if it's already in the array. O(n^2)
- linear shuffle: generate sequence 0 to 8000, shuffle, take the first 1000. O(n)
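For reference, the two baseline methods named in the question could look roughly like this in Python. This is only a sketch: the function names and the (k, lo, hi) parameters are made up for illustration and do not come from the question.

```python
import random

def naive_rejection(k, lo, hi):
    """Draw values one at a time, rejecting any that were already drawn."""
    chosen = []
    while len(chosen) < k:
        x = random.randint(lo, hi)   # inclusive on both ends
        if x not in chosen:          # linear scan of the list, hence O(k^2) overall
            chosen.append(x)
    return chosen

def shuffle_and_take(k, lo, hi):
    """Shuffle the whole range and keep the first k values."""
    pool = list(range(lo, hi + 1))   # O(n) memory for the full range
    random.shuffle(pool)
    return pool[:k]

sample_a = naive_rejection(1000, 0, 8000)
sample_b = shuffle_and_take(1000, 0, 8000)
```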
Comments (5)
Sorted list with no sort, O(n)
If you want the integers sorted, I arrived at this answer in another question, with a lot of help. You can do it using an exponential variate and thereby avoid any sort. As a result, it is O(n):
From Alok's answer and Dan Dyer's comment, it turns out that using an exponential distribution for a set of deltas gives a uniform distribution of integers in sequence.
So, you just start generating numbers and then scale them at the end. Adding 1 to the delta ensures you never repeat a value.
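A minimal sketch of this idea, assuming Python's standard random module. The function name, the (k, lo, hi) parameters, and the floor-and-offset step that maps the scaled values to distinct integers are assumptions of this sketch, so the result should be treated as only approximately uniform.

```python
import random
from itertools import accumulate

def sorted_sample_via_gaps(k, lo, hi):
    """Return k distinct integers from [lo, hi] in increasing order, without sorting."""
    n = hi - lo + 1
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= size of range")
    # Running totals of k + 1 exponential gaps; the last total is the scale.
    sums = list(accumulate(random.expovariate(1.0) for _ in range(k + 1)))
    total = sums[-1]
    slots = n - k + 1  # room left once each value is pushed up by its index
    result = []
    for i in range(k):
        u = sums[i] / total                    # non-decreasing values in [0, 1]
        base = min(int(u * slots), slots - 1)  # clamp guards the u == 1.0 edge case
        result.append(lo + base + i)           # adding i plays the role of "delta + 1"
    return result

values = sorted_sample_via_gaps(1000, 0, 8000)
```

Because the base values never decrease and each one is shifted by its index, consecutive results differ by at least 1, so no value repeats and no sort is needed.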
Note the use of random.expovariate(1.0), a Python exponential distribution random number generator (very useful!). Here it's called with a mean of 1.0 (arg is 1/mean), but since the script normalises against the last number in the sequence, the mean itself doesn't matter.
Output (fair dice roll) for 10 values up to 80:
You can use a partial Fisher-Yates shuffle implemented using swaps. One of the nice features of this algorithm is that if you stop after k swaps, the first k numbers are a random sample of size k from the complete set.
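The partial Fisher-Yates shuffle mentioned above could be sketched like this. The function name and the (k, lo, hi) parameters are illustrative assumptions.

```python
import random

def partial_fisher_yates(k, lo, hi):
    """Run only the first k steps of a Fisher-Yates shuffle over [lo, hi]."""
    pool = list(range(lo, hi + 1))
    n = len(pool)
    for i in range(k):
        j = random.randrange(i, n)           # pick from the not-yet-fixed tail, i <= j < n
        pool[i], pool[j] = pool[j], pool[i]  # swap the pick into position i
    return pool[:k]                          # the first k slots form the sample

sample = partial_fisher_yates(1000, 0, 8000)
```

This still builds the full list of n candidates, so it needs O(n) memory, but it performs only k swaps.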
You could create a list containing the numbers 0 to 8000.
Then, looping 1000 times, generate a random index between 0 and the current length of the list minus one.
Remove that element from the list and add it to an output list.
By removing the element you ensure that your selections are unique.
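The steps above might look like this in Python. It is a sketch only, and the function name and parameters are assumptions. Note that removing from the middle of a Python list is itself a linear-time operation, so this version does more work than its short loop suggests.

```python
import random

def draw_by_removal(k, lo, hi):
    """Pick k distinct values by removing random elements from a candidate list."""
    candidates = list(range(lo, hi + 1))
    chosen = []
    for _ in range(k):
        idx = random.randrange(len(candidates))  # valid index into the shrinking list
        chosen.append(candidates.pop(idx))       # pop removes the element, so it cannot recur
    return chosen

picked = draw_by_removal(1000, 0, 8000)
```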
This is from Knuth's The Art of Computer Programming (via Jon Bentley's Programming Pearls), implemented in Python:
This will generate a list of random numbers in numerical order. It works by iterating over all the integers from 0 to n (i.e. 0 to 8000) and selecting each one with probability (number left to select) / (number of remaining candidates). It runs in O(n), so do not try it if n is very large compared to m, e.g. selecting ten numbers out of a billion. It uses no memory other than the result list (of size m) and a few local variables, unlike solutions that rely on shuffling a list of length n.
If you want the result in a random order then shuffle the list afterwards.
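The selection procedure described above can be sketched as follows. The function name is made up for illustration, and here n is taken to be the number of candidates, so n = 8001 covers the range 0 to 8000 inclusive.

```python
import random

def selection_sample(m, n):
    """Return m distinct integers from range(n), in increasing order."""
    if not 0 <= m <= n:
        raise ValueError("need 0 <= m <= n")
    result = []
    remaining = m  # how many values still have to be selected
    for i in range(n):
        if remaining == 0:
            break  # nothing left to select, stop early
        # Select i with probability remaining / (n - i).
        if random.randrange(n - i) < remaining:
            result.append(i)
            remaining -= 1
    return result

numbers = selection_sample(1000, 8001)
```

When the number of remaining candidates equals the number still needed, the test always succeeds, so the function always returns exactly m values.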
Partial Fisher-Yates, as @Mark has suggested, with a little twist: storing the swaps along the way.
This way, it will consume at most as much memory as the result list, O(m).
It will also run in O(m), not O(n) like the other solutions that enumerate the whole range, so it should not have problems on larger ranges.
This way, you can have the best of both worlds.
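One way to store the swaps is a dictionary that records only the positions whose value differs from their index. The sketch below assumes that representation; the function name and the (m, lo, hi) parameters are illustrative.

```python
import random

def sparse_partial_shuffle(m, lo, hi):
    """Partial Fisher-Yates over [lo, hi] that never materialises the full range."""
    n = hi - lo + 1
    if not 0 <= m <= n:
        raise ValueError("need 0 <= m <= size of range")
    swapped = {}  # position -> value, only for positions that were swapped
    result = []
    for i in range(m):
        j = random.randrange(i, n)            # i <= j < n
        value_at_j = swapped.get(j, j)        # untouched positions hold their own index
        swapped[j] = swapped.get(i, i)        # move the value from position i to position j
        result.append(lo + value_at_j)        # position i is never read again
    return result

sample = sparse_partial_shuffle(10, 0, 10**9)
```

The dictionary gains at most one entry per step, so memory stays proportional to m even for a range of a billion values, as in the usage line above.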