Random permutation
I would like to generate a random permutation as fast as possible. The problem: the Knuth shuffle, which is O(n), involves generating n random numbers. Since generating random numbers is quite expensive, I would like to find an O(n) function involving a fixed O(1) number of random numbers.

I realize that this question has been asked before, but I did not see any relevant answers. Just to stress a point: I am not looking for anything less than O(n), just an algorithm involving less generation of random numbers.

Thanks.
Create a 1-1 mapping of each permutation to a number from 1 to n! (n factorial). Generate a random number in 1 to n!, use the mapping, and get the permutation.

For the mapping, perhaps this will be useful: http://en.wikipedia.org/wiki/Permutation#Numbering_permutations

Of course, this gets out of hand quickly, as n! becomes very large very fast.
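This numbering can be made concrete with the factorial number system (Lehmer code). Below is a minimal Python sketch of decoding a number in [0, n!) into a permutation; the function name `nth_permutation` is my own, not from the linked article:

```python
import math
import random

def nth_permutation(elements, k):
    """Decode k in [0, n!) into the k-th permutation of elements,
    via the factorial number system (Lehmer code)."""
    pool = list(elements)
    result = []
    for i in range(len(pool), 0, -1):
        f = math.factorial(i - 1)   # permutations per choice of next element
        idx, k = divmod(k, f)       # which element comes next, and the rest
        result.append(pool.pop(idx))
    return result

# One random number in [0, n!) yields a whole permutation:
n = 5
k = random.randrange(math.factorial(n))
perm = nth_permutation(range(n), k)
```

For small n this needs only a single random draw; as noted above, once n! exceeds the machine word size the bigint arithmetic on k eats the savings.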
Generating a random number takes a long time, you say? The implementation of Java's Random.nextInt is roughly this:
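The code block from the original answer did not survive extraction. For reference, java.util.Random is a 48-bit linear congruential generator; a Python transliteration of its core step (constants as in the OpenJDK source) looks roughly like this:

```python
class JavaRandom:
    """Sketch of java.util.Random's core: a 48-bit LCG."""
    MULT = 0x5DEECE66D
    INC = 0xB
    MASK = (1 << 48) - 1

    def __init__(self, seed):
        self.seed = (seed ^ self.MULT) & self.MASK  # Java scrambles the seed

    def next(self, bits):
        # One multiply, one add, one mask, one shift per number.
        self.seed = (self.seed * self.MULT + self.INC) & self.MASK
        return self.seed >> (48 - bits)

    def next_int(self):
        v = self.next(32)
        return v - (1 << 32) if v >= (1 << 31) else v  # as a signed 32-bit int
```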
Is that too much work to do for each element?
See https://doi.org/10.1145/3009909 for a careful analysis of the number of random bits required to generate a random permutation. (It's open-access, but it's not easy reading! Bottom line: if carefully implemented, all of the usual methods for generating random permutations are efficient in their use of random bits.)
And... if your goal is to generate a random permutation rapidly for large N, I'd suggest you try the MergeShuffle algorithm. An article published in 2015 claimed a factor-of-two speedup over Fisher-Yates in both parallel and sequential implementations, and a significant speedup in sequential computations over the other standard algorithm they tested (Rao-Sandelius).
An implementation of MergeShuffle (and of the usual Fisher-Yates and Rao-Sandelius algorithms) is available at https://github.com/axel-bacher/mergeshuffle. But caveat emptor! The authors are theoreticians, not software engineers. They have published their experimental code to GitHub but aren't maintaining it. Someday, I imagine someone (perhaps you!) will add MergeShuffle to GSL. At present gsl_ran_shuffle() is an implementation of Fisher-Yates; see https://www.gnu.org/software/gsl/doc/html/randist.html?highlight=gsl_ran_shuffle.

[Figure: mean number of random bits used by implementations of various random permutation algorithms over 100 trials.]
Not exactly what you asked, but if the provided random number generator doesn't satisfy you, maybe you should try something different. Generally, pseudorandom number generation can be very simple.
Probably the best-known algorithm:
http://en.wikipedia.org/wiki/Linear_congruential_generator
More
http://en.wikipedia.org/wiki/List_of_pseudorandom_number_generators
As other answers suggest, you can make a random integer in the range 0 to N! and use it to produce a shuffle. Although theoretically correct, this won't be faster in general since N! grows fast and you'll spend all your time doing bigint arithmetic.
If you want speed and you don't mind trading off some randomness, you will be much better off using a less good random number generator. A linear congruential generator (see http://en.wikipedia.org/wiki/Linear_congruential_generator) will give you a random number in a few cycles.
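As a concrete illustration of that trade-off, here is a minimal sketch (using the Numerical Recipes LCG constants; the names `LCG` and `fast_shuffle` are mine) of driving a Fisher-Yates pass with a cheap generator:

```python
class LCG:
    """Minimal 32-bit linear congruential generator
    (Numerical Recipes constants); one multiply + add per number."""
    def __init__(self, seed=1):
        self.state = seed & 0xFFFFFFFF

    def next(self):
        self.state = (1664525 * self.state + 1013904223) & 0xFFFFFFFF
        return self.state

def fast_shuffle(a, rng):
    # Fisher-Yates, but each index costs only a few cycles to draw.
    for i in range(len(a) - 1, 0, -1):
        j = rng.next() % (i + 1)  # note: modulo introduces a slight bias
        a[i], a[j] = a[j], a[i]
    return a
```

This still draws n-1 numbers, but each draw is nearly free; the price is lower statistical quality and the small modulo bias noted in the comment.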
Usually there is no need for the full range of the next random value, so to use exactly the same amount of randomness you can use the following approach (which is almost like random(0, N!), I guess):

P.S. Of course there will be some errors related to division by values different from 2^n, but they will be distributed among the resulting samples.
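The code for this answer appears to be missing; one way to realize "almost like random(0, N!)" is to draw a single integer in [0, n!) and peel the swap indices off with divmod (a sketch, not the author's original code):

```python
import math
import random

def shuffle_from_one_number(items):
    """Fisher-Yates in which all swap indices are extracted from a
    single random integer in [0, n!) -- one call to the generator."""
    a = list(items)
    n = len(a)
    r = random.randrange(math.factorial(n))  # the only random draw
    for i in range(n - 1, 0, -1):
        r, j = divmod(r, i + 1)              # j is uniform on [0, i]
        a[i], a[j] = a[j], a[i]
    return a
```

Because the divisors 2, 3, ..., n multiply out to exactly n!, this variant is free of the rounding errors the P.S. mentions; those appear only when the single number has a fixed width such as 2^64.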
Generate N numbers (N < the number of random numbers you need) before doing the computation, or store them in an array as data, using your slow but good random generator; then pick a number by simply incrementing an index into the array inside your computing loop; if you need different seeds, create multiple tables.
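A sketch of that table idea (the class and parameter names are mine):

```python
import random

class RandomTable:
    """Precompute a pool of numbers with the slow-but-good generator,
    then serve them by just incrementing an index."""
    def __init__(self, size, seed=None):
        rng = random.Random(seed)  # stand-in for the slow, good generator
        self.table = [rng.random() for _ in range(size)]
        self.i = 0

    def next(self):
        v = self.table[self.i]
        self.i = (self.i + 1) % len(self.table)  # wrap when exhausted
        return v
```

Reusing the table once the index wraps around trades randomness for speed; with different seeds you keep several tables and rotate between them.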
Are you sure that your mathematical and algorithmic approach to the problem is correct?
I hit exactly the same problem, where the Fisher–Yates shuffle becomes the bottleneck in corner cases. But for me the real problem is a brute-force algorithm that doesn't scale well to all problems. The following story explains the problem and the optimizations I have come up with so far.
Dealing cards for 4 players
The number of possible deals is a 96-bit number. That puts quite a lot of stress on the random number generator to avoid statistical anomalies when selecting a play plan from the generated sample set of deals. I chose to use 2x mt19937_64 seeded from /dev/random because of the long period and the heavy advertisement on the web that it is good for scientific simulations.

The simple approach is to use the Fisher–Yates shuffle to generate deals and filter out deals that don't match the already collected information. The Knuth shuffle takes ~1400 CPU cycles per deal, mostly because I have to generate 51 random numbers and swap entries in the table 51 times.

That doesn't matter for normal cases, where I would only need to generate 10000-100000 deals in 7 minutes. But there are extreme cases where filters may select only a very small subset of hands, requiring a huge number of deals to be generated.
Using a single number for multiple cards
When profiling with callgrind (valgrind), I noticed that the main slowdown was the C++ random number generator (after switching away from std::uniform_int_distribution, which was the first bottleneck).

Then I came up with the idea that I can use a single random number for multiple cards. The idea is to use the least significant information from the number first and then erase that information.

Of course, that is only a minor optimization, because generation is still O(N).
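That idea, in a sketch (the helper name is mine, not from the original code): repeatedly take the number modulo the current range, then divide that "digit" away:

```python
import random

def indices_from_one_number(r, sizes):
    """Extract one index per entry of sizes from a single random
    number r, least significant information first."""
    out = []
    for s in sizes:
        r, j = divmod(r, s)  # j in [0, s); the used digit is erased from r
        out.append(j)
    return out

# One 64-bit draw covers roughly the first ten swaps of a 52-card
# Fisher-Yates pass (52 * 51 * ... * 43 < 2**64):
r = random.getrandbits(64)
swaps = indices_from_one_number(r, range(52, 42, -1))
```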
Generation using bit permutations
The next idea was exactly the solution asked about here, but I still ended up with O(N), and with a larger cost than the original shuffle. But let's look into the solution and why it fails so miserably.

I decided to use the idea from Dealing All the Deals by John Christman.

So far so good, and it looks pretty nice, but the select_deal implementation is a PITA.

Now that I had the O(N) permutation solution done to prove the algorithm could work, I started searching for an O(1) mapping from a random number to a bit permutation. Too bad it looks like the only solution would be huge lookup tables, which would kill the CPU caches. That doesn't sound like a good idea for an AI that will be using a very large amount of cache for the double dummy analyzer.
Mathematical solution
After all the hard work figuring out how to generate random bit permutations, I decided to go back to the maths. It is entirely possible to apply the filters before dealing cards. That requires splitting the deals into a manageable number of layered sets and selecting between the sets based on their relative probabilities after filtering out the impossible sets.

I don't yet have the code ready to test how many cycles I'm wasting in the common case, where the filter selects the major part of the deals. But I believe this approach gives the most stable generation performance, keeping the cost below 0.1%.
Generate a 32-bit integer. For each index i (maybe only up to half the number of elements in the array), if bit i % 32 is 1, swap i with n - i - 1.

Of course, this might not be random enough for your purposes. You could probably improve this by not swapping with n - i - 1, but rather with another function applied to n and i that gives a better distribution. You could even use two functions: one for when the bit is 0 and another for when it's 1.
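A sketch of this scheme (the function name is mine); passing r explicitly makes the single 32-bit draw visible:

```python
import random

def bit_swap_shuffle(a, r=None):
    """Permute a in place from one 32-bit integer: if bit (i % 32)
    is set, swap element i with its mirror element n - i - 1."""
    if r is None:
        r = random.getrandbits(32)  # the only random draw
    n = len(a)
    for i in range(n // 2):
        if (r >> (i % 32)) & 1:
            a[i], a[n - 1 - i] = a[n - 1 - i], a[i]
    return a
```

As the answer warns, this reaches only a tiny fraction of the n! permutations (at most 2^32 distinct outcomes), so it is far from uniformly random.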