Why is Fisher–Yates the most useful shuffling algorithm?
Would you say the modern version of Fisher–Yates is the most unbiased shuffling algorithm?
How would you explain that each element in the array has a probability of 1/n of ending up in its original spot?
Given a perfect pseudo-random number generator (the Mersenne Twister is very close), the Fisher-Yates algorithm is perfectly unbiased in that every permutation has an equal probability of occurring. This is easy to prove using induction. The Fisher-Yates algorithm can be written recursively as follows (in Python syntax pseudocode):
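The referenced pseudocode did not survive extraction. A minimal runnable sketch consistent with the description (the name `firstElementIndex` comes from the surrounding text; the rest is an assumed reconstruction, not the answerer's exact code):

```python
import random

def fisherYatesShuffle(array):
    """Recursive Fisher-Yates: pick a uniformly random element for the
    first slot, then shuffle the remainder the same way."""
    if len(array) < 2:
        return array
    # Every index 0..len(array)-1 is equally likely to be chosen here.
    firstElementIndex = random.randrange(len(array))
    array[0], array[firstElementIndex] = array[firstElementIndex], array[0]
    # Shuffle the rest and write it back in place.
    array[1:] = fisherYatesShuffle(array[1:])
    return array
```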
Each index has an equal probability of being selected as `firstElementIndex`. When you recurse, you then have an equal probability of choosing any of the elements that are still left.

Edit: The algorithm has been mathematically proven to be unbiased. Since the algorithm is non-deterministic, the best way to test whether an implementation works properly is statistical. I would take an array of some arbitrary but small size, shuffle it many times (starting from the same permutation as input each time), and count the number of times each output permutation occurs. Then I'd use Pearson's chi-square test to check this distribution for uniformity.
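That testing procedure can be sketched with the standard library alone. The `chi_square_uniformity` helper below is illustrative (not from the answer); it returns Pearson's chi-square statistic, which you would then compare against a critical value for (n! − 1) degrees of freedom:

```python
import itertools
import random
from collections import Counter

def fisher_yates(arr):
    # Iterative modern Fisher-Yates shuffle on a copy of the input.
    a = list(arr)
    for i in range(len(a) - 1, 0, -1):
        j = random.randint(0, i)
        a[i], a[j] = a[j], a[i]
    return tuple(a)

def chi_square_uniformity(n_items=3, trials=60000):
    # Shuffle the same input many times and count each output permutation.
    counts = Counter(fisher_yates(range(n_items)) for _ in range(trials))
    perms = list(itertools.permutations(range(n_items)))
    expected = trials / len(perms)  # uniform expectation for each permutation
    # Pearson's chi-square statistic against the uniform distribution.
    return sum((counts[p] - expected) ** 2 / expected for p in perms)
```

For 3 items there are 6 permutations, so 5 degrees of freedom; a statistic well below the 0.05 critical value (about 11.07) is consistent with uniformity.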
the (Modern, aka "Knuth") Fisher–Yates shuffle is efficient (O(n)) and unbiased.
What else could we want out of an algorithm (well, yeah, when the number of permutations grows huge, one may try something else, but most cases do not involve such huge counts) ?
Edit:
Just noticed that this answer responds to the title of the question, not its content. (Which is why it would be good for these two parts of the question to match better...)
In a nutshell, the shuffle will be as random as the particular RNG used to implement the algorithm.
An intuitive explanation is that, for an array with m elements, even though n (the decreasing control variable of the loop) goes down toward 1 and the set of cells that the cell at position n may be swapped with shrinks, the probability that this very cell has already been moved increases in exactly the same proportion. In other words, the last element of the array could end up anywhere in the array, but it has only one chance to be moved (on the very first iteration). The second-to-last element has one fewer place to go, but there is a probability of 1/m that it was already moved during the very first iteration, and so on.
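The 1/n claim from the original question can be checked empirically. This sketch (the `position_frequencies` helper is illustrative, not from the answer) counts how often each element lands at each position under an iterative modern Fisher–Yates shuffle; every (element, position) pair should occur with frequency close to 1/n:

```python
import random
from collections import Counter

def fisher_yates(arr):
    # Iterative modern Fisher-Yates shuffle on a copy of the input.
    a = list(arr)
    for i in range(len(a) - 1, 0, -1):
        j = random.randint(0, i)
        a[i], a[j] = a[j], a[i]
    return a

def position_frequencies(n=4, trials=40000):
    # counts[(element, position)] = how often `element` ended at `position`.
    counts = Counter()
    for _ in range(trials):
        for pos, elem in enumerate(fisher_yates(range(n))):
            counts[(elem, pos)] += 1
    return {k: v / trials for k, v in counts.items()}
```

With n = 4, each of the 16 (element, position) frequencies should be close to 0.25, including the diagonal pairs where an element stays in its original spot.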