费希尔耶茨变体
经典的 Fisher Yates 看起来像这样:
void shuffle1(std::vector<int>& vec)
{
int n = vec.size();
for (int i = n - 1; i > 0; --i)
{
std::swap(vec[i], vec[rand() % (i + 1)]);
}
}
昨天,我错误地“向后”实现了迭代:
void shuffle2(std::vector<int>& vec)
{
int n = vec.size();
for (int i = 1; i < n; ++i)
{
std::swap(vec[i], vec[rand() % (i + 1)]);
}
}
这个版本是否比第一个版本更差(或更好)?它会扭曲结果概率吗?
The classic Fisher Yates looks something like this:
void shuffle1(std::vector<int>& vec)
{
int n = vec.size();
for (int i = n - 1; i > 0; --i)
{
std::swap(vec[i], vec[rand() % (i + 1)]);
}
}
Yesterday, I implemented the iteration "backwards" by mistake:
void shuffle2(std::vector<int>& vec)
{
int n = vec.size();
for (int i = 1; i < n; ++i)
{
std::swap(vec[i], vec[rand() % (i + 1)]);
}
}
Is this version in any way worse (or better) than the first? Does it skew the resulting probabilities?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
是的,假设
rand()
是均匀分布。我们将通过证明每个输入可以以相等的概率生成每个排列来证明这一点。N=2很容易证明。
我们将其绘制为一棵树,其中子级代表通过将逗号后的字符插入最左边的字符串可以获得的每个字符串。
对于 N,我们将拥有 N-1 的所有排列,并随机交换 N 的最后一个字符。
这个糟糕的归纳应该会导致它具有均匀分布。
示例:
N=2:
N=3:
N=4:(弯曲以帮助阅读)
等
Yes it's even distribution assuming
rand()
is. We will prove this by showing that each input can generate each permutation with equal probability.N=2 can be easily proven.
We will draw it as a tree where the the children represent each string you can get by inserting the character after comma into the leftmost string.
For N, we will have every permutations for N-1, and randomly swapping the last character for N
This shitty induction should lead you to it having even distribution.
Example:
N=2:
N=3:
N=4: (curved to aid reading)
etc
对我来说看起来不错(假设 rand() % N 是无偏的,但事实并非如此)。似乎应该可以证明输入的每个排列都是由 1 个随机选择序列产生的,其中每个随机选择都是平衡的。
将此与错误的实现进行比较,例如
Here 你可以看到有 nn 同等可能的方式来产生 n!排列,并且由于 nn 不能被 n 整除!其中n> 2、其中一些排列必须比其他排列更频繁地产生。
Looks OK to me (assuming rand() % N is unbiased, which it isn't). It seems it should be possible to demonstrate that every permutation of input is produced by exactly 1 sequence of random choices, where each random choice is balanced.
Compare this with a faulty implementation, such as
Here you can see that there are nn equally likely ways to produce n! permutations, and since nn is not evenly divisible by n! where n > 2, some of those permutations have to be produced more often than others.