Verifying that a Knuth shuffle is as unbiased as possible

Posted on 2024-08-10 10:11:26

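I'm implementing a Knuth shuffle for a C++ project I'm working on. I'm trying to get the most unbiased results from my shuffle (and I'm not an expert on (pseudo)random number generation). I just want to make sure this is the most unbiased shuffle implementation.

draw_t is a byte type (typedef'd to unsigned char). items is the count of items in the list. I've included the code for random::get( draw_t max ) below.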

for( draw_t pull_index = (items - 1); pull_index > 1; pull_index-- )
{
    draw_t push_index = random::get( pull_index );

    draw_t push_item = this->_list[push_index];
    draw_t pull_item = this->_list[pull_index];

    this->_list[push_index] = pull_item;
    this->_list[pull_index] = push_item;
}

The random function I'm using has been modified to eliminate modulo bias. RAND_MAX is assigned to random::_internal_max.

draw_t random::get( draw_t max )
{
    if( random::_is_seeded == false )
    {
        random::seed( );
    }

    int rand_value = random::_internal_max;
    int max_rand_value = random::_internal_max - ( max - ( random::_internal_max % max ) );

    do
    {
        rand_value = ::rand( );
    } while( rand_value >= max_rand_value );

    return static_cast< draw_t >( rand_value % max );
}

仅一夜美梦 2024-08-17 10:11:26

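Well, one thing you could do as a black-box test is take some relatively small array size, perform a large number of shuffles on it, count how many times you observe each permutation, and then perform Pearson's chi-square test to determine whether the results are uniformly distributed over the permutation space.

On the other hand, the Knuth shuffle (also known as the Fisher-Yates shuffle) is proven to be unbiased as long as the random number generator that the indices come from is unbiased.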
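
The black-box test described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the names shuffle_under_test and chi_square_for_shuffle are hypothetical, the array size is fixed at 4 (so there are 4! = 24 permutations), and std::mt19937 stands in for whatever generator is actually being tested. To test a real implementation, swap its shuffle into shuffle_under_test.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <map>
#include <random>
#include <utility>

// Hypothetical stand-in for the shuffle being tested: a Fisher-Yates pass
// over a 4-element array. Replace the body with the implementation under test.
inline void shuffle_under_test(std::array<int, 4>& a, std::mt19937& rng) {
    for (std::size_t i = a.size() - 1; i > 0; --i) {
        // Bounds are inclusive, so j can equal i.
        std::uniform_int_distribution<std::size_t> pick(0, i);
        std::swap(a[i], a[pick(rng)]);
    }
}

// Shuffles {0,1,2,3} 'trials' times and returns Pearson's chi-square
// statistic against the uniform distribution over all 24 permutations.
inline double chi_square_for_shuffle(unsigned trials, unsigned seed) {
    std::mt19937 rng(seed);
    std::map<std::array<int, 4>, unsigned> counts;
    for (unsigned t = 0; t < trials; ++t) {
        std::array<int, 4> a = {0, 1, 2, 3};
        shuffle_under_test(a, rng);
        ++counts[a];
    }

    const double expected = trials / 24.0;
    double chi2 = 0.0;
    std::array<int, 4> p = {0, 1, 2, 3};
    do {
        // Visit every permutation, including any that were never observed.
        auto it = counts.find(p);
        const double observed = (it == counts.end()) ? 0.0 : it->second;
        chi2 += (observed - expected) * (observed - expected) / expected;
    } while (std::next_permutation(p.begin(), p.end()));
    return chi2;
}
```

With 24 outcomes the statistic has 23 degrees of freedom, so for an unbiased shuffle it should usually land in the low-to-mid twenties; a value far above that (for example, well past the 5% critical value of about 35.2) suggests the permutations are not uniformly distributed.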

颜 2024-08-17 10:11:26

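If I see that right, your random::get( max ) doesn't include max.

This line:

draw_t push_index = random::get( pull_index );

then produces a "classical" off-by-one error, as your pull_index and push_index erroneously can never be the same. This produces a subtle bias: an item can never end up where it was before the shuffle. In an extreme example, two-item lists under this "shuffle" would always be reversed.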

空城旧梦 2024-08-17 10:11:26

Take a look at this article by Jeff Atwood:

Shuffling
http://www.codinghorror.com/blog/archives/001008.html

See also:

The Danger of Naïveté
http://www.codinghorror.com/blog/archives/001015.html

莫言歌 2024-08-17 10:11:26

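The Knuth shuffle itself is provably unbiased: there exists exactly one series of operations that yields each possible shuffle. It's unlikely that your PRNG has enough bits of state to express every possible shuffle, however, so the real question is whether your PRNG is "random enough" with respect to the set of shuffles it will actually produce, and whether your seeding strategy is secure enough.

Only you can decide this, as it depends on the consequences of a shuffle that is insufficiently random. If you are dealing with real money, for example, I'd suggest switching to a cryptographically secure PRNG and improving your seeding strategy. Although most built-in PRNGs generate good randomness, they are also quite easy to reverse engineer, and calling seed() with no arguments is likely seeding based on the current time, which is quite easy to predict.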
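
To put a rough number on "enough bits of state": reaching every ordering of n items takes at least log2(n!) bits. The helper below is a hypothetical illustration of that arithmetic (the name bits_for_all_shuffles is not from the question); it uses std::lgamma, since lgamma(n + 1) equals ln(n!).

```cpp
#include <cmath>

// Minimum number of bits of PRNG state needed so that every one of the
// n! orderings of n items is reachable: log2(n!) = ln(n!) / ln(2).
inline double bits_for_all_shuffles(unsigned n) {
    return std::lgamma(static_cast<double>(n) + 1.0) / std::log(2.0);
}
```

For a 52-card deck this comes to roughly 225.6 bits, whereas a generator seeded with a single 32-bit value can start from at most 2^32 states, so it can produce only a tiny fraction of the possible shuffles no matter how good its output looks.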

别念他 2024-08-17 10:11:26

#include <cstdlib> // srand() and rand()

/** Shuffle the first 'dim' values in array 'V[]'.
    - Implements the Fisher–Yates shuffle.
    - Uses the standard function 'rand()' for randomness.
    - Initializes the random sequence using 'seed'.
    - Uses at most 'dim - 1' swaps.
    \see http://stackoverflow.com/questions/1685339/
    \see http://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle#The_modern_algorithm
*/
template <class T>
void Fisher_Yates_shuffle( T* V, unsigned dim , unsigned seed ) {
    if ( dim < 2 ) return; // nothing to shuffle; also keeps 'dim-1' from wrapping around
    srand(seed);
    T temp;
    unsigned i,iPP;

    i   = dim-1; // index of the slot being filled
    iPP = dim;   // always i+1: number of candidate positions for slot i
    while ( i>0 ) {
        unsigned j = rand() % iPP; // j is in [0, i]; '%' keeps a slight modulo bias
        if ( i!=j ) { // swap
            temp = V[i]; V[i] = V[j]; V[j] = temp;
        }
        iPP = i;
        --i;
    }
/*
    This implementation depends on the randomness of the random number
    generator used ['rand()' in this case].
*/
}
