当前位置：文江博客话题详情

随机排列

发布于 2024-09-06 12:15:59 字数 264 浏览 7 评论 0原文

我想尽快生成随机排列。问题：O(n) 的 Knuth shuffle 涉及生成 n 随机数。由于生成随机数非常昂贵，我想找到一个包含固定 O(1) 数量的随机数的 O(n) 函数。

我意识到这个问题以前已经被问过，但我没有看到任何相关的答案。

只是强调一点：我并不是在寻找低于 O(n) 的东西，只是一种涉及较少生成随机数的算法。

谢谢。

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

旧情勿念 2024-09-13 12:15:59

创建每个排列到从 1 到 n 的数字的 1-1 映射！（n 阶乘）。生成1到n!之间的随机数，利用映射，得到排列。

对于映射，也许这会很有用： http://en.wikipedia.org/wiki/Permutation #Numbering_permutations

当然，这很快就会失控，因为 n!很快就会变得非常大。

回复收藏 0 原文

柠檬心 2024-09-13 12:15:59

你说生成随机数需要很长时间？ Javas Random.nextInt 的实现大致是这样，

oldseed = seed;
nextseed = (oldseed * multiplier + addend) & mask;
return (int)(nextseed >>> (48 - bits));

是不是每个元素要做的工作太多了？

Generating a random number takes long time you say? The implementation of Javas Random.nextInt is roughly

oldseed = seed;
nextseed = (oldseed * multiplier + addend) & mask;
return (int)(nextseed >>> (48 - bits));

Is that too much work to do for each element?

回复收藏 0 原文

花海 2024-09-13 12:15:59

请参阅 https://doi.org/10.1145/3009909 对随机位数进行仔细分析需要生成随机排列。（它是开放获取的，但阅读起来并不容易！底线：如果仔细实施，生成随机排列的所有常用方法在使用随机位方面都是有效的。）

并且......如果您的目标是生成随机排列对于大 N 的快速排列，我建议您尝试 MergeShuffle 算法。 2015 年发表的文章声称在并行和顺序方面比 Fisher-Yates 加速了两倍实现，并且与他们测试的其他标准算法（Rao-Sandelius）相比，顺序计算显着加速。

MergeShuffle（以及常用的 Fisher-Yates 和 Rao-Sandelius 算法）的实现可在 https:/ /github.com/axel-bacher/mergeshuffle。但买者自负！作者是理论家，而不是软件工程师。他们已将实验代码发布到 github 但不进行维护。有一天，我想象有人（也许是你！）会将 MergeShuffle 添加到 GSL 中。目前 gsl_ran_shuffle() 是 Fisher-Yates 的一个实现，参见 https://www.gnu.org/software/gsl/doc/html/randist.html?highlight=gsl_ran_shuffle。

< img src="https://i.sstatic.net/cB4se.jpg" alt="我们在 100 次试验中实现各种随机排列算法所使用的平均随机位数。">

回复收藏 0 原文

随波逐流 2024-09-13 12:15:59

不是您确切要求的，但如果提供的随机数生成器不能满足您的要求，可能您应该尝试不同的东西。一般来说，伪随机数的生成可以非常简单。

可能是最著名的算法
http://en.wikipedia.org/wiki/Linear_congruential_generator

回复收藏 0 原文

北城半夏 2024-09-13 12:15:59

正如其他答案所建议的，您可以创建一个 0 到 N 范围内的随机整数！并用它来产生随机播放。虽然理论上是正确的，但一般来说这不会更快，因为 N!增长得很快，你将把所有的时间都花在做 bigint 算术上。

如果您想要速度并且不介意牺牲一些随机性，那么使用不太好的随机数生成器会更好。线性同余生成器（参见 http://en.wikipedia.org/wiki/Linear_congruential_generator）会在几个周期内给你一个随机数。

回复收藏 0 原文

习惯成性 2024-09-13 12:15:59

通常不需要全范围的下一个随机值，因此要使用完全相同的随机量，您可以使用下一个方法（我猜这几乎就像随机（0，N！））：

// ...
m = 1; // range of random buffer (single variant)
r = 0; // random buffer (number zero)
// ... 
for(/* ... */) {
    while (m < n) { // range of our buffer is too narrow for "n"
        r = r*RAND_MAX + random(); // add another random to our random-buffer
        m *= RAND_MAX; // update range of random-buffer
    }
    x = r % n; // pull-out  next random with range "n"
    r /= n; // remove it from random-buffer
    m /= n; // fix range of random-buffer
    // ...
}

PS当然会有可能存在一些与除以不同于 2^n 的值相关的错误，但它们将分布在结果样本中。

Usually there is no need in full-range of next random value, so to use exactly the same amount of randomness you can use next approach (which is almost like random(0,N!), I guess):

// ...
m = 1; // range of random buffer (single variant)
r = 0; // random buffer (number zero)
// ... 
for(/* ... */) {
    while (m < n) { // range of our buffer is too narrow for "n"
        r = r*RAND_MAX + random(); // add another random to our random-buffer
        m *= RAND_MAX; // update range of random-buffer
    }
    x = r % n; // pull-out  next random with range "n"
    r /= n; // remove it from random-buffer
    m /= n; // fix range of random-buffer
    // ...
}

P.S. of course there will be some errors related with division by value different from 2^n, but they will be distributed among resulted samples.

回复收藏 0 原文

匿名。 2024-09-13 12:15:59

在进行计算之前生成 N 个数字（N <您需要的随机数的数量），或者使用缓慢但良好的随机生成器将它们作为数据存储在数组中；然后选择一个数字，只需在计算循环内增加数组的索引即可；如果您需要不同的种子，请创建多个表。

回复收藏 0 原文

も星光 2024-09-13 12:15:59

您确定您解决问题的数学和算法方法是正确的吗？

我遇到了完全相同的问题，费舍尔-耶茨洗牌在极端情况下将成为瓶颈。但对我来说，真正的问题是暴力算法，它不能很好地适应所有问题。下面的故事解释了我迄今为止提出的问题和优化。

4 名玩家发牌

可能的发牌数量为96 位。这给随机数生成器带来了相当大的压力，以避免在从生成的交易样本集中选择游戏计划时出现静态异常。我选择使用来自 /dev/random 的 2xmt19937_64 种子，因为网络上的长期广告和大量广告表明它有利于科学模拟。

简单的方法是使用 Fisher–Yates shuffle 来生成交易并过滤掉与已收集的信息不匹配的交易。 Knuth shuffle 每笔交易大约需要 1400 个 CPU 周期，主要是因为我必须生成 51 个随机数并交换表中的 51 次条目。

这对于正常情况来说并不重要，我只需要在 7 分钟内生成 10000-100000 笔交易。但在极端情况下，过滤器可能只选择非常小的手牌子集，从而需要生成大量交易。

对多张卡使用单个数字

当使用 callgrind (valgrind) 进行分析时，我注意到主要的减慢速度是 C++ 随机数生成器（在从第一个瓶颈的 std::uniform_int_distribution 切换之后）。

然后我想到可以对多张卡使用单个随机数。这个想法是首先使用数字中最不重要的信息，然后删除该信息。

int number = uniform_rng(0, 52*51*50*49);
int card1 = number % 52;
number /= 52;
int cards2 = number % 51;
number /= 51;
......

当然，这只是次要的优化，因为生成仍然是 O(N)。

使用位排列生成

下一个想法正是此处提出的解决方案，但我最终仍然使用 O(N) 但成本比原始洗牌更大。但让我们看看解决方案以及为什么它失败得如此惨烈。

我决定使用想法 Dealing All the Deals by John Christman

void Deal::generate()
{
    // 52:26 split, 52!/(26!)**2 = 495,918,532,948,1041
    max = 495918532948104LU;
    partner = uniform_rng(eng1, max);

    // 2x 26:13 splits, (26!)**2/(13!)**2 = 10,400,600**2
    max = 10400600LU*10400600LU;
    hands = uniform_rng(eng2, max);

    // Create 104 bit presentation of deal (2 bits per card)
    select_deal(id, partner, hands);
}

到目前为止还不错看起来很漂亮，但 select_deal 实现是 PITA。

void select_deal(Id &new_id, uint64_t partner, uint64_t hands)
{
    unsigned idx;
    unsigned e, n, ns = 26;

    e = n = 13;

    // Figure out partnership who owns which card
    for (idx = CARDS_IN_SUIT*NUM_SUITS; idx > 0; ) {
        uint64_t cut = ncr(idx - 1, ns);

        if (partner >= cut) {
            partner -= cut;
            // Figure out if N or S holds the card
            ns--;
            cut = ncr(ns, n) * 10400600LU;

            if (hands > cut) {
                hands -= cut;
                n--;
            } else
                new_id[idx%NUM_SUITS] |= 1 << (idx/NUM_SUITS);

        } else
            new_id[idx%NUM_SUITS + NUM_SUITS] |= 1 << (idx/NUM_SUITS);

        idx--;
    }

    unsigned ew = 26;
    // Figure out if E or W holds a card
    for (idx = CARDS_IN_SUIT*NUM_SUITS; idx-- > 0; ) {
        if (new_id[idx%NUM_SUITS + NUM_SUITS] & (1 << (idx/NUM_SUITS))) {
            uint64_t cut = ncr(--ew, e);

            if (hands >= cut) {
                hands -= cut;
                e--;
            } else
                new_id[idx%NUM_SUITS] |= 1 << (idx/NUM_SUITS);

        }
    }
}

现在我已经完成了 O(N) 排列解决方案来证明算法可以工作，我开始搜索从随机数到位排列的 O(1) 映射。太糟糕了，看起来唯一的解决方案是使用巨大的查找表，这会杀死 CPU 缓存。对于将使用大量缓存进行双虚拟分析器的人工智能来说，这听起来不是一个好主意。

数学解决方案

在努力弄清楚如何生成随机位排列之后，我决定回到数学。在发牌之前应用过滤器是完全可能的。这需要将交易分割为可管理数量的分层集合，并在过滤掉不可能的集合后根据它们的相对概率在集合之间进行选择。

我还没有准备好代码来测试在过滤器选择交易主要部分的常见情况下我浪费了多少周期。但我相信这种方法可以提供最稳定的发电性能，并将成本保持在 0.1% 以下。

Are you sure that your mathematical and algorithmical approach to the problem is correct?

I hit exactly same problem where Fisher–Yates shuffle will be bottleneck in corner cases. But for me the real problem is brute force algorithm that doesn't scale well to all problems. Following story explains the problem and optimizations that I have come up with so far.

Dealing cards for 4 players

Number of possible deals is 96 bit number. That puts quite a stress for random number generator to avoid statical anomalies when selecting play plan from generated sample set of deals. I choose to use 2xmt19937_64 seeded from /dev/random because of the long period and heavy advertisement in web that it is good for scientific simulations.

Simple approach is to use Fisher–Yates shuffle to generate deals and filter out deals that don't match already collected information. Knuth shuffle takes ~1400 CPU cycles per deal mostly because I have to generate 51 random numbers and swap 51 times entries in the table.

That doesn't matter for normal cases where I would only need to generate 10000-100000 deals in 7 minutes. But there is extreme cases when filters may select only very small subset of hands requiring huge number of deals to be generated.

Using single number for multiple cards

When profiling with callgrind (valgrind) I noticed that main slow down was C++ random number generator (after switching away from std::uniform_int_distribution that was first bottleneck).

Then I came up with idea that I can use single random number for multiple cards. The idea is to use least significant information from the number first and then erase that information.

int number = uniform_rng(0, 52*51*50*49);
int card1 = number % 52;
number /= 52;
int cards2 = number % 51;
number /= 51;
......

Of course that is only minor optimization because generation is still O(N).

Generation using bit permutations

Next idea was exactly solution asked in here but I ended up still with O(N) but with larger cost than original shuffle. But lets look into solution and why it fails so miserably.

I decided to use idea Dealing All the Deals by John Christman

void Deal::generate()
{
    // 52:26 split, 52!/(26!)**2 = 495,918,532,948,1041
    max = 495918532948104LU;
    partner = uniform_rng(eng1, max);

    // 2x 26:13 splits, (26!)**2/(13!)**2 = 10,400,600**2
    max = 10400600LU*10400600LU;
    hands = uniform_rng(eng2, max);

    // Create 104 bit presentation of deal (2 bits per card)
    select_deal(id, partner, hands);
}

So far good and pretty good looking but select_deal implementation is PITA.

void select_deal(Id &new_id, uint64_t partner, uint64_t hands)
{
    unsigned idx;
    unsigned e, n, ns = 26;

    e = n = 13;

    // Figure out partnership who owns which card
    for (idx = CARDS_IN_SUIT*NUM_SUITS; idx > 0; ) {
        uint64_t cut = ncr(idx - 1, ns);

        if (partner >= cut) {
            partner -= cut;
            // Figure out if N or S holds the card
            ns--;
            cut = ncr(ns, n) * 10400600LU;

            if (hands > cut) {
                hands -= cut;
                n--;
            } else
                new_id[idx%NUM_SUITS] |= 1 << (idx/NUM_SUITS);

        } else
            new_id[idx%NUM_SUITS + NUM_SUITS] |= 1 << (idx/NUM_SUITS);

        idx--;
    }

    unsigned ew = 26;
    // Figure out if E or W holds a card
    for (idx = CARDS_IN_SUIT*NUM_SUITS; idx-- > 0; ) {
        if (new_id[idx%NUM_SUITS + NUM_SUITS] & (1 << (idx/NUM_SUITS))) {
            uint64_t cut = ncr(--ew, e);

            if (hands >= cut) {
                hands -= cut;
                e--;
            } else
                new_id[idx%NUM_SUITS] |= 1 << (idx/NUM_SUITS);

        }
    }
}

Now that I had the O(N) permutation solution done to prove algorithm could work I started searching for O(1) mapping from random number to bit permutation. Too bad it looks like only solution would be using huge lookup tables that would kill CPU caches. That doesn't sound good idea for AI that will be using very large amount of caches for double dummy analyzer.

Mathematical solution

After all hard work to figure out how to generate random bit permutations I decided go back to maths. It is entirely possible to apply filters before dealing cards. That requires splitting deals to manageable number of layered sets and selecting between sets based on their relative probabilities after filtering out impossible sets.

I don't yet have code ready for that to tests how much cycles I'm wasting in common case where filter is selecting major part of deal. But I believe this approach gives the most stable generation performance keeping the cost less than 0.1%.

回复收藏 0 原文