Besides Fisher-Yates and finding the "next permutation", what other shuffling algorithms exist?

Posted 2024-09-16 07:32:03

Specifically in the domain of one-dimensional sets of items of the same type, such as a vector of integers.

Say, for example, you had a vector of size 32,768 containing the sorted integers 0 through 32,767.

What I mean by "next permutation" is performing the next permutation in a lexical ordering system.

Wikipedia lists two, and I'm wondering if there are any more (besides something bogo :P)

Comments (5)

随心而道 2024-09-23 07:32:03

O(N) implementation
This is based on Eyal Schneider's mapping Z_(N!) -> P(N), that is, from the integers 0 .. N!-1 to the permutations of N items.

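from math import factorial

def get_permutation(k, lst):
    """Permute lst in place using index k, where 0 <= k < len(lst)!."""
    N = len(lst)
    while N:
        # f in the original is assumed to be the factorial function.
        # next_item is the next factorial-base digit of k, in range(N).
        next_item, k = divmod(k, factorial(N - 1))
        lst[N - 1], lst[next_item] = lst[next_item], lst[N - 1]
        N = N - 1
    return lst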

It improves on his O(N^2) algorithm by integrating the conversion step with finding the permutation. It has essentially the same form as Fisher-Yates but replaces each call to random with the next step of the mapping. If the mapping is in fact a bijection (which I'm working to prove), then this only calls out to the pseudo-random number generator once, to draw k, which may make it more efficient than Fisher-Yates. Bear in mind that k must then be drawn uniformly from N! values, so it is a very large integer for large N. Note also that this returns the action of permutation (N! - k) rather than permutation k, but that's of little consequence: if k is chosen uniformly from the N! valid indices, the mirrored index is uniform as well.
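The bijection claim above can at least be checked by brute force for small N. The following is a minimal sketch, not a proof: `is_bijection` is a hypothetical helper name, and `get_permutation` here works on a copy and returns a tuple so the results can be collected in a set.

```python
from itertools import permutations
from math import factorial


def get_permutation(k, items):
    """Return the permutation of items selected by index k (0 <= k < N!)."""
    lst = list(items)  # work on a copy so the caller's sequence is untouched
    n = len(lst)
    while n:
        # Next factorial-base digit of k; it is always in range(n).
        next_item, k = divmod(k, factorial(n - 1))
        lst[n - 1], lst[next_item] = lst[next_item], lst[n - 1]
        n -= 1
    return tuple(lst)


def is_bijection(n):
    """Check that the indices 0 .. n!-1 produce n! distinct permutations."""
    produced = {get_permutation(k, range(n)) for k in range(factorial(n))}
    return produced == set(permutations(range(n)))


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, is_bijection(n))
```

The check enumerates all N! indices, so it is only practical for small N.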

old answer

This is slightly related to the idea of the "next" permutation. If the items can be well ordered, then one can construct a lexicographical ordering on the permutations. This allows you to construct a map from the integers into the space of permutations.

Then finding a random permutation is equivalent to choosing a random integer from 0 to N! - 1 and constructing the corresponding permutation. This algorithm will be as efficient (and as difficult to implement) as calculating the n'th permutation of the set in question. It trivially gives a uniform choice of permutation if our choice of n is uniform.

A little more detail about ordering the permutations. Given a set S = {a b c d}, mathematicians view the set of permutations of S as a group under the operation of composition. If p is one permutation, let's say (b a c d), then p operates on S by taking b to a, a to c, c to d and d to b. If q is another permutation, let's say (d b c a), then pq is obtained by first applying q and then p, which gives (d a b)(c). For example, q takes d to b and p takes b to a, so pq takes d to a. You'll see that pq has two cycles because it takes b to d and fixes c. It's customary to omit 1-cycles, but I left it in for clarity.
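The composition in this paragraph can be reproduced mechanically. This is only an illustrative sketch: the helper names `cycle_to_map`, `compose` and `cycles` are made up here, and permutations are represented as plain dictionaries.

```python
def cycle_to_map(cycle):
    """Turn a cycle such as "bacd" into the mapping b->a, a->c, c->d, d->b."""
    return {x: cycle[(i + 1) % len(cycle)] for i, x in enumerate(cycle)}


def compose(p, q):
    """Return pq, meaning: apply q first, then p."""
    return {x: p[q[x]] for x in q}


def cycles(perm):
    """Decompose a permutation (given as a dict) into disjoint cycles."""
    seen, result = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        result.append(tuple(cycle))
    return result


p = cycle_to_map("bacd")
q = cycle_to_map("dbca")
pq = compose(p, q)
print(cycles(pq))  # [('a', 'b', 'd'), ('c',)]
```

The printed cycle (a b d) is the same cycle as (d a b), just written from a different starting element.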

We're going to use some facts from group theory.

  1. Disjoint cycles commute: (a b)(c d) is the same as (c d)(a b).
  2. We can arrange the elements of a cycle in any cyclic order, that is, (a b c) = (b c a) = (c a b).

So given a permutation, order its cycles so that the longest cycles come first. When two cycles have the same length, arrange their items so that the largest item comes first (we can always order a denumerable set, even if arbitrarily so). Then we just have a lexicographical ordering, first on the lengths of the cycles and then on their contents. This is a well ordering because two permutations that consist of the same cycles must be the same permutation, so if p >= q and q >= p then p = q.

This algorithm can be trivially executed in O(N! log N! + N!) time: just construct all the permutations, quicksort them and find the n'th. (EDIT: Just to be clear, I had my mathematician hat on when I proposed this, and it was tongue in cheek anyway.) It is a different algorithm from the two you mention, though.

空宴 2024-09-23 07:32:03

Here is an idea on how to improve aaronasterling's answer. It avoids generating all N! permutations and sorting them according to their lexicographic order, and therefore has a much better time complexity.

Internally it uses an unusual permutation representation, that simulates a selection & removal process from a shrinking array. For example, the sequence <0,1,0> represents a permutation resulting from removing item #0 from [0,1,2], then removing item #1 from [1,2], and then removing item #0 from [1]. The resulting permutation is <0,2,1>. With this representation, the first permutation will always be <0,0,...0>, and the last one will always be <N-1,N-2,...0>. I will call this special representation the "array representation".

Clearly, an array representation of size N can be converted to a standard permutation representation in O(N^2) time, by using an array and shrinking it when necessary.

The following function can be used to return the Kth permutation on {0,1,2...,N-1}, in the array representation:

getPermutation(k, N) {
    while(N > 0) {
        nextItem = floor(k / (N-1)!)
        output nextItem
        k = k - nextItem * (N-1)!
        N = N - 1
    }
}

This algorithm works in O(N^2) time (due to the representation conversion), instead of O(N! log N!) time.

--Example--

getPermutation(4,3) returns <2,0,0>. This array representation corresponds to <C,A,B>, which is really the permutation at index 4 in the ordered list of permutations on {A,B,C}:

ABC
ACB
BAC
BCA
CAB
CBA
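
The function and the conversion step can be sketched together in Python. This is one possible rendering under stated assumptions: the names `array_representation` and `to_permutation` are hypothetical, and the conversion uses `list.pop`, which is what makes it O(N^2) overall.

```python
from itertools import permutations
from math import factorial


def array_representation(k, n):
    """Factorial-base digits of k: the k-th permutation in array form."""
    digits = []
    while n > 0:
        next_item, k = divmod(k, factorial(n - 1))
        digits.append(next_item)
        n -= 1
    return digits


def to_permutation(digits, items):
    """Replay the removals on a shrinking copy of items."""
    pool = list(items)
    return [pool.pop(d) for d in digits]


print(array_representation(4, 3))                         # [2, 0, 0]
print(to_permutation(array_representation(4, 3), "ABC"))  # ['C', 'A', 'B']
```

For small N the result agrees with the k-th entry of `itertools.permutations`, which lists permutations of a sorted input in lexicographic order.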

方觉久 2024-09-23 07:32:03

You can adapt merge sort such that it will shuffle the input randomly instead of sorting it.

In particular, when merging two lists, you choose the new head element at random instead of choosing the smaller of the two head elements. For this to work, the probability of choosing the element from the first list must be n/(n+m), where n is the length of the first list and m is the length of the second.

I've written a detailed explanation here: Random Permutations and Sorting.
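
A minimal sketch of this idea, assuming n and m are the numbers of elements still remaining in each list at the moment of the choice. The function names `random_merge` and `merge_shuffle` are illustrative and do not come from the linked explanation.

```python
import random


def random_merge(left, right, rng):
    """Interleave two lists at random, keeping each list's internal order."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        n, m = len(left) - i, len(right) - j  # items remaining in each list
        # Take from the first list with probability n / (n + m).
        if rng.randrange(n + m) < n:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One list is exhausted; the rest of the other follows unchanged.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_shuffle(items, rng=random):
    """Shuffle by recursively shuffling both halves and merging at random."""
    items = list(items)
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    return random_merge(merge_shuffle(items[:mid], rng),
                        merge_shuffle(items[mid:], rng), rng)


print(merge_shuffle(range(10)))
```

Like merge sort, this does O(n log n) work, so it is slower than Fisher-Yates on an in-memory array; its appeal is that merging only needs sequential access.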

夜深人未静 2024-09-23 07:32:03

Another possibility is to build an LFSR or PRNG with a period equal to the number of items you want.
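
As a rough illustration for the 32,768-item example from the question, a 15-bit maximal-length LFSR visits every value from 1 to 32,767 exactly once before repeating. The sketch below assumes the feedback polynomial x^15 + x^14 + 1; the function name `lfsr15_order` is made up. The value 0 never appears, so the caller would have to place that one item separately.

```python
def lfsr15_order(seed=1):
    """Yield every integer in 1 .. 32767 exactly once, starting from seed."""
    if not 1 <= seed <= 0x7FFF:
        raise ValueError("seed must be a nonzero 15-bit value")
    state = seed
    for _ in range(0x7FFF):  # period of a maximal-length 15-bit LFSR
        yield state
        # Feedback from bits 15 and 14 (polynomial x^15 + x^14 + 1).
        new_bit = ((state >> 14) ^ (state >> 13)) & 1
        state = ((state << 1) | new_bit) & 0x7FFF


order = list(lfsr15_order(seed=12345))
print(order[:5])
```

Changing the seed only rotates the same cycle, so this produces at most 32,767 different orders out of 32,768! possible ones. It is a cheap way to visit items in a scrambled order, not a uniform shuffle.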

倾城月光淡如水﹏ 2024-09-23 07:32:03

Start with a sorted array. Pick 2 random indexes and swap the elements at those indexes. Repeat O(n lg n) times.

You need to repeat O(n lg n) times to ensure that the distribution approaches uniform. (You need to make sure that each index is picked at least once, which is a balls-in-bins problem.)
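
A minimal sketch of this approach follows. The answer only gives the O(n lg n) order of growth, so the constant `factor` is an assumption made for illustration, and `transposition_shuffle` is a hypothetical name.

```python
import math
import random


def transposition_shuffle(items, rng=random, factor=2.0):
    """Shuffle by applying about factor * n * lg(n) random swaps."""
    lst = list(items)
    n = len(lst)
    if n < 2:
        return lst
    swaps = math.ceil(factor * n * math.log2(n))  # assumed constant factor
    for _ in range(swaps):
        # Both indexes are drawn independently, so i == j is possible.
        i, j = rng.randrange(n), rng.randrange(n)
        lst[i], lst[j] = lst[j], lst[i]
    return lst


print(transposition_shuffle(range(10)))
```

The result is only approximately uniform and costs more random draws than Fisher-Yates, which is exactly uniform after n - 1 swaps.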
