有没有更好的方法来进行字符串排列?
void permute(string elems, int mid, int end)
{
static int count;
if (mid == end) {
cout << ++count << " : " << elems << endl;
return ;
}
else {
for (int i = mid; i <= end; i++) {
swap(elems, mid, i);
permute(elems, mid + 1, end);
swap(elems, mid, i);
}
}
}
上面的函数显示了 str
的排列(以 str[0..mid-1]
作为固定前缀,str[mid..end]< /code> 作为可替换后缀)。因此我们可以使用 permute(str, 0, str.size() - 1) 来显示一个字符串的所有排列。
但该函数使用了递归算法;也许它的性能可以提高?
有没有更好的方法来排列字符串?
void permute(string elems, int mid, int end)
{
static int count;
if (mid == end) {
cout << ++count << " : " << elems << endl;
return ;
}
else {
for (int i = mid; i <= end; i++) {
swap(elems, mid, i);
permute(elems, mid + 1, end);
swap(elems, mid, i);
}
}
}
The above function shows the permutations of str
(with str[0..mid-1]
as a steady prefix, and str[mid..end]
as a permutable suffix). So we can use permute(str, 0, str.size() - 1)
to show all the permutations of one string.
But the function uses a recursive algorithm; maybe its performance could be improved?
Are there any better methods to permute a string?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(20)
以下是来自 Wikipedia 条目的 C++ 非递归算法,用于排列的无序生成。对于长度为
n
的字符串s
,对于从0
到n 的任何
包含在内,以下修改k
! - 1s
以提供唯一的排列(即,与为该范围内的任何其他 k 值生成的排列不同)。要生成所有排列,请对所有 n! 运行它。 s 原始值的k
值。这里
swap(s, i, j)
交换字符串 s 的位置 i 和 j。Here is a non-recursive algorithm in C++ from the Wikipedia entry for unordered generation of permutations. For the string
s
of lengthn
, for anyk
from0
ton! - 1
inclusive, the following modifiess
to provide a unique permutation (that is, different from those generated for any other k value on that range). To generate all permutations, run it for all n!k
values on the original value of s.Here
swap(s, i, j)
swaps position i and j of the string s.为什么不尝试
std::next_permutation()
或std::prev_permutation()
?
链接:
std::next_permutation()
std::prev_permutation()
一个简单的示例:
输出:
Why dont you try
std::next_permutation()
orstd::prev_permutation()
?
Links:
std::next_permutation()
std::prev_permutation()
A simple example:
Output:
我想第二个Permaquid的答案。他引用的算法的工作方式与已提供的各种排列枚举算法根本不同。它不会生成 n 个对象的所有排列,而是生成一个独特的特定排列,给定
0 到 n!-1
之间的整数。如果您只需要一种特定的排列,那么它比枚举所有排列然后选择一个排列要快得多。即使您确实需要所有排列,它也提供了单个排列枚举算法所不提供的选项。我曾经写过一个强力密码破解程序,它尝试了所有可能的字母与数字的分配。对于
base-10
问题,这已经足够了,因为只有10!
排列可供尝试。但对于base-11
问题需要几分钟,而对于base-12
问题则需要近一个小时。我使用 Permaquid 引用的算法,将一直使用的排列枚举算法替换为简单的
i=0--to--N-1
for 循环。结果只是稍微慢了一点。但随后我将整数范围分成四份,并同时运行四个 for 循环,每个循环都在一个单独的线程中。在我的四核处理器上,生成的程序运行速度几乎是原来的四倍。正如使用排列枚举算法找到单个排列很困难一样,生成所有排列集合的描述子集也很困难。 Permaquid 引用的算法使这两件事变得非常简单
I'd like to second Permaquid's answer. The algorithm he cites works in a fundamentally different way from the various permutation enumeration algorithms that have been offered. It doesn't generate all of the permutations of n objects, it generates a distinct specific permutation, given an integer between
0 and n!-1
. If you need only a specific permutation, it's much faster than enumerating them all and then selecting one.Even if you do need all permutations, it provides options that a single permutation enumeration algorithm does not. I once wrote a brute-force cryptarithm cracker, that tried every possible assignment of letters to digits. For
base-10
problems, it was adequate, since there are only10!
permutations to try. But forbase-11
problems took a couple of minutes andbase-12
problems took nearly an hour.I replaced the permutation enumeration algorithm that I had been using with a simple
i=0--to--N-1
for-loop, using the algorithm Permaquid cited. The result was only slightly slower. But then I split the integer range in quarters, and ran four for-loops simultaneously, each in a separate thread. On my quad-core processor, the resulting program ran nearly four times as fast.Just as finding an individual permutation using the permutation enumeration algorithms is difficult, generating delineated subsets of the set of all permutations is also difficult. The algorithm that Permaquid cited makes both of these very easy
特别是,您需要 std::next_permutation。
...或者类似的东西...
In particular, you want std::next_permutation.
... or something like that...
任何生成排列的算法都将在多项式时间内运行,因为 n 长度字符串中的字符排列数量为
(n!)
。也就是说,有一些非常简单的就地算法来生成排列。查看 Johnson-Trotter 算法。Any algorithm for generating permutations is going to run in polynomial time, because the number of permutations for characters within an n-length string is
(n!)
. That said, there are some pretty simple in-place algorithms for generating permutations. Check out the Johnson-Trotter algorithm.Knuth 随机洗牌算法值得研究。
The Knuth random shuffle algorithm is worth looking into.
任何使用或生成所有排列的算法都将花费 O(N!*N) 时间,至少需要 O(N!) 来生成所有排列,并且 O(N) 来使用结果,这非常慢。请注意,打印字符串也是 O(N) afaik。
实际上,无论您使用什么方法,一秒钟内您只能处理最多 10 或 11 个字符的字符串。由于 11!*11 = 439084800 次迭代(在大多数机器上在一秒钟内执行这么多次迭代会推动它)和 12!*12 = 5748019200 次迭代。因此,即使最快的实现对于 12 个字符也需要大约 30 到 60 秒。
Factorial 增长得太快了,您无法希望通过编写更快的实现来获得任何东西,您最多只能获得一个字符。所以我建议 Prasoon 的推荐。它很容易编码并且速度相当快。尽管坚持你的代码也完全没问题。
我只是建议您注意不要无意中在字符串中包含额外的字符,例如空字符。因为这会让你的代码变慢 N 倍。
Any algorithm that makes use of or generates all permutations will take O(N!*N) time, O(N!) at the least to generate all permutations and O(N) to use the result, and that's really slow. Note that printing the string is also O(N) afaik.
In a second you can realistically only handle strings up to a maximum of 10 or 11 characters, no matter what method you use. Since 11!*11 = 439084800 iterations (doing this many in a second on most machines is pushing it) and 12!*12 = 5748019200 iterations. So even the fastest implementation would take about 30 to 60 seconds on 12 characters.
Factorial just grows too fast for you to hope to gain anything by writing a faster implementation, you'd at most gain one character. So I'd suggest Prasoon's recommendation. It's easy to code and it's quite fast. Though sticking with your code is completely fine as well.
I'd just recommend that you take care that you don't inadvertantly have extra characters in your string such as the null character. Since that will make your code a factor of N slower.
我最近写了一个排列算法。它使用 T 类型的向量(模板)而不是字符串,并且它不是超快,因为它使用递归并且存在大量复制。但也许你可以从代码中汲取一些灵感。您可以在此处。
I've written a permutation algorithm recently. It uses a vector of type T (template) instead of a string, and it's not super-fast because it uses recursion and there's a lot of copying. But perhaps you can draw some inspiration for the code. You can find the code here.
显着提高性能的唯一方法是首先找到一种避免迭代所有排列的方法!
排列是一个不可避免的缓慢操作(O(n!),或者更糟,取决于你对每个排列所做的事情),不幸的是你无法改变这个事实。
另请注意,启用优化后,任何现代编译器都会使递归变平,因此手动优化带来的(小)性能增益会进一步降低。
The only way to significantly improve performance is to find a way to avoid iterating through all the permutations in the first place!
Permuting is an unavoidably slow operation (O(n!), or worse, depending on what you do with each permutation), unfortunately nothing you can do will change this fact.
Also, note that any modern compiler will flatten out your recursion when optimisations are enabled, so the (small) performance gains from hand-optimising are reduced even further.
您想遍历所有排列,还是计算排列的数量?
对于前者,请按照其他人的建议使用
std::next_permutation
。每个排列都需要 O(N) 时间(但摊销时间较少),并且除了其调用帧之外没有内存,而递归函数则需要 O(N) 时间和 O(N) 内存。整个过程是 O(N!) 并且你不能做得比这更好,正如其他人所说,因为你不能在少于 O(X) 的时间内从程序中获得超过 O(X) 的结果!无论如何,没有量子计算机。对于后者,您只需要知道字符串中有多少个唯一元素即可。
速度受到查找重复元素操作的限制,对于 char 来说,可以使用查找表在 O(N) 时间内完成该操作。
Do you want to run through all the permutations, or count the number of permutations?
For the former, use
std::next_permutation
as suggested by others. Each permutation takes O(N) time (but less amortized time) and no memory except its callframe, vs O(N) time and O(N) memory for your recursive function. The whole process is O(N!) and you can't do better than this, as others said, because you can't get more than O(X) results from a program in less than O(X) time! Without a quantum computer, anyway.For the latter, you just need to know how many unique elements are in the string.
Speed is bounded by the operation of finding duplicate elements, which for
char
s can be done in O(N) time with a lookup table.我不认为这更好,但它确实有效并且不使用递归:
有趣的是,它从排列到排列使用的唯一状态是排列数、排列总数和原始字符串。这意味着它可以轻松地封装在迭代器或类似的东西中,而不必仔细保留确切的正确状态。它甚至可以是随机访问迭代器。
当然, ::std::next_permutation 将状态存储在元素之间的关系中,但这意味着它不能处理无序的事物,我真的很想知道如果序列中有两个相同的事物它会做什么。当然,您可以通过排列索引来解决这个问题,但这会稍微增加一些复杂性。
我的将适用于任何随机访问迭代器范围,只要它足够短。如果不是,你将永远无法完成所有的排列。
该算法的基本思想是N个项目的每一个排列都可以被枚举。总数为N!或
事实(N)
。任何给定的排列都可以被认为是源索引从原始序列到新序列中的一组目标索引的映射。一旦枚举了所有排列,剩下要做的唯一一件事就是将每个排列数映射到实际排列。置换列表中的第一个元素可以是原始列表中的 N 个元素中的任何一个。第二个元素可以是剩余 N-1 个元素中的任何一个,依此类推。该算法使用
%
运算符将排列数分解为一组具有这种性质的选择。首先,它对排列数取 N 的模,从 [0,N) 中得到一个数。它通过除以 N 来丢弃余数,然后将其除以列表的大小 - 1 以从 [0,N-1) 中获取数字,依此类推。这就是 for (i = 循环的作用。第二步是将每个数字转换为原始列表的索引。第一个数字很简单,因为它只是一个直接索引。第二个数字是列表中的索引,其中包含除第一个索引处删除的元素之外的所有元素,依此类推,这就是
for (o =
循环的作用。residuals
是 。idxs
的索引列表是原始列表的索引列表。residuals 和中的值之间存在一对一映射。它们各自代表不同“坐标空间”中的相同值。
您选择的答案所指向的答案具有相同的基本思想,但有一种比我的字面意思更优雅的方式来完成映射。这种方法会比我的方法稍快一些,但它们的速度大致相同,并且它们都具有随机访问排列空间的相同优势,这使得很多事情变得更容易,包括(作为你的答案)指出)并行算法。
I don't think this is better, but it does work and does not use recursion:
The fun thing about this is that the only state it uses from permutation to permutation is the number of the permutation, the total number of permutations, and the original string. That means it can be easily encapsulated in an iterator or something like that without having to carefully preserve the exact correct state. It can even be a random access iterator.
Of course ::std::next_permutation stores the state in the relationships between elements, but that means it can't work on unordered things, and I would really wonder what it does if you have two equal things in the sequence. You can solve that by permuting indexes of course, but that adds slightly more complication.
Mine will work with any random access iterator range provided it's short enough. And if it isn't, you'll never get through all the permutations anyway.
The basic idea of this algorithm is that every permutation of N items can be enumerated. The total number is N! or
fact(N)
. And any given permutation can be thought of as a mapping of source indices from the original sequence into a set of destination indices in the new sequence. Once you have an enumeration of all permutations the only thing left to do is map each permutation number into an actual permutation.The first element in the permuted list can be any of the N elements from the original list. The second element can be any of the N - 1 remaining elements, and so on. The algorithm uses the
%
operator to pull apart the permutation number into a set of selections of this nature. First it modulo's the permutation number by N to get a number from [0,N). It discards the remainder by dividing by N, then it modulo's it by the size of the list - 1 to get a number from [0,N-1) and so on. That is what thefor (i =
loop is doing.The second step is translating each number into an index into the original list. The first number is easy because it's just a straight index. The second number is an index into a list that contains every element but the one removed at the first index, and so on. That is what the
for (o =
loop is doing.residuals
is a list of indices into the successively smaller lists.idxs
is a list of indices into the original list. There is a one-one mapping between values inresiduals
andidxs
. They each represent the same value in different 'coordinate spaces'.The answer pointed to by the answer you picked has the same basic idea, but has a much more elegant way of accomplishing the mapping than my rather literal and brute force method. That way will be slightly faster than my method, but they are both about the same speed and they both have the same advantage of random access into permutation space which makes a whole number of things easier, including (as the answer you picked pointed out) parallel algorithms.
实际上你可以使用 Knuth 洗牌算法来做到这一点!
Actually you can do it using Knuth shuffling algo!
这篇文章:http://cplusplus.co.il/2009/11/ 14/enumerate-permutations/ 处理几乎任何东西的排列,而不仅仅是字符串。帖子本身和下面的评论内容非常丰富,我不想复制和粘贴..
This post: http://cplusplus.co.il/2009/11/14/enumerating-permutations/ deals with permuting just about anything, not only strings. The post itself and the comments below are pretty informative and I wouldn't want to copy&paste..
如果您对排列生成感兴趣,我不久前对此做了一篇研究论文:http: //www.oriontransfer.co.nz/research/permutation- Generation
它带有完整的源代码,并且实现了大约 5 种不同的方法。
If you are interested in permutation generation I did a research paper on it a while back : http://www.oriontransfer.co.nz/research/permutation-generation
It comes complete with source code, and there are 5 or so different methods implemented.
即使我第一次发现很难理解递归版本,我花了一些时间来寻找 berre 方法。更好的找到方法(我能想到的)是使用 Narayana Pandita。基本思想是:
第 4 点和第 5 点执行相同的操作,但第 4 点的时间复杂度为 O(n*n!),第 5 点的时间复杂度为 O(n^2*n!)。
上述算法甚至可以应用于字符串中存在重复字符的情况。 :
显示字符串所有排列的代码:
Even I found it difficult to understand that recursive version of the first time and it took me some time to search for a berre way.Better method to find (that I can think of) is to use the algorithm proposed by Narayana Pandita. The basic idea is:
Point 4 and 5 do the same thing but the time complexity in case of point 4 is O(n*n!) and that in case of point 5 is O(n^2*n!).
The above algorithm can even be applied to the case when we have duplicate characters in the string. :
The code for displaying all the permutation of a string :
这是我刚刚沙沙作响的一个!
Here's one I just rustled up!!
这不是最好的逻辑,但是,我是一个初学者。如果有人给我关于这段代码的建议,我会非常高兴和感激
This is not the best logic, but then, i am a beginner. I'll be quite happy and obliged if anyone gives me suggestions on this code
这是另一个用于字符串排列的递归函数:
该函数收集向量 res 中的所有排列。
这个想法可以推广到使用模板和迭代器的不同类型的容器:
这可能是一个有趣的编程练习,但在生产代码中,您应该使用非递归版本的 permutation ,例如 next_permutation 。
Here yet another recursive function for string permutations:
The function collects all permutations in vector res.
The idea can be generalized for different type of containers using templates and iterators:
This may be an interesting programming exercise, but in production code you should use a non recusrive version of permutation , like next_permutation.