当前位置：文江博客话题详情

haskell permutation

有人可以解释此排列生成代码背后的逻辑吗？

发布于 2025-02-14 00:50:07 字数 2605 浏览 0 评论 0 原文

Frnakly，对我来说看起来像黑魔法。当然，我对Haskell的了解相当有限，但我可以理解基本原则。仅仅因为它使我尝试通过测试输入工作变得容易得多，但是在这种情况下，这仍然没有意义。

背景：大约一年前>关于我正在尝试的置换生成代码，积累了我的“重复置换”代码的递归呼叫，以堵塞RAM？）开始工作。我从这个奇妙的社区那里得到了很多帮助，并继续了我的项目。但是，我最近重新探讨的用户“彼得·普特”（Peter Pun）的最后一个答案是我当时没有太多想法的。

一部分答案，生成排列的重要一个是，这

permutationsNub :: Eq a => [a] -> [[a]]
permutationsNub = foldr (concatMap . insert) [[]]
    where insert y = foldr combine [[y]] . (zip <*> tail . tails) -- tails through importing Data.List
              where combine (x, xs) xss = (y : x : xs) :
                        if y == x then [] else map (x :) xss

使我能够以对我更有意义的方式改革该功能，以便我可以看到每个论点并参考它，因为原始应用中的部分应用使我困惑的可变名称。请告诉我解释是否正确。

permutationsNub :: Eq a => [a] -> [[[a]]]
permutationsNub digits = scanr (concatMap . insert) [[]] digits
    where insert digit perms = foldr combine [[digit]] $ (zip <*> tail . tails) perms -- zip will produce  tuples of the form (a, [a])
              where combine (permX, tailOfPerms) digitAsDoubleList = (digit : permX : tailOfPerms) : 
                      if digit == permX then [] else map (permX :) digitAsDoubleList

我用SCANR替换了Main FoldR函数（也是输出类型）以查看中间步骤，因此，如果我运行 permuntuntNub [2,7,7,7,8] 结果是 [[[[2,7,7,8]，[7,2,7,8]，[7,7,2,8]，[7,7,8,2]，[2,7，[2,7， 8,7]，[7,2,8,7]，[7,8,2,7]，[7,8,7,2]，[2,8， 7,7]，[8,2,7,7]，[8,7,2,7]，[8,7,7,2]，[[7,7,8]，[7,8，[7,8， 7]，[8,7,7]]，[7,8]，[8,7]]，[[8]]，[[]] 给出一些洞察力。

因此，我必须问：

首先，为什么使用 zip> zip 的应用符号（＆lt;*＆gt; ）？它是由使用毫无用处符号所决定的吗？应用程序如何渗入？
我对变量的重命名，使它们对我来说很有意义，因为功能有效，又名与原始输出相同。但是，有些事情很奇怪，例如 Digit 如何根据组合函数在比较检查中等于 permx ？前者是int，后者是[int]，否则像 digit：permx：... 在同一函数中无法使用，对吗？例如，如果我们调用 permuntunnub [2,7,7,8] ，在主foldr（或scanr）函数的第二遍中，数字将为 7 和permx将 [8] ，那么这两个如何相等？
从这里变得陌生。好吧，他们俩都可以被束缚到尾流，这是一个[[int]]，但是在括号中封闭的整个事情都与... [[int]]有关？再次？

实际上， map（permx :) digitasdoublelist 似乎不起作用（但是，它以某种方式确实...），因为我们正在尝试将[int]映射到[[int]]，例如是结果，然后是先前的[[int]]的缺点，我看不出它是如何适合的。例如，在 permuntuarnnub的第二次通过，将7个作为下一个数字，由主foldr/scanr函数选中， insert的第一个通过将为（7：[8]：[] ）：[[8]：[[7]]，我无法理解。

注意：如A 更高级别说明代码的作者所说的基本思想是，鉴于列表的尾巴的唯一排列，我们可以通过插入来计算整个列表的唯一排列它进入所有尾巴的排列，在不遵循头部出现的每个位置（避免创建重复）。因此，例如，假设尾巴的当前排列是[[7,8]，[8,7]（请参见Pass 2），下一个数字为7。因此，算法将在第一个置换开始时插入7（生成[7,7,8]，然后在最后（生产[7,8,7]，然后仅在第二个排列开始时（生产[8,7,7]）？它怎么能说出它不需要其他地方？作者的含义是 遵循头部的位置？

没有

原文

Frnakly, it looks like black magic to me. Granted, my knowledge of Haskell is fairly limited, but I can understand basic principles. Just because it makes matters that much easier I try to work through a test input, but in this case it still doesn't make sense.

Background:
About a year ago I posted this question (Are recursive calls in my "permutations with repetition" code accumulated to clog the RAM?) about a piece of permutation generation code I was trying to get working. I received quite a lot of help from this wonderful community and went on with my project. However, there was one last answer I didn't think much of at the time, from user "peter pun" that I revisited recently.

Part of the answer, the important one, that generates permutations is this

permutationsNub :: Eq a => [a] -> [[a]]
permutationsNub = foldr (concatMap . insert) [[]]
    where insert y = foldr combine [[y]] . (zip <*> tail . tails) -- tails through importing Data.List
              where combine (x, xs) xss = (y : x : xs) :
                        if y == x then [] else map (x :) xss

Allow me to reform the function in a way that makes more sense for me, so that I can see every argument and refer to it, because of partial application in the original and variable names that confused me. Please tell me if the interpretation is correct.

permutationsNub :: Eq a => [a] -> [[[a]]]
permutationsNub digits = scanr (concatMap . insert) [[]] digits
    where insert digit perms = foldr combine [[digit]] $ (zip <*> tail . tails) perms -- zip will produce  tuples of the form (a, [a])
              where combine (permX, tailOfPerms) digitAsDoubleList = (digit : permX : tailOfPerms) : 
                      if digit == permX then [] else map (permX :) digitAsDoubleList

I replaced the main foldr function with scanr (the output type also) to see intermediate steps, so if I run permutationNub [2,7,7,8] the result is [[[2,7,7,8],[7,2,7,8],[7,7,2,8],[7,7,8,2],[2,7,8,7],[7,2,8,7],[7,8,2,7],[7,8,7,2],[2,8,7,7],[8,2,7,7],[8,7,2,7],[8,7,7,2]],[[7,7,8],[7,8,7],[8,7,7]],[[7,8],[8,7]],[[8]],[[]]] which gives some insight.

So, I have to ask:

First of all, why is applicative notation (<*>) used next to zip? Is it dictated by the use of point free notation? How do applicatives creep into this?
My renaming of variables so that they make sense to me seems fine, because the function works, aka produces the same output as the original. However, some things are strange, e.g how can digit ever be equal to permX as per the comparison check in the combine function? The former is an Int and the latter is a [Int], otherwise cons-ing like digit:permX:... in the same function wouldn't work, right? For example, if we call permutationNub [2,7,7,8], on the second pass of the main foldr (or scanr) function, digit will be 7 and permX will [8], so how can these two ever be equal?
It gets stranger from here. Well, both of them can then be cons-ed to tailOfPerms, which is an [[Int]], but then this whole thing, enclosed in brackets is cons-ed to an ... [[Int]]? Again?

Actually, map (permX :) digitAsDoubleList doesn't seem to work (yet, somehow it does...) because we're trying to map [Int] to a [[Int]], as is the result, and then cons to that the previous [[Int]], which I don't see how it could fit.
For example, on the second pass of the permutationNub, with 7 as the next digit, picked by the main foldr/scanr function, first pass of insert` will be (7:[8]:[]):[[8]:[[7]]], which I can't make sense of.

Note: As a higher level explanation the author of the code says The basic idea is that, given the unique permutations of a list's tail, we can compute the unique permutations of the whole list by inserting its head into all of the tail's permutations, in every position that doesn't follow an occurrence of the head (to avoid creating duplicates). So, as an example, let's say that current permutations of the tail
are [[7,8],[8,7]] (see pass 2) and the next digit is 7. So, the algorithm will insert 7 at the beginning of the first permutation (producing [7,7,8], then at the end (producing [7,8,7] and then only at the beginning of the second permutation (producing [8,7,7])? How can it tell that it doesn't need to elsewhere? What does the author mean by in every position that doesn't follow an occurrence of the head?

If anyone can shed some light into this piece of coding wizardry, please do so!

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

猛虎独行 2025-02-21 00:50:07

该算法只是基于经典的插入的修改，例如，您可以在。令我惊讶的是，尽管它很简单，但我无法在任何地方找到它。

首先，对于插入，我没有使用明确的递归（被认为有害），而是笨拙地模拟了列表。然后，我发现，如果我们在发现要插入的值后停止产生插入，我们可以防止创建重复项，同时又产生每个置换量。我希望以下代码阐明了事情：

parafoldr :: ((a, [a]) -> b -> b) -> b -> [a] -> b
parafoldr f y xs = foldr f y $ zip xs $ tail $ tails xs

insertions :: a -> [a] -> [[a]]
insertions y = parafoldr combine [[y]]
    where combine (x, xs) xss = (y : x : xs) : map (x :) xss
--or equivalently
insertions y xs = (y : xs) : (parafoldr combine [] xs)
    where combine (x, xs) xss = (x : y : xs) : map (x :) xss

permutations :: [a] -> [[a]]
permutations = foldr (concatMap . insertions) [[]]

insertions' :: Eq a => a -> [a] -> [[a]]
insertions' y = parafoldr combine [[y]]
    where combine (x, xs) xss = (y : x : xs) :
              if x == y then [] else map (x :) xss
--or equivalently
insertions' y xs = (y : xs) : (parafoldr combine [] xs)
    where combine (x, xs) xss
              | x == y    = []
              | otherwise = (x : y : xs) : map (x :) xss

permutationsNub :: Eq a => [a] -> [[a]]
permutationsNub = foldr (concatMap . insertions') [[]]

并不难显示：

通过诱导表明：置换的正确性， promistutionsnub 产生 code> promistations ， >生产，当然，没有其他列表
由 permututationsNub 产生的排列是成对的，

但实际上，我间接地到达了该算法。置换自然适合关系环境（请查看答案的结尾以获取参考文献）。在下文中，我使用非正式的伪哈斯尔表示法来写有关关系。我用 a〜＆gt表示； b 从 a 到 b 的关系类型。我写 r（x，y） <代码> r 将 x 与 y 相关联。在定义关系时，我认为定义的关系是满足给定条件的最少的关系。我用 relfoldr （ relunfoldr ）对列表和 Converse （猜猜是什么）相反操作的关系性血统（变形）的操作。

因此，请考虑以下关系：

排列:: [a]〜＆gt; [a] ; 置换（x，y）如果 y 是 x
insertion :: lay（a，[a]）〜＆gt的置换; [a] ; 插入（Nothing，[]）和插入（Just（X，XS），XS'）如果 XS'插入的结果x xs
插入'与插入相同的某个地方 x xs 中的某个地方，但在发生它之后
” ＆gt;也许（a，[a]）; 选择（[]，nothing）和选择（xs，Just（x，xs'））如果 XS'删除出现的结果 x in xs
选择'与选择相同，但与“ XS'从删除 第一个 出现 x 在 xs 中的“

请注意 permuntion == Converse Demboint ， insertion == Converse Selection 和插入'== Converse Selection'。知道 relfoldr（Converse R）== Converse（RelunFoldr r），我大致进行了如下：

permutation ==
relfoldr insertion ==
converse (relunfoldr selection) ==
converse (relunfoldr selection') ==
relfoldr insertion'

已知第一个平等。由于置换的对称，我们具有置换== relunFoldr选择，这有助于我们证明 relunfoldr selection == relunfoldr选择'，因此第三平等。

现在，有了一些想象力，我们可以看到：

relunfoldr选择提供了经典的基于选择的算法
relunfoldr选择'提供了类似的基于选择的算法Create Replicates
relfoldr插入给出了经典的基于插入的算法（命名 permututations 上面）
relfoldr插入'给出了类似的基于插入的算法，该算法不会创建重复项（上面命名 permutationsnub 上面）

的一些优势，基于基于选择的基于选择的无限制算法是：

后者需要某种内存才能检查是否已经进行了选择，这使其效率较低，并且可能较低（取决于内存的类型，可能需要 ord ， hashable a 或 int 而不是 eq a ）
前者使用 foldr

在回答旧问题之前，我试图模拟 rel cpo 中的几件事，并在Haskell中实现它们，但我发现存在一些局限性和一个我需要的很多细节锻炼。我也有兴趣通过懒惰正确处理无穷大，但是我尝试过的一些技巧，例如对角/搭说，使用 nonepty 列表，我也不起作用，我也不确定适当的理论框架。如果我找到时间/能量/动机，我可能会在此领域进行一些时间重新访问该领域。

最后，这里有一些参考文献：

经典“ nofollow noreferrer”> bananas paper ）在 cpo 中。
该书“ nofollow noreferrer”>“ nofollow noreferrer”>代数专注于关系环境。在第130-131页中，它涉及排列并得出经典的基于插入的算法。
这本书 haskell的算法设计强调方程推理。排列在IT中具有独特的存在，并且均基于选择/插入的基本算法。
在讲义中，您可以找到一些相关事项的教学典范，“计划构建数学中的代数和煤炭方法”。第5章采用功能方法（< strong> set ， cpo ）而第8章采用关系。

This algorithm is just a modification of the classic insertion-based one, which you can find for example in Rosetta Code. To my surprise, despite its simplicity, I wasn't able to find it mentioned anywhere.

First of all, for the insertions, instead of using explicit recursion (considered harmful), I simulated, somewhat clumsily, a paramorphism for lists. Then I found that, if we stop producing insertions after finding an occurrence of the value to be inserted, we can prevent the creation of duplicates while at the same time producing every permutation. I hope the following code clarifies things:

parafoldr :: ((a, [a]) -> b -> b) -> b -> [a] -> b
parafoldr f y xs = foldr f y $ zip xs $ tail $ tails xs

insertions :: a -> [a] -> [[a]]
insertions y = parafoldr combine [[y]]
    where combine (x, xs) xss = (y : x : xs) : map (x :) xss
--or equivalently
insertions y xs = (y : xs) : (parafoldr combine [] xs)
    where combine (x, xs) xss = (x : y : xs) : map (x :) xss

permutations :: [a] -> [[a]]
permutations = foldr (concatMap . insertions) [[]]

insertions' :: Eq a => a -> [a] -> [[a]]
insertions' y = parafoldr combine [[y]]
    where combine (x, xs) xss = (y : x : xs) :
              if x == y then [] else map (x :) xss
--or equivalently
insertions' y xs = (y : xs) : (parafoldr combine [] xs)
    where combine (x, xs) xss
              | x == y    = []
              | otherwise = (x : y : xs) : map (x :) xss

permutationsNub :: Eq a => [a] -> [[a]]
permutationsNub = foldr (concatMap . insertions') [[]]

It's not too hard to show by induction that:

given the correctness of permutations, permutationsNub produces every permutation that permutations produces and, of course, no other lists
the permutations produced by permutationsNub are pairwise different

But actually, I arrived at this algorithm indirectly. Permutations fit naturally in a relational setting (look at the answer's end for references). In the following I use an informal pseudo-Haskell notation to write about relations. I denote by a ~> b the type of relations from a to b. I write r(x, y) when r relates x to y. When defining a relation, I suppose that the relation defined is the least one satisfying the given conditions. I denote by relfoldr (relunfoldr) the operation of relational catamorphism (anamorphism) for lists and by converse (guess what) the converse operation.

So consider the following relations:

permutation :: [a] ~> [a]; permutation(x, y) if y is a permutation of x
insertion :: Maybe (a, [a]) ~> [a]; insertion(Nothing, []) and insertion(Just (x, xs), xs') if xs' results from inserting x somewhere in xs
insertion' same as insertion but with "xs' results from inserting x somewhere in xs but not after an occurrence of it "
selection :: [a] ~> Maybe (a, [a]); selection([], Nothing) and selection(xs, Just (x, xs')) if xs' results from deleting an occurrence of x in xs
selection' same as selection but with "xs' results from deleting the first occurrence of x in xs"

Notice that permutation == converse permutation, insertion == converse selection and insertion' == converse selection'. Knowing that relfoldr (converse r) == converse (relunfoldr r), I proceeded roughly as follows:

permutation ==
relfoldr insertion ==
converse (relunfoldr selection) ==
converse (relunfoldr selection') ==
relfoldr insertion'

The first equality is known. Because of the symmetry of permutation we have permutation == relunfoldr selection, which helps us justify that relunfoldr selection == relunfoldr selection' and thus the third equality.

Now, with a bit of imagination, we can see that:

relunfoldr selection gives the classic selection-based algorithm
relunfoldr selection' gives a similar selection-based algorithm that doesn't create duplicates
relfoldr insertion gives the classic insertion-based algorithm (named permutations above)
relfoldr insertion' gives a similar insertion-based algorithm that doesn't create duplicates (named permutationsNub above)

Some advantages of the insertion-based no-duplicate algorithm over the selection-based one are that:

the latter needs some kind of memory to check if a selection has already been made, which makes it less efficient and possibly less general (depending on the memory's type, it may require Ord a, Hashable a or Int instead of just Eq a)
the former has a simple implementation using foldr

Before answering the old question, I made an attempt to simulate several things from REL in CPO and implement them in Haskell but I found out that there were some limitations and a lot of details that I needed to work out. I was also interested in properly handling infinity via laziness but some tricks I tried, like diagonalization/dovetailing and using NonEmpty lists, didn't work and I wasn't sure about the appropriate theoretical framework either. I might revisit this area some time, if I find the time/energy/motivation.

Finally, here are a few references:

The classic bananas paper treats recursion schemes (and more) in CPO.
The book Algebra of Programming focuses on the relational setting. In pages 130-131 it touches on permutations and derives the classic insertion-based algorithm.
The book Algorithm Design with Haskell is more practically-oriented but still gives an emphasis on equational reasoning. Permutations have a distinct presence in it and both selection/insertion-based basic algorithms are presented.
In the lecture notes "Algebraic and Coalgebraic Methods in the Mathematics of Program Construction" you can find some quite pedagogical expositions of related matters. Chapter 5 takes the functional approach (SET, CPO) while chapter 8 adopts the relational one.

回复收藏 0 原文

喜爱纠缠 2025-02-21 00:50:07

αfter几天，有些事情更有意义，我意识到我犯了两个错误。

第一个是关于和解值。无论出于何种原因，我都错误地认为连续值必须是连续的嵌套级别，例如：[b]：[[c]]，而仅适用于Cons Operator的最后一个值，例如A Cons Operator的最后一个值，：b：c：[d]。

第二是，由于与辅助图不熟悉，我误解了插入函数。让我们假设（例如，如我的问题中提到的话，我们运行permuntnub [2,7,7,8]”）主折叠/scanr的第二次通过，那里的ACC现在被[[7,8]，[8,7]]填充（因为Scanr的使用向我们告知了我们）。使用ConcatMap意味着插入将应用于该ACC列表中的每个参数，而不是同时，而不是同时将映射函数的结果串联。因此，对于该特定的ACC内容，插入将两次执行，第一次将作为插入7 [7,8] ，第二次为 insert 7 [8,7] 。在任何情况下都不是一旦插入7 [[7,8]，[8,7]] 。

这导致我再次改革整个功能

permutationsNub :: Eq a => [a] -> [[[a]]]
permutationsNub digits = scanr (concatMap . insert) [[]] digits 
    where insert digit permutationN = foldr combine [[digit]] $ (zip <*> tail . tails) permutationN 
              where combine (x, xs) accOfInsert = (digit : x : xs) : -- (x, xs) of permutationN e.g [7,[8]]
                      if digit == x then [] else map (x:) accOfInsert

之后，事情变得更加好。对于插入7 [7,8] zipping零件将产生 [（7，[8]），（8，[]）] 。之后，插入自己的foldr函数将一一选择这些元组，并在每个元组上调用 compine 。

例如，在同一执行过程中，联合收割机也将两次称为 combine（8，[]）[[7]）[[7]] ，并且一次为 combine（7，[[[[7，[ 8]）[[7,8,7]，[8,7]] （[[[7,8,7]，[8,7]]是第一个组合呼叫的结果，因此现在ACC在foldr循环中的值），它将导致[[7,7,8]。现在清楚地表明 digit 和 x 现在可以与 ints 相当，如果它们相同，则当前元素的尾巴置换将被忽略（被[]取代，在结束的第二次通过中发生。

同样，对于插入7 [8,7] [（8，[7]），（7，[[]）]，将联合收割称为 combine（7，[]）[[7]] and combine（8，8， [7]）[[7,7] ，这将导致[[7,8,7]，[8,7,7]

。 [[[7,7,8]，[[7,8,7]，[8,7,7]]将被串联为[[7,7,8]，[7,8,7]，[7,8,7]， ]。

[ 8,7,7 这算法是一种出色的方法，当然我无法提出自己的意思。置换生成代码。不过，仍然还没有完全掌握它。也许我错过了一些必要的理论背景ATM。哦，好吧...

我希望这对某人的情况有所帮助。不用说，对任何其他反馈/评论都得到了高度赞赏。

干杯!

Αfter a couple of days, some things make more sense and I realise I've made two mistakes.

The first is about cons-ing values. I have mistakenly, for whatever reason, thought that consecutive values have to be of successive levels of nesting, like a:[b]:[[c]] while that only applies to the last value in the chain of cons operators, like a:b:c:[d].

The second is that I misunderstood, due to unfamiliarity with concatMap, how the insert function is called. Let's assume for example that (if as mentioned in my question we run permutationNub [2,7,7,8]") we' re at the end of the
second pass of the main foldr/scanr, where acc is now populated with [[7,8],[8,7]] (as the use of scanr has kindly informed us). The use of concatMap means that insert will be applied to every argument in that acc list, one by one, not at the same time, and the result of that mapping function will be concatenated. So for that particular acc content, insert will be executed twice, the first time as insert 7 [7,8] and the second as insert 7 [8,7]. Not in any case once insert 7 [[7,8],[8,7]].

Which leads me to once again to reform the whole function as such

permutationsNub :: Eq a => [a] -> [[[a]]]
permutationsNub digits = scanr (concatMap . insert) [[]] digits 
    where insert digit permutationN = foldr combine [[digit]] $ (zip <*> tail . tails) permutationN 
              where combine (x, xs) accOfInsert = (digit : x : xs) : -- (x, xs) of permutationN e.g [7,[8]]
                      if digit == x then [] else map (x:) accOfInsert

After that, things fall into place more nicely. For insert 7 [7,8] the zipping part will produce [(7,[8]),(8,[])]. Afterwards, insert's own foldr function will pick those tuples one by one and call combine on each of them.

For example, in the course of the same execution, combine will also be called twice, once as combine (8,[]) [[7]] and once as combine (7,[8]) [[7,8,7],[8,7]] ([[7,8,7],[8,7]] being the result of the first combine call, hence now acc value in the foldr loop), which will result in [[7,7,8]] . It is now crystal clear that digit and x are now comparable as Ints, and if they are the same, the tail of elements in the current permutation will be ignored (replaced by [], consed at the end, as it happens in the second pass of combine.

Similarly, for insert 7 [8,7] zipping will produce tuples [(8,[7]),(7,[])], and combine will be called as combine (7,[]) [[7]] and combine (8,[7]) [[7,7]], which will result in [[7,8,7],[8,7,7]].

Lastly the results of the mapping function, [[[7,7,8]],[[7,8,7],[8,7,7]]] will be concatenated into [[7,7,8],[7,8,7],[8,7,7]]. And the last loop with digit=2 will begin. Phew!

I can sort of see now what peter_pun means by "inserting the digit in every possible position of the permutation" etc, though I still consider the algorithm a brilliant approach, certainly something I couldn't have come up with myself. I became determined to understand it before using it, because I though it was somehow important, no less because it is at least an order of magnitude faster than my own permutation generation code. Still haven't fully grasped it, though. Maybe I'm missing some necessary theoritical background atm. Oh well...

I hope this helps someone in a similar situation. It also goes without saying that any additional feedback/comment is highly appreciated.

Cheers!

回复收藏 0 原文

~没有更多了~