将带有break-s/continue-s的命令式控制流转换为haskell
考虑下面的命令式代码,它在 3 位数字的乘积中找到最大的回文(是的,这是“[18 世纪杰出数学家]项目”网站的第一个任务):
curmax = 0
for i in range(999,100):
for j in range(999,100):
if ((i*j) < curmax): break
if (pal(i*j)):
curmax = i*j
break
print curmax
当我目前正在学习 Haskell 时,我的问题是,你如何将这个(以及基本上任何包含比简单迭代更复杂的内容的命令式构造,例如中断、继续、临时变量等等)转换为Haskell?
我的版本是
maxpal i curmax
| i < 100 = curmax
| otherwise = maxpal (i-1) (innerloop 999)
where
innerloop j
| (j < 100) || (p < curmax) = curmax
| pal p = p
| otherwise = innerloop (j-1)
where p = i*j
main = print $ maxpal 999 0
,但这看起来我们仍然处于势在必行的丑陋镇。
那么您有什么建议,处理此类 FP 式案例的方法是什么?
Consider the following imperative code which finds the largest palindrome among products of 3-digit numbers (yes, it's the one of the first tasks from "Project of [outstanding mathematician of 18th century]" site):
curmax = 0
for i in range(999,100):
for j in range(999,100):
if ((i*j) < curmax): break
if (pal(i*j)):
curmax = i*j
break
print curmax
As I'm learning Haskell currently, my question is, how do you translate this (and basically any imperative construct that contains something more complex than just plain iteration, e.g. breaks, continues, temporary variables and all this) to Haskell?
My version is
maxpal i curmax
| i < 100 = curmax
| otherwise = maxpal (i-1) (innerloop 999)
where
innerloop j
| (j < 100) || (p < curmax) = curmax
| pal p = p
| otherwise = innerloop (j-1)
where p = i*j
main = print $ maxpal 999 0
but this looks like we're still in imperative uglytown.
So what could you advise, what are the approaches of dealing with such cases FP-style?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(9)
与 Daniel 和 sepp2k 的答案类似:
惰性函数式编程允许您以比在命令式控制流(如问题中的方式)中看到的更加模块化的方式编写程序。
例如,形成因子列表 999...100,然后是所有乘积,然后过滤以仅保留回文,然后计算最大值。
由于惰性,这些中间列表将仅在需要时产生,并且将被增量回收。
有关更多解释和示例,请参阅 John Hughes 的经典论文为什么函数式编程很重要。
Similar answer to Daniel's and sepp2k's:
Lazy functional programming lets you write programs in a much more modular way than you see in imperative control flow like the one in your question.
For instance, form the list of factors 999...100, then all products, then filter to retain only the palindromes, and then compute the maximum.
Thanks to laziness, these intermediate lists will come into being only as needed and will be incrementally recycled.
For more explanation and examples, see John Hughes's classic paper Why Functional Programming Matters.
如果我们取消所有优化,只将 100 到 999 之间的所有数字组合相乘,过滤掉非回文并取其中的最大值,我们可以非常简洁地将函数编写为:
当然,这基本上是效率最低的这样做的方法,但由于数字相对较小,因此在我的机器上仍然可以在半秒内完成。
但是,如果我们想要在算法上更符合您的 python 解决方案,我们可以这样做:
这里的外部循环与您的解决方案中的基本相同,但我们使用列表函数替换了内部循环。
我们使用
map (*i) [999, 998, ...]
为每个j
计数创建乘积i*j
从999
下降。使用takeWhile
表示,一旦某个值不大于curmax
,列表就应该停止。然后我们使用
find
来查看该列表中的任何项目是否是回文。如果是,列表中的第一个回文就是我们的新最大值。如果不是,我们将保留旧的最大值。 (find
返回一个Maybe
,fromMaybe
接受默认值和一个Maybe
并返回来自>Maybe
或默认值(如果Maybe
中没有值)If we do away with all the optimization and just multiply all combinations of numbers between 100 and 999, filter out the non-palindromes and take the maximum of that, we can write the function very concisely as:
Of course this is basically the least efficient way to do it, but since the numbers are relatively small, this still finishes in under half a second on my machine.
However if we want something that is more along the lines of your python solution algorithmically, we can do it like this:
Here the outer loop is basically the same as in your solution, but we replaced the inner loop using list functions instead.
We're using
map (*i) [999, 998, ...]
to create the producti*j
for everyj
counting down from999
. UsingtakeWhile
we're saying that the list should stop once a value is not greater thancurmax
.Then we're using
find
to see whether any item in that list is a palindrome. If it is, the first palindrome in the list is our new max. If it isn't we keep our old max. (find
returns aMaybe
andfromMaybe
takes a default value and aMaybe
and returns the value from theMaybe
or the default value if there is no value in theMaybe
)这里没有一刀切的答案。但让我们看一下这个具体的例子:
首先,考虑外循环:我们总是做整个范围,我们只关心最终的最大值,所以这很简单:
在内循环中,我们有一些 i 值和当前值最大限度。现在我们只关心 i*j 大于当前最大值的范围:
在核心逻辑中,我们得到 i*j 的一个值,我们知道它总是大于或等于当前最大值,所以所需要的就是检查下一个值是否是回文:如果是,我们就完成了,因为序列减少了。如果没有,请推迟决定:
There's no one-size-fits-all answer here. But let's walk through this specific example:
First, consider the outer loop: We always do the full range, and we only care about the final maximum, so that's easy enough:
In the inner loop, we have some value of i and a current maximum. Now we only care about the range where i*j is larger than the current maximum:
In the core logic we get a value for i*j which we know will always be greater then or equal to the current maximum, so all that's necessary is to check the next value to see if it's a palindrome: If so, we're done, because the sequence decreases. If not, defer the decision:
在我看来,范围对应于列表。例如:
现在
f
被定义为从 999 到 100 的数字序列。for
循环对应于不同的函数概念,具体取决于您在每次迭代中执行的操作。有时地图
是适当的模拟,有时是折叠
,有时是其他东西。很多时候它是多种因素的结合。在这种情况下,您可以有效地组合两个列表。在 Haskell 中实现此目的的一种方法是列表理解:这里 g 表示先前定义的序列的每个元素与其自身相结合的乘积的列表。换句话说,几乎就是
for
循环中发生的事情。从这里,您可能想要 过滤结果序列以仅包含回文值,然后计算该集合中的最大值。
In my mind a range corresponds to a list. For example:
Now
f
is defined as the sequence of numbers from 999 to 100.for
loops correspond to different functional concepts, depending on what you're doing in each iteration. Sometimes amap
is the appropriate analog, sometimes afold
, sometimes something else. Often times it's a combination of things. In this case, you're effectively combining two lists. One way to do that in Haskell is a list comprehension:Here
g
represents a list of the product of each element of the previously defined sequence combined with itself. In other words, pretty much what you've got going on within yourfor
loop.From here, you'd probably want to filter the resulting sequence to contain only values that are palindromes, and then calculate the maximum value from that set.
因此,从功能角度思考,您应该寻找方法将问题分解为函数,而不是循环和步骤。
因此,如果我们有一个函数
maxWhere f xs
,它返回x
且f x
为 true,我们可以这样写: maxWhere 是,
但是如果
f
比比较更昂贵,那么这很糟糕,因为我们将比原来调用更多的 f 。我们可以使用 Fold 将过滤器和最大值合并到一次传递中,并获得与命令式代码相同的行为。在这里使用零作为一个神奇的小数字是可怕的,但在这种情况下是有效的。
(我真的想拼写候选数字列表
(*) <$> [999,998..100] <*> [999,998..100]
,但这可能会引入一个这里不必要的复杂化。)So, thinking functionally you should be looking at ways to break your problem not into loops and steps but into functions.
So, if we had a function
maxWhere f xs
which returned the largestx
for whichf x
is true, we could write:A naive implementation of maxWhere is
but this is bad if
f
is more expensive than comparison as we'll be making more calls to f than in the original. We can use fold to combine the filter and the maximum into a single pass and get the same behaviour as the imperative code.The use of zero as a magic small number here is horrible, but works in that case.
(I really want to spell that list of candidate numbers
(*) <$> [999,998..100] <*> [999,998..100]
, but that may be introducing an unnecessary complication here.)嘎。被 sepp2k 击败,但我会回答你的一般问题:
临时变量也可以使用状态 monad 来表达,或者如果你有很多临时变量,则可以使用 ST monad 来表达。 FP 通常在简洁性和清晰度方面获胜,但在某些情况下却并非如此,例如,当需要处理多个局部变量时。
惰性可以模拟很多中断,但是在处理 IO 时,通常必须使用显式递归。然而,“List”包(来自 Hackage)相当聪明,允许您以函数式风格编写 IO 循环。
Gah. Beaten by sepp2k, but I'll answer your general question:
Temporary variables can also be expressed using the state monad, or ST monad if you have a lot of them. FP often wins in succintness and clarity, but in some cases it doesn't, e.g. when there are several local variables to juggle.
Laziness can emulate many breaks, but when dealing with IO, you generally have to use explicit recursion. However, the 'List' package (from Hackage) is rather clever at allowing you to write IO loops in a functional style.
这种循环很容易实现列表理解,如下所示:
我们可以像这样编写 isPalindrome :
这确实足够快,尽管有点不聪明,所以首先我们会注意到我们正在检查数字两次。假设 a*b 是最大的回文,那么我们将检查 x == a, y==b 和 x==b, y==a 的情况代码>.因此,首先我们通过将搜索的数字限制为仅 x >= y 的情况来阻止这种情况,如下所示:
这将测试的数字减少了一半。
在你的Python解决方案中,你还将下面的y限制为我们迄今为止发现的最大数字除以当前的x(
x*y => curmax
),而且你永远不会搜索超出找到的第一个y(如果更新 curmax,则中断内循环)。如果我们检查的第一个元素(x 平方)小于当前答案,我们可以通过不继续来进一步削减搜索,因为所有后续检查都较小,但这超出了列表理解中看起来不错的范围,因此我们将搜索移至它有自己的功能:值得注意的是我们的限制
(x*x)
(x*x)
(x*x)
(x*x)
(x*x) < curr
,实际上只是意味着从现在开始,[x*x,(x-1)*x..curr]
将是空的。正如您所看到的,通过 Python 代码中的中断强制执行的所有边界都适合 x 上的一次迭代(使用递归)和 x*y 值列表上的查找。它可能看起来不太好,但在我看来,它更明确地说明了我们对 x 和 y 所做的限制。运行它我们得到:
事实证明,当
x * x
x * x
时停止curr
是一个非常好的主意,因为 906609 的平方根是 952...this kind of loop lends itself easily to a list comprehension, like this:
Where we might write isPalindrome like this:
This is really fast enough, although sort of unsmartypantsy, so first off we will notice that we are checking the numbers twice. Let's say a*b is the biggest palindrome, then we will check both the case where
x == a, y==b
, andx==b, y==a
. So first we stop this by restricting the numbers we search through to only the cases where x >= y, like this:This cuts the numbers to test in half.
In your python solution you also bound y below by the biggest number we have found so far divided by the current x (
x*y => curmax
), also you never search beyond the first y found (breaking the inner loop if curmax is updated). We may cut the search further by not continuing if the first element we check (x squared) is less then our current answer, since all subsequent checks are smaller, but this is beyond what looks good in a list comprehension so we move our search into it's own function:It's worth noticing how our limitation,
(x*x) < curr
, really just means that from now on,[x*x,(x-1)*x..curr]
is going to be empty. As you can see, all of the bounds that were enforced by breaks in your python code fits inside one iteration on x (using recursion) and a find on a list of x*y values. It might not look nicer, but it seems to me to state more explicitly the restrictions we make on x and y.Running it we get:
Turns out that stopping when
x * x < curr
is a really good idea since the square root of 906609 is 952...正如 stephen tetley 在他的评论中指出的,在 FP 中,您可以使用连续传递样式来处理复杂的控制流(
Cont
monad 加上它的callCC
在某种程度上类似于break
...甚至goto
- 滥用 CPS 可能会导致相当难以理解的代码 - 请参阅下面的示例):两个 mfoldM(只是一个标准的 FoldM)其参数重新排列)对应于原始样本中的两个循环,并且在“内部循环”中使用
break
函数参数来在违反(i*j > 当前最大值)条件时退出它(返回由于“内循环”而导致的当前最大值)。这里我们只需要逃离一个“循环级别”,所以这里的 callCC 绝对是大材小用。相同的逻辑也可以用
find
实现(+ Haskell 的惰性):find pal
这里返回第一个回文数(它也满足 (>m) 条件)在 takeWhile 中) 或 Nothing (MonadPlus 的零) 以及在mplus
(或 Alternatice.<|>) 之后it
有效地返回一个新的最大回文或之前的最大值 (返回米)。由于一旦找到第一个满足要求的元素,find
就会停止搜索,因此该代码的行为与其命令式curmax
模拟完全相同。两个版本都在 0.5 秒内运行 [99999..10000] 范围。
更新:
只是为了好玩:同样的方法,但使用 StateT Integer (Cont Integer) () - Cont 来逃离“循环”,并使用 State 来传递最大回文数(加上使用 forM_< /code> 和
何时
)。相同效率:As noted by stephen tetley in his comment, in FP you can use continuation passing style to handle complex control flow (
Cont
monad plus itscallCC
which is somehow similar tobreak
. ...or evengoto
- abuse of CPS can lead to rather incomprehensible code - see my example below):Two mfoldM's (just a standard foldM with its arguments rearranged) correspond to two loops in the original sample and
break
function-argument is used in "inner loop" to exit it once (i*j > current max) condition is violated (returning the current max as a result of that "inner loop"). Here we need to escape from just one "loop level", so callCC here is definitely overkill.The same logic can also be implementing with
find
(+ laziness of Haskell):find pal
here returns either the first palindrome number (which will also satisfy (>m) condition in takeWhile) or Nothing (zero of MonadPlus) and aftermplus
(or Alternatice.<|>)it
effectively returns either a new maximum palindrome or the previous max (return m). Sincefind
stops searching once the first satisfying element is found, this code behave exactly as its imperativecurmax
analog.Both version run for [99999..10000] range in 0.5 second.
Update:
Just for fun: same approach but using
StateT Integer (Cont Integer) ()
- Cont to escape from the "loop" and State to pass max palindrome around (plus an ability to useforM_
andwhen
). Same efficiency:我认为你可以使用两个相互递归函数来做你想做的事情。
这是一个更简单的示例(取自ATS 教程):
上面编写的代码非常类似于您用 C 编写的代码(取自同一页面):
int main (int argc, char *argv[]) {
整数 i,j ;
正如
你所看到的,嵌套循环变成了相互递归的函数;可变变量 i 和 j 成为归纳变量。 Loop1对应于外循环,而loop2对应于内循环。
I think you can do what you want using two mutually recursive functions.
Here's a much simpler example (taken from a tutorial on ATS):
The code written above is very much like what you'd write in C (taken from that same page):
int main (int argc, char *argv[]) {
int i, j ;
}
As you see, nested loops become mutually recursive functions; and mutable variables i and j become induction variables. loop1 corresponds to the outer loop, whereas loop2 to the inner loop.