Haskell、Monads、堆栈空间、惰性——如何构造惰性代码?
这是一个人为的例子,但下面的代码演示了我在学习 Haskell 时不断遇到的一类问题。
import Control.Monad.Error
import Data.Char (isDigit)
countDigitsForList [] = return []
countDigitsForList (x:xs) = do
q <- countDigits x
qs <- countDigitsForList xs
return (q:qs)
countDigits x = do
if all isDigit x
then return $ length x
else throwError $ "Bad number: " ++ x
t1 = countDigitsForList ["1", "23", "456", "7890"] :: Either String [Int]
t2 = countDigitsForList ["1", "23", "4S6", "7890"] :: Either String [Int]
t1
给了我正确的答案,t2
正确识别了错误。
在我看来,对于足够长的列表,此代码将耗尽堆栈空间,因为它在 monad 内部运行,并且在每个步骤中它都会尝试在返回结果之前处理列表的其余部分。
累加器和尾递归似乎可以解决问题,但我反复读到,由于惰性求值,这两者在 Haskell 中都是不必要的。
如何将这种代码构造成不会出现堆栈空间问题和/或懒惰的代码?
A contrived example, but the below code demonstrates a class of problems I keep running into while learning Haskell.
import Control.Monad.Error
import Data.Char (isDigit)
countDigitsForList [] = return []
countDigitsForList (x:xs) = do
q <- countDigits x
qs <- countDigitsForList xs
return (q:qs)
countDigits x = do
if all isDigit x
then return $ length x
else throwError $ "Bad number: " ++ x
t1 = countDigitsForList ["1", "23", "456", "7890"] :: Either String [Int]
t2 = countDigitsForList ["1", "23", "4S6", "7890"] :: Either String [Int]
t1
gives me the right answer and t2
correctly identifies the error.
Seems to me that, for a sufficiently long list, this code is going to run out of stack space because it runs inside of a monad and at each step it tries to process the rest of the list before returning the result.
An accumulator and tail recursion seems like it may solve the problem but I repeatedly read that neither are necessary in Haskell because of lazy evaluation.
How do I structure this kind of code into one which won't have a stack space problem and/or be lazy?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
你不能让这个函数延迟处理列表,无论是否有 monad。这是
countDigitsForList
的直接翻译,使用模式匹配而不是do
表示法:在这里应该更容易看到这一点,因为在任何时候都有一个
Left
列表中的点使得整个列表返回该值,为了确定结果的最外层构造函数,必须遍历并处理整个列表;同样地处理每个元素。因为最终结果可能取决于最后一个字符串中的最后一个字符,所以编写的这个函数本质上是严格的,就像对数字列表求和一样。鉴于此,要做的事情是确保函数足够严格,以避免构建巨大的未计算表达式。想要了解这方面的信息,最好从讨论
foldr
、foldl
和foldl'
之间的差异开始。当您可以延迟生成、处理和使用列表时,两者都是不必要的;最简单的例子是
map
。对于不可能做到这一点的函数,严格评估的尾递归正是您想要的。You can't make this function process the list lazily, monads or no. Here's a direct translation of
countDigitsForList
to use pattern matching instead ofdo
notation:It should be easier to see here that, because a
Left
at any point in the list makes the whole thing return that value, in order to determine the outermost constructor of the result, the entire list must be traversed and processed; likewise for processing each element. Because the final result potentially depends on the last character in the last string, this function as written is inherently strict, much like summing a list of numbers.Given that, the thing to do is ensure that the function is strict enough to avoid building up a huge unevaluated expression. A good place to start for information on that is discussions on the difference between
foldr
,foldl
andfoldl'
.Both are unnecessary when you can instead generate, process, and consume a list lazily; the simplest example here being
map
. For a function where that's not possible, strictly-evaluated tail recursion is precisely what you want.camccann 是对的,函数本质上是严格的。但这并不意味着它不能在常量堆栈中运行!
这个累积参数版本是 camccann 代码的部分 cps 转换,我敢打赌,您也可以通过处理 cps 转换的任一 monad 来获得相同的结果。
编辑以考虑jwodder关于反向的更正。哎呀。正如 John L 指出的那样,隐式或显式差异列表也可以工作......
camccann is right that the function is inherently strict. But that doesn't mean that it can't run in constant stack!
This accumulating parameter version is a partial cps transform of camccann's code, and I bet that you could get the same result by working over a cps-transformed either monad as well.
Edited to take into account jwodder's correction regarding reverse. oops. As John L notes an implicit or explicit difference list would work as well...