Haskell、Monads、堆栈空间、惰性——如何构造惰性代码?

发布于 2024-11-23 15:16:14 字数 780 浏览 3 评论 0原文

这是一个人为的例子,但下面的代码演示了我在学习 Haskell 时不断遇到的一类问题。

import Control.Monad.Error
import Data.Char (isDigit)

countDigitsForList [] = return []
countDigitsForList (x:xs) = do
    q  <- countDigits x
    qs <- countDigitsForList xs
    return (q:qs)

countDigits x = do
    if all isDigit x
        then return $ length x
        else throwError $ "Bad number: " ++ x

t1 = countDigitsForList ["1", "23", "456", "7890"] :: Either String [Int]
t2 = countDigitsForList ["1", "23", "4S6", "7890"] :: Either String [Int]

t1 给了我正确的答案,t2 正确识别了错误。

在我看来,对于足够长的列表,此代码将耗尽堆栈空间,因为它在 monad 内部运行,并且在每个步骤中它都会尝试在返回结果之前处理列表的其余部分。

累加器和尾递归似乎可以解决问题,但我反复读到,由于惰性求值,这两者在 Haskell 中都是不必要的。

如何将这种代码构造成不会出现堆栈空间问题和/或懒惰的代码?

A contrived example, but the below code demonstrates a class of problems I keep running into while learning Haskell.

import Control.Monad.Error
import Data.Char (isDigit)

countDigitsForList [] = return []
countDigitsForList (x:xs) = do
    q  <- countDigits x
    qs <- countDigitsForList xs
    return (q:qs)

countDigits x = do
    if all isDigit x
        then return $ length x
        else throwError $ "Bad number: " ++ x

t1 = countDigitsForList ["1", "23", "456", "7890"] :: Either String [Int]
t2 = countDigitsForList ["1", "23", "4S6", "7890"] :: Either String [Int]

t1 gives me the right answer and t2 correctly identifies the error.

Seems to me that, for a sufficiently long list, this code is going to run out of stack space because it runs inside of a monad and at each step it tries to process the rest of the list before returning the result.

An accumulator and tail recursion seems like it may solve the problem but I repeatedly read that neither are necessary in Haskell because of lazy evaluation.

How do I structure this kind of code into one which won't have a stack space problem and/or be lazy?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

禾厶谷欠 2024-11-30 15:16:14

如何将这种代码构造成不会出现堆栈空间问题和/或懒惰的代码?

你不能让这个函数延迟处理列表,无论是否有 monad。这是 countDigitsForList 的直接翻译,使用模式匹配而不是 do 表示法:

countDigitsForList [] = return []
countDigitsForList (x:xs) = case countDigits x of
    Left e  -> Left e
    Right q -> case countDigitsForList xs of
                   Left e   -> Left e
                   Right qs -> Right (q:qs)

在这里应该更容易看到这一点,因为在任何时候都有一个 Left列表中的点使得整个列表返回该值,为了确定结果的最外层构造函数,必须遍历并处理整个列表;同样地处理每个元素。因为最终结果可能取决于最后一个字符串中的最后一个字符,所以编写的这个函数本质上是严格的,就像对数字列表求和一样。

鉴于此,要做的事情是确保函数足够严格,以避免构建巨大的未计算表达式。想要了解这方面的信息,最好从讨论 foldrfoldlfoldl' 之间的差异开始。

累加器和尾递归似乎可以解决问题,但我反复读到,由于惰性求值,这两者在 Haskell 中都是不必要的。

当您可以延迟生成、处理和使用列表时,两者都是不必要的;最简单的例子是map。对于不可能做到这一点的函数,严格评估的尾递归正是您想要的。

How do I structure this kind of code into one which won't have a stack space problem and/or be lazy?

You can't make this function process the list lazily, monads or no. Here's a direct translation of countDigitsForList to use pattern matching instead of do notation:

countDigitsForList [] = return []
countDigitsForList (x:xs) = case countDigits x of
    Left e  -> Left e
    Right q -> case countDigitsForList xs of
                   Left e   -> Left e
                   Right qs -> Right (q:qs)

It should be easier to see here that, because a Left at any point in the list makes the whole thing return that value, in order to determine the outermost constructor of the result, the entire list must be traversed and processed; likewise for processing each element. Because the final result potentially depends on the last character in the last string, this function as written is inherently strict, much like summing a list of numbers.

Given that, the thing to do is ensure that the function is strict enough to avoid building up a huge unevaluated expression. A good place to start for information on that is discussions on the difference between foldr, foldl and foldl'.

An accumulator and tail recursion seems like it may solve the problem but I repeatedly read that neither are necessary in Haskell because of lazy evaluation.

Both are unnecessary when you can instead generate, process, and consume a list lazily; the simplest example here being map. For a function where that's not possible, strictly-evaluated tail recursion is precisely what you want.

秋日私语 2024-11-30 15:16:14

camccann 是对的,函数本质上是严格的。但这并不意味着它不能在常量堆栈中运行!

countDigitsForList xss = go xss []
    where go (x:xs) acc = case countDigits x of
                             Left e -> Left e
                             Right q -> go xs (q:acc)
          go [] acc = reverse acc

这个累积参数版本是 camccann 代码的部分 cps 转换,我敢打赌,您也可以通过处理 cps 转换的任一 monad 来获得相同的结果。

编辑以考虑jwodder关于反向的更正。哎呀。正如 John L 指出的那样,隐式或显式差异列表也可以工作......

camccann is right that the function is inherently strict. But that doesn't mean that it can't run in constant stack!

countDigitsForList xss = go xss []
    where go (x:xs) acc = case countDigits x of
                             Left e -> Left e
                             Right q -> go xs (q:acc)
          go [] acc = reverse acc

This accumulating parameter version is a partial cps transform of camccann's code, and I bet that you could get the same result by working over a cps-transformed either monad as well.

Edited to take into account jwodder's correction regarding reverse. oops. As John L notes an implicit or explicit difference list would work as well...

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文