Haskell：如何在 State monad 之上编写交互式解释器？

发布于 2024-09-08 20:48:52 字数 1929 浏览 15 评论 0原文

我们正在开发一个在内部使用状态 monad 的模型文件系统。我们有一个具有如下操作的类型类：

class Monad m => FS m where
  isDirectory  :: Path -> m Bool
  children     :: Path -> m [Path]
  ...

我们正在开发一个小型交互式解释器，它将提供诸如 cd、ls、cat 之类的命令，等等。解释器中的操作可以这样编写：

fsop :: FS m => Operation -> m Response

Operation 和 Response 的定义并不重要；如果你愿意，可以将它们视为字符串。

我试图解决的问题是在 I/O monad 中编写一个顶级循环来解释文件系统操作并打印响应。如果 IO 是 FS 的一个实例（也就是说，如果我们直接使用 IO monad），生活就会很简单：我们可以写

loop :: Path -> IO ()
loop currentDir = do
        op <- getLine
        case read op of
          ChangeDir d -> loop d -- should test 'isDirectory d', but let's not
          Ls -> do { files <- children currentDir
                   ; mapM_ putStrLn files
                   ; loop currentDir }
          Exit -> return ()

但这不是我想要的。我想使用 Control.Monad.State:

newtype Filesystem a = Filesystem (State (Data.Map.Map Path Contents) a)

并声明

instance Monad Filesystem ...
instance FS Filesystem ...

使用 FS 抽象，我可以编写一个应该适用于任何实例的单步函数，实际上以下代码编译：

step :: FS fs => Path -> Operation -> fs (Path, Response)
step currentDir op = 
        case op of
          ChangeDir d -> return (d, "")
          Ls -> do { files <- children currentDir
                   ; return (currentDir, unlines files) }

此时我完全陷入困境。我想要做的是在 IO monad 中编写一个交互式循环，它可以读取操作并打印 Response ，但它适用于不一定是 IO 的状态单子。（拥有不在 IO 中的模型的原因之一是，这样我们就可以测试 QuickCheck 属性。）

我觉得这必须是一个标准问题——在有状态抽象之上的交互式读取-评估-打印循环是不是 IO——但我一定错过了一些非常明显的东西，因为我似乎无法弄清楚。我在网上查了一下，但没有得到启发。

任何帮助编写可以调用 step 的交互式 IO 执行计算的帮助将不胜感激。

原文

We're working on a model filesystem that uses a state monad internally. We have a type class with operations like these:

class Monad m => FS m where
  isDirectory  :: Path -> m Bool
  children     :: Path -> m [Path]
  ...

We're working on a little interactive interpreter that will offer commands like cd, ls, cat, and so on. An operation in the interpreter can be written this way:

fsop :: FS m => Operation -> m Response

The definitions of Operation and Response aren't important; if you like, take them to be strings.

The problem I am trying to solve is to write a top-level loop in the I/O monad that interprets filesystem Operations and prints responses. If IO were an instance of FS (that is, if we were working directly with the IO monad), life would be simple: we could write

loop :: Path -> IO ()
loop currentDir = do
        op <- getLine
        case read op of
          ChangeDir d -> loop d -- should test 'isDirectory d', but let's not
          Ls -> do { files <- children currentDir
                   ; mapM_ putStrLn files
                   ; loop currentDir }
          Exit -> return ()

But that's not what I want. I want to use Control.Monad.State:

newtype Filesystem a = Filesystem (State (Data.Map.Map Path Contents) a)

and to declare

instance Monad Filesystem ...
instance FS Filesystem ...

Using the FS abstraction, I can write a single-step function that should work with any instance, and indeed the following code compiles:

step :: FS fs => Path -> Operation -> fs (Path, Response)
step currentDir op = 
        case op of
          ChangeDir d -> return (d, "")
          Ls -> do { files <- children currentDir
                   ; return (currentDir, unlines files) }

At this point I am totally stuck. What I want to do is write an interactive loop in the IO monad, which can read Operations and print Responses, but which works on a state monad that is not necessarily IO. (One of the reasons for having a model that is not in IO is that so we can test QuickCheck properties.)

I feel like this has to be a standard problem—an interactive read-eval-print loop on top of a stateful abstraction that is not IO—but I must be missing something breathtakingly obvious because I can't seem to figure it out. I've looked online but have not been enlightened.

Any help writing an interactive, IO-performing computation that can call step would be greatly appreciated.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

黯淡〆 2024-09-15 20:48:52

使用 monad 转换器怎么样？它们或多或少是组合单子的标准方式。这是一个简单的示例：

type Foo a = StateT String IO a

replT :: Foo ()
replT = do
  str   <- liftIO getLine
  state <- get
  liftIO $ putStrLn ("current state: " ++ state)
  liftIO $ putStrLn ("setting state: " ++ str)
  put str
  replT

下面是从 ghci 内运行 replT 的结果。

*Main> runStateT replT "Initial state"
asd
current state: Initial state
setting state: asd
zxc
current state: asd
setting state: zxc
asdasd

共有三个 monad 转换器库。 mtl、transformers 和 monadLib。我不能推荐其中任何一个，因为我不经常使用它们。

What about using monad transformers? They are more or less standard way to combine monads. Here an simplistic example:

type Foo a = StateT String IO a

replT :: Foo ()
replT = do
  str   <- liftIO getLine
  state <- get
  liftIO $ putStrLn ("current state: " ++ state)
  liftIO $ putStrLn ("setting state: " ++ str)
  put str
  replT

Below are results of running replT from within ghci.

*Main> runStateT replT "Initial state"
asd
current state: Initial state
setting state: asd
zxc
current state: asd
setting state: zxc
asdasd

There are three monad transformers libs. mtl, transformers and monadLib. I cannot recommend any of them since I don't use them much.

回复收藏 0 原文

你丑哭了我 2024-09-15 20:48:52

免责声明：我不能保证以下内容是解决问题的好方法，但完成它听起来很有趣。让我们来试一试，好吗？

一些强制导入

首先，让我们讨论一些数据类型。我将填写一些细节并进行一些调整，以便定义一个我们可以实际交互的简单“文件系统”。

type Path = String
type Response = Maybe String
type Contents = [String]

data Operation = Cd Path 
               | Ls 
               | MkDir Path
               | Quit
    deriving (Read, Show)

接下来，我们将做一些有点急躁的事情......去掉所有的单子。什么？这太疯狂了！也许吧，但有时 >>= 提供的所有隐藏管道隐藏的东西有点太多。

对于文件系统本身，我们将仅存储当前工作目录和从路径到其子目录的映射。我们还需要一些函数来与其交互。

data Filesystem = Filesystem { wd :: Path, files :: M.Map Path Contents }
    deriving Show

newFS = Filesystem "/" (M.singleton "/" [])

isDirectory p fs = M.member p $ files fs
children p fs = fromMaybe [] . M.lookup p $ files fs
cd p fs = fs { wd = p }
create p fs = let newPath = wd fs ++ p ++ "/"
                  addPath = M.insert newPath [] . M.adjust (p:) (wd fs)
              in (newPath, fs { files = addPath $ files fs })

现在介绍 step 函数的无 monad 版本。它需要接受 Operation 和 Filesystem，并返回 Response 和（可能已修改）Filesystem：

step :: Operation -> Filesystem -> (Response, Filesystem)
step (Cd d) fs = (Just "Ok\n", cd d fs)
step (MkDir d) fs = first (\d -> Just $ "Created " ++ d ++ "\n") $ create d fs
step Ls fs = let files = children (wd fs) fs
             in (Just $ unlines files, fs)
step Quit fs = (Nothing, fs)

...嗯，该类型签名看起来已经很像 State monad 的内部了。算了，暂时不管它，继续盲目冲锋吧。

现在，我们想要的是一个为文件系统解释器提供通用接口的函数。特别是，我们希望接口至少在某种程度上是自包含的，以便无论使用该接口的内容都不必手动单步执行，但我们希望接口能够充分忽略使用它的代码我们可以将其连接到 IO monad、其他一些 Monad，甚至根本没有 monad。

这主要告诉我们的是，我们需要以某种方式将外部代码与解释器交错，而不是让任何一部分都受到控制。现在，Haskell 是一种函数式语言，所以这意味着使用大量高阶函数是好的，对吗？对我来说听起来似乎有道理，所以这是我们将使用的策略：如果一个函数不知道下一步要做什么，我们将交给它另一个我们假设的函数。重复，直到每个人都知道该做什么正在进行中。一个完美的计划，不是吗？

这一切的核心是 step 函数，因此我们将从调用它开始。

interp1 :: Operation -> Filesystem -> (Response, Filesystem)
interp1 op fs = step op fs

……好吧，这是一个开始。我猜。但是等等，Operation 来自哪里？我们需要外部代码来提供这一点，但我们不能只是要求它而不与像IO这样令人讨厌的字符混合在一起。所以我们得到了另一个函数来为我们做这些肮脏的工作：

interp2 :: ((Operation -> (Response, Filesystem)) -> t) -> Filesystem -> t
interp2 inp fs = inp (\op -> step op fs)

当然，现在我们所拥有的只是一些愚蠢的t，我们甚至不知道它是什么。我们知道它必须在某个地方有一个Response和一个Filesystem，但我们不能用它做任何事情，所以我们将其交还给另一个函数，以及一些有关如何继续的说明......这当然会涉及传递更多函数。你知道，它的功能一直在下降。

interp3 :: ((Operation -> (Response, Filesystem)) -> a)
           -> (a -> ((Response, Filesystem) -> b) -> c)
           -> (Filesystem -> b)
           -> (String -> Filesystem -> b) 
           -> Filesystem 
           -> c
interp3 inp check done out fs = check (inp (\op -> step op fs)) test
    where test (Nothing, fs) = done fs
          test (Just s, fs)  = out s fs

……好吧，这太难看了。但别担心，一切都按计划进行。接下来我们可以做一些观察：

类型 a 只存在于 inp 和 check 之间，所以事后看来，我们不妨将它们组合起来时间并将组合函数传递给解释器。
当我们调用done时，它的含义应该与罐头上的内容完全一致。因此，done 的返回类型应该与整个解释器相同，这意味着 b 和 c 应该是相同的类型。

现在，如果done结束了整个事情，那么out会发生什么？顾名思义，它向外部代码提供输出，但是之后它去哪里呢？它需要以某种方式循环回解释器，我们可能会注意到我们的解释器还不是递归的。前进的道路是明确的——解释者就像耶梦加德一样，抓住了自己的尾巴；无限循环直到解释结束（或者直到诸神黄昏，以先到者为准）。

interp4 :: ((Operation -> (Response, Filesystem)) 
               -> ((Response, Filesystem) -> r) -> r)
           -> (Filesystem -> r)
           -> (String -> Filesystem -> (Filesystem -> r) -> r)
           -> Filesystem
           -> r
interp4 checkInp done out fs = checkInp (\op -> step op fs) test
    where loop = interp4 checkInp done out
          test (Nothing, fs) = done fs
          test (Just s, fs)  = out s fs loop

...哦，我有没有提到它现在可以工作了？不，说真的！

下面是一些使用该接口的 IO 代码：

ioIn f k = putStr "> " >> (k . f =<< readLn)
ioDone fs = putStrLn "Done" >> return fs 
ioOut x fs k = putStr x >> k fs

ioInterp :: IO Filesystem
ioInterp = interp4 ioIn ioDone ioOut newFS

下面是运行命令列表、生成输出字符串列表的代码：

scriptIn f k (x:xs) = k (f x) xs
scriptDone fs xs = ["Done\n"]
scriptOut r fs k xs = r : k fs xs

scriptInterp :: [Operation] -> [String]
scriptInterp = interp4 scriptIn scriptDone scriptOut newFS

在此处的 GHCi 中运行，如果代码还不足以激发您的想象力。

嗯，就是这样。或者是吗？坦率地说，这个解释器是只有母亲才会喜欢的代码。有什么东西可以将它们优雅地结合在一起吗？有什么东西可以揭示代码的底层结构吗？

...好吧，所以这会导致什么结果就很明显了。函数在循环中尾部调用的整体设计看起来非常像连续传递风格，并且在解释器的类型签名中不是一次而是两次可以找到特征模式 (foo -> r）-> r，更广为人知的名称是延续单子。

不幸的是，即使在这之后，延续仍然让我头疼，而且我不知道如何最好地将解释器的临时结构分解为在 MonadCont 中运行的计算。

Disclaimer: I can't promise the following is a good way to go about it, but working through it sounds fun. Let's take it for a spin, shall we?.

A few obligatory imports

First, let's toss some data types out there. I'm going to fill in some details and tweak things a bit, in order to define a simple "file system" that we can actually interact with.

type Path = String
type Response = Maybe String
type Contents = [String]

data Operation = Cd Path 
               | Ls 
               | MkDir Path
               | Quit
    deriving (Read, Show)

Next, we'll do something a bit edgy... strip out all the monads. What? This is madness! Perhaps, but sometimes all the hidden plumbing that >>= provides hides things just a bit too much.

For the file system itself, we'll just store the current working directory and a map from paths to their children. We'll also need a handful of functions to interact with it.

data Filesystem = Filesystem { wd :: Path, files :: M.Map Path Contents }
    deriving Show

newFS = Filesystem "/" (M.singleton "/" [])

isDirectory p fs = M.member p $ files fs
children p fs = fromMaybe [] . M.lookup p $ files fs
cd p fs = fs { wd = p }
create p fs = let newPath = wd fs ++ p ++ "/"
                  addPath = M.insert newPath [] . M.adjust (p:) (wd fs)
              in (newPath, fs { files = addPath $ files fs })

Now for a monad-less version of the step function. It needs to take an Operation and a Filesystem, and return a Response and a (possibly modified) Filesystem:

step :: Operation -> Filesystem -> (Response, Filesystem)
step (Cd d) fs = (Just "Ok\n", cd d fs)
step (MkDir d) fs = first (\d -> Just $ "Created " ++ d ++ "\n") $ create d fs
step Ls fs = let files = children (wd fs) fs
             in (Just $ unlines files, fs)
step Quit fs = (Nothing, fs)

...hmm, that type signature already looks a lot like the guts of a State monad. Oh well, just ignore it for now, and charge blindly onward.

Now, what we want is a function that will provide a general-purpose interface to a Filesystem interpreter. Particularly, we want the interface to be at least somewhat self-contained so that whatever uses the interface doesn't have to step through manually, yet we want the interface to be sufficiently oblivious to the code using it that we can wire it up to the IO monad, some other Monad, or even no monad at all.

What this tells us primarily is that we'll need to interleave the external code with the interpreter in some fashion, rather than having either part be in control. Now, Haskell is a functional language, so that means that using lots of higher-order functions is good, right? Sounds plausible to me, so here's the strategy we'll use: If a function doesn't know what to do next, we'll hand it another function that we assume does. Repeat until everybody knows what's going on. A flawless plan, no?

The heart of it all is the step function, so we'll start by just calling that.

interp1 :: Operation -> Filesystem -> (Response, Filesystem)
interp1 op fs = step op fs

...well, it's a start. I guess. But wait, where is the Operation coming from? We need the external code to provide that, but we can't just ask for it without getting all mixed up with unsavory characters like IO. So we get another function to do the dirty work for us:

interp2 :: ((Operation -> (Response, Filesystem)) -> t) -> Filesystem -> t
interp2 inp fs = inp (\op -> step op fs)

Of course, now all we have is some stupid t that we don't even know what it is. We know it has to have a Response and a Filesystem in it somewhere, but we can't do anything with it, so we'll hand it back to another function, along with some instructions on how to proceed... which will of course involve passing in yet more functions. It's functions all the way down, you know.

interp3 :: ((Operation -> (Response, Filesystem)) -> a)
           -> (a -> ((Response, Filesystem) -> b) -> c)
           -> (Filesystem -> b)
           -> (String -> Filesystem -> b) 
           -> Filesystem 
           -> c
interp3 inp check done out fs = check (inp (\op -> step op fs)) test
    where test (Nothing, fs) = done fs
          test (Just s, fs)  = out s fs

...well that's pretty ugly. But don't worry, all is going according to plan. We can make a couple observations next:

The type a only exists between inp and check, so in hindsight, we might as well combine them ahead of time and just pass the composed function to the interpreter.
When we call done, it ought to mean exactly what it says on the tin. So the return type for done should be the same as the whole interpreter, meaning b and c ought to be the same type.

Now, if done ends the whole thing, what's out? As the name none-too-subtly implies, it's providing output to the external code, but where does it go after that? It needs to loop back into the interpreter somehow, and we might note that our interpreter is not yet recursive. The way forward is clear--the interpreter, like Jormungand, thus seizes its own tail; looping back around indefinitely till the interpretation finishes (or until Ragnarök, whichever comes first).

interp4 :: ((Operation -> (Response, Filesystem)) 
               -> ((Response, Filesystem) -> r) -> r)
           -> (Filesystem -> r)
           -> (String -> Filesystem -> (Filesystem -> r) -> r)
           -> Filesystem
           -> r
interp4 checkInp done out fs = checkInp (\op -> step op fs) test
    where loop = interp4 checkInp done out
          test (Nothing, fs) = done fs
          test (Just s, fs)  = out s fs loop

...oh, did I mention that it works now? No, seriously!

Here's some IO code to use the interface:

ioIn f k = putStr "> " >> (k . f =<< readLn)
ioDone fs = putStrLn "Done" >> return fs 
ioOut x fs k = putStr x >> k fs

ioInterp :: IO Filesystem
ioInterp = interp4 ioIn ioDone ioOut newFS

And here's code that runs a list of commands, producing a list of output strings:

scriptIn f k (x:xs) = k (f x) xs
scriptDone fs xs = ["Done\n"]
scriptOut r fs k xs = r : k fs xs

scriptInterp :: [Operation] -> [String]
scriptInterp = interp4 scriptIn scriptDone scriptOut newFS

Examples of running both in GHCi here, if just the code doesn't tickle your imagination sufficiently.

Well, that's that. Or is it? Frankly, that interpreter is code only a mother could love. Is there something that would tie it all together elegantly? Something to reveal the underlying structure of the code?

...okay, so it's pretty obvious where this leads. The overall design of functions tail-calling each other in circles looks an awful lot like continuation-passing style, and not once but twice in the interpreter's type signature can be found the characteristic pattern (foo -> r) -> r, better known as the continuation monad.

Unfortunately, even after all that, continuations make my head hurt and I'm not sure how best to disentangle the very ad-hoc structure of the interpreter into a computation running in a MonadCont.

回复收藏 0 原文

星光不落少年眉 2024-09-15 20:48:52

我在这里可以想到两种解决方案：

1）使用 monad 转换器库。除了图书馆的一些细节之外，我无法改进 Shimuuar 的回复。 Transformers 本身并不提供必要的实例；您需要使用 Transformer 以及 monads-tf 或 monads-fd，它们分别提供基于类型系列和 Fundeps 的实现。如果你走这条路，我更喜欢 monads-tf。其API与mtl几乎相同。我没有使用 MonadLib 的经验，但它看起来也不错。

2) 在 IO 中编写主循环，并为每个循环迭代调用 runState 来评估状态 monad。如下所示：

loop path state = do
  op <- readOp
  let ((newpath, resp), newstate) = runState (step path op) state
  print resp
  loop newpath newstate

这应该可以工作，但它远不如使用 monad 转换器那么惯用。

I can think of two solutions here:

1) Use a monad transformer library. I can't improve on Shimuuar's reply, except in some details on the libraries. Transformers by itself doesn't provide the necessary instances; you would need to use transformers and either monads-tf or monads-fd, which offer implementations based on type families and fundeps, respectively. I prefer monads-tf if you go this route. The api is almost identical to that of mtl. I don't have experience with MonadLib, but it looks quite good also.

2) Write your main loop in IO, and for each loop iteration call runState to evaluate the state monad. Something like the following:

loop path state = do
  op <- readOp
  let ((newpath, resp), newstate) = runState (step path op) state
  print resp
  loop newpath newstate

This should work, but it's far less idiomatic than using monad transformers.

回复收藏 0 原文

困倦 2024-09-15 20:48:52

要求您的 FS 实例是 MonadIO 的实例，而不仅仅是 Monad：

class MonadIO m => FS m where ...

然后您将可以使用 liftIO将 FS 提升为 IO: 的方法，

liftIO :: MonadIO m => m a -> IO a

这样你就可以在 IO monad: 中写入：

files <- liftIO $ children currentDir

等等。当然，这意味着你需要实现 <代码>liftIO
在您编写 FS 实例之前，对于每个 FS ，但是对于
这个应用程序（没有看到实际的细节）
听起来应该很简单。

Require your instances of FS to be instance of MonadIO, not just Monad:

class MonadIO m => FS m where ...

Then you will have available the liftIO method to lift FS into IO:

liftIO :: MonadIO m => m a -> IO a

so you can write in the IO monad:

files <- liftIO $ children currentDir

etc. Of course, that means you will need to implement liftIO
for each FS before you even write the FS instance, but for
this application (without having seen the actual details)
it sounds like that should be simple.

回复收藏 0 原文

~没有更多了~