Using MonadError with Parsec

Posted on 2024-08-20 04:37:59

I'm trying to use MonadError together with Parsec. I've come up with the following code snippet:

f5 = do
    char 'a'
    throwError "SomeError"

f6 = f5 `catchError` (\e -> unexpected $ "Got the error: " ++ e)

ret = runErrorT (runParserT f6 () "stdin" "a")

However, ret is Left "SomeError", so it seems that catchError doesn't have any effect. What's the right way to use MonadError here?

I'd prefer to use MonadError over Parsec's own error handling, as for example when I have:

try (many1 parser1) <|> parser2

If parser1 fails here, parser2 will continue, but I'd like to have an exception which aborts the parsing entirely.

Comments (3)

桃气十足 2024-08-27 04:37:59

I'm under the impression that you're trying to involve MonadError for the wrong reason.

In try (many1 parser1) <|> parser2, the behaviour you're trying to avoid stems from the use of try and <|>. If you don't like it, use different combinators. Perhaps an expression like many1 parser1 >> parser2 would work better for you? (This discards the result of many1 parser1; you could of course use >>= and combine the result of many1 parser1 with that of parser2.)
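
To make the difference between the two combinator choices concrete, here is a minimal sketch. The definitions of parser1 and parser2 are hypothetical stand-ins (digits and letters), since the question does not say what they parse; the names sequenced and alternative are also made up for illustration.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical stand-ins for the question's parser1 and parser2.
parser1 :: Parser Char
parser1 = digit

parser2 :: Parser String
parser2 = many1 letter

-- Sequencing: both parts are required, and both results are kept.
sequenced :: Parser (String, String)
sequenced = do
  ds <- many1 parser1
  ls <- parser2
  pure (ds, ls)

-- Alternation: if the first branch fails, parser2 is tried from the
-- same input position.
alternative :: Parser String
alternative = try (many1 parser1) <|> parser2
```

With these definitions, parse sequenced "" "ab" is a parse error because the digits are mandatory, whereas parse alternative "" "ab" falls through to parser2 and succeeds.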


(Note: Below this point, there is no really good solution to the problem at hand, just some musings as to why some things probably won't work... Hopefully this may be (somewhat) enlightening, but don't expect too much.)

Here is a closer examination of the ParsecT / MonadError interaction. I'm afraid it's a bit messy and I'm still not really sure how best to go about doing what the OP wants to do, but I'm hoping it will at least provide insight into why the original approach did not succeed.

Firstly, note that it is not correct to say that Parsec is an instance of MonadError. Parsec is the monad produced by ParsecT when the inner monad is Identity; ParsecT produces instances of MonadError if and only if it is given an inner monad which is itself an instance of MonadError to work with. Relevant fragments of GHCi interactions:

> :i Parsec
type Parsec s u = ParsecT s u Identity
    -- Defined in Text.Parsec.Prim
-- no MonadError instance

instance (MonadError e m) => MonadError e (ParsecT s u m)
  -- Defined in Text.Parsec.Prim
-- this explains why the above is the case
-- (a ParsecT-created monad will only become an instance of MonadError through
-- this instance, unless of course the user provides a custom declaration)
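
The same point can be sketched as a small module. The type synonyms PlainParser and AbortableParser and the names failing and runFailing are made up for illustration; the sketch assumes a reasonably recent mtl, where throwError is exported from Control.Monad.Except.

```haskell
import Control.Monad.Except (throwError)
import Text.Parsec

-- Plain Parsec: the inner monad is Identity, so no MonadError instance applies.
type PlainParser = Parsec String ()

-- ParsecT over Either String: MonadError String comes from the inner monad.
type AbortableParser = ParsecT String () (Either String)

failing :: AbortableParser Char
failing = char 'a' >> throwError "some error"

-- The same body at type PlainParser Char is expected to be rejected by the
-- type checker, because Identity is not an instance of MonadError:
-- rejected :: PlainParser Char
-- rejected = char 'a' >> throwError "some error"

runFailing :: String -> Either String (Either ParseError Char)
runFailing = runParserT failing () "asdf"
```

Here runFailing "a" gives Left "some error" (the error from the inner monad), while runFailing "b" gives a Right wrapping an ordinary ParseError, since char 'a' fails before throwError is reached.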

Next, let's have ourselves a working example with catchError and ParsecT. Consider this GHCi interaction:

> (runParserT (char 'a' >> throwError "some error") () "asdf" "a" :: Either String (Either ParseError Char)) `catchError` (\e -> Right . Right $ 'z')
Right (Right 'z')

The type annotation appears necessary (this seems to make intuitive sense to me, but it isn't pertinent to the original question, so I won't try to elaborate). The type of the whole expression is determined by GHC to be as follows:

Either String (Either ParseError Char)

So, we've got a regular parse result -- Either ParseError Char -- wrapped in the Either String monad in place of the usual Identity monad. Since Either String is an instance of MonadError, we can use throwError / catchError, but the handler passed to catchError must of course produce a value of the correct type. That's not very useful for breaking out of the parsing routine, I'm afraid.
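
For comparison, here is a minimal sketch of what throwError alone does in this stack when no handler is involved. It assumes the inner monad is Either String; parser1, parser2, combined and run are hypothetical definitions, with 'x' arbitrarily chosen as the fatal condition. Because throwError is lifted into the inner monad, it appears to bypass try and <|> altogether.

```haskell
import Control.Monad.Except (throwError)
import Text.Parsec

type Parser = ParsecT String () (Either String)

-- Hypothetical parser1: accepts any letter, but treats 'x' as fatal.
parser1 :: Parser Char
parser1 = do
  c <- letter
  if c == 'x'
    then throwError "fatal: saw x"
    else pure c

-- Hypothetical parser2: a fallback that accepts whatever is left.
parser2 :: Parser String
parser2 = many anyChar

-- Same shape as the expression in the question, with eof added so that
-- the first branch can fail on ordinary input.
combined :: Parser String
combined = try (many1 parser1 <* eof) <|> parser2

run :: String -> Either String (Either ParseError String)
run = runParserT combined () "input"
```

With these definitions, run "ab1" backtracks into parser2 and yields Right (Right "ab1"), while run "axb" yields Left "fatal: saw x" without parser2 ever being tried.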

Back to the example code from the question. That does a slightly different thing. Let's examine the type of ret as defined in the question:

forall (m :: * -> *) a.
(Monad m) =>
m (Either [Char] (Either ParseError a))

(According to GHCi... note that I had to lift the monomorphism restriction with {-# LANGUAGE NoMonomorphismRestriction #-} to have the code compile without type annotations.)

That type is a hint as to the possibility of doing something amusing with ret. Here we go:

> runParserT ret () "asdf" "a"
Right (Left "some error")

In hindsight, the handler given to catchError produces a value using unexpected, so of course it's going to be (usable as) a parser... And I'm afraid I don't see how to hammer this into something useful for breaking out of the parsing process.

旧故 2024-08-27 04:37:59

If you're trying to debug a parser to troubleshoot, it's probably simpler to use error, Debug.Trace, or whatnot.
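
As a rough illustration of the Debug.Trace route, the sketch below wraps a parser so that the remaining input is logged just before it runs. The helper traced and the example parser number are hypothetical names, not part of Parsec.

```haskell
import Debug.Trace (trace)
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical helper: log the remaining input just before running a parser.
traced :: String -> Parser a -> Parser a
traced label p = do
  rest <- getInput
  trace (label ++ ": remaining input " ++ show rest) p

-- Example use: a number parser that reports where it starts.
number :: Parser Int
number = read <$> traced "number" (many1 digit)
```

Running parse number "" "42x" still returns the parsed number; the trace message goes to stderr as a side effect, so this is only suitable for troubleshooting.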

On the other hand, if you need to terminate the parsing on some inputs as part of your actual program, but it's not doing so because of a try (...) <|> construct, then you have a bug in your logic and you should stop and rethink your grammar, rather than hack around it with error handling.

If you want the parser to terminate on a given input some of the time, but not others, then either something is missing from your input stream (and should be added) or a parser is not the solution to your problem.

If you want the parser to recover gracefully from non-fatal errors and keep trying when possible, but terminate with an error when it can't continue, then you... may want to consider something other than Parsec, because it's really not designed for that. I believe Utrecht University's Haskell parser combinator library supports that sort of logic much more easily.

Edit: As far as Parsec being itself an instance of MonadError goes--yes, and its own error handling subsumes that functionality. What you're trying to do is stack a second error monad on top of Parsec, and you're probably having trouble because it's generally awkward to distinguish between monad transformers that are "redundant" in that manner. Dealing with multiple State monads is more famously awkward, which is why Parsec (a State monad as well) provides functionality to hold custom state.
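
As a small illustration of the custom-state facility mentioned above, the following sketch keeps a counter in Parsec's user state instead of stacking a separate State monad. The names CountingParser, countedLetter, lettersWithCount and runCount are made up for this example.

```haskell
import Text.Parsec

-- The user-state slot (the u in Parsec s u) holds a running count here.
type CountingParser = Parsec String Int

-- Parse one letter and bump the counter kept in Parsec's user state.
countedLetter :: CountingParser Char
countedLetter = letter <* modifyState (+ 1)

-- Parse a run of letters and return them together with the final count.
lettersWithCount :: CountingParser (String, Int)
lettersWithCount = do
  ls <- many1 countedLetter
  n <- getState
  pure (ls, n)

-- The initial state (0) is passed to runParser.
runCount :: String -> Either ParseError (String, Int)
runCount = runParser lettersWithCount 0 "input"
```

For instance, runCount "abc1" gives Right ("abc", 3).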

In other words, Parsec being an error monad doesn't help you at all, and in fact is relevant mostly in the sense of making your problem more difficult.

狼性发作 2024-08-27 04:37:59

If you need to terminate the parsing on some inputs as part of your actual program, but it's not doing so because of a try (...) <|> construct, then you have a bug in your logic and you should stop and rethink your grammar, rather than hack around it with error handling.

If you want the parser to terminate on a given input some of the time, but not others, then either something is missing from your input stream (and should be added) or a parser is not the solution to your problem.

This answer is based on the assumption that the problem lies in the grammar. But if I'm using the grammar to feed a compiler, there are other errors that a grammar can't handle. Take a reference to a variable that was never defined, in a language that is specified as single-pass, with variables evaluated as they are encountered. The grammar is just fine, and the parsing is just fine. But an error has occurred as a result of evaluating what the grammar specified, and the existing "fail" and "unexpected" are insufficient to deal with this problem. It would be nice to have a means to abort the parsing without resorting to higher-level error handling.
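
One possible sketch of this scenario follows. It does lean on an extra error layer (an inner Either String), so it is not the lighter-weight mechanism wished for above. The toy language ("name=digits;" defines a variable, "name;" evaluates one) and all the names in the code are invented for illustration.

```haskell
import Control.Monad.Except (throwError)
import Text.Parsec

-- Symbol table in the user state; fatal errors in the inner Either String.
type Compiler = ParsecT String [(String, Int)] (Either String)

-- "name=digits;" defines a variable.
definition :: Compiler ()
definition = do
  name <- many1 letter
  _ <- char '='
  value <- read <$> many1 digit
  _ <- char ';'
  modifyState ((name, value) :)

-- "name;" evaluates a variable reference as soon as it is parsed.
reference :: Compiler Int
reference = do
  name <- many1 letter
  _ <- char ';'
  table <- getState
  case lookup name table of
    Just v  -> pure v
    Nothing -> throwError ("undefined variable: " ++ name)

-- A statement is a definition (no value) or a reference (one value).
statement :: Compiler [Int]
statement = try ([] <$ definition) <|> ((\v -> [v]) <$> reference)

program :: Compiler [Int]
program = concat <$> many statement <* eof

runProgram :: String -> Either String (Either ParseError [Int])
runProgram = runParserT program [] "input"
```

With this setup, runProgram "x=1;x;" gives Right (Right [1]), a syntax error such as runProgram "x=1;?" shows up as Right (Left ...), and runProgram "y;" gives Left "undefined variable: y", so the two kinds of failure stay distinguishable.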
