在无状态世界中保持状态

发布于 2024-10-04 09:43:29 字数 1255 浏览 4 评论 0原文

我正在将上下文无关语法转换为 Greibach 范式 (GNF)。主要转换（来自 Hopcroft 和 Ullman）是对语法索引变量的迭代序列。它本质上是“无状态的”。我将其实现为对适当索引的一系列折叠（实现相当简单）：

gnf :: Ord a => Set (Rule a) -> Set (Rule a)
gnf rl = foldl step1 rl [1..maxIndex rl]
 where step1 rl' k = foldl step2 rl' [1..k - 1]
        where step2 rl'' j = noLR k (subst rl'' k j)

maxIndex rl 返回一组规则中的最大变量索引； subst rl k j 将 k 索引规则替换为右侧以 j 索引变量开头的规则。执行gnf后，我需要以相反的顺序对语法执行最后一次传递。

问题是 noLR，它用左递归 k 索引规则转换语法。这是一个“有状态”函数，因为必须为应用 noLR 的每个规则（或 k 索引规则）生成唯一变量。因此，我编写了一个有状态函数

noLR :: Ord a => Int -> Set (Rule a) -> State [Sym a] (Set (Rule a))
noLR rl = do (n:ns) <- get; put ns;
             let rl' = ... remove left recursion rl n ...
              in return rl'

，可以将 noLR 排序在一起，以更新 noLR 作为参数的 n。不过，我不确定如何在上述函数的step2中执行noLR。我似乎无法使用 let ... in 模式，因为有状态计算嵌入在几个递归函数中。

我想要做的是让n成为某种类型的全局变量，类似于n的显式线程，我可以在step2，这就是为什么我最初将该函数编写为带有 eta 扩展（对于 n）的折叠。有谁知道我如何在状态单子内构造 gnf 来实现这种效果？除了折叠中的最后一个计算之外，没有其他东西是“有状态的”，并且我只习惯将状态单子与“琐碎”的示例一起使用。我有点迷失了。

原文

I am converting a context-free grammar into Greibach Normal Form (GNF). The main transformation (from Hopcroft & Ullman) is a sequence of iterations over the indexed variables of the grammar. It is essentially "stateless". I have implemented it as a sequence of folds over the appropriate indices (the implementation is fairly straightforward):

gnf :: Ord a => Set (Rule a) -> Set (Rule a)
gnf rl = foldl step1 rl [1..maxIndex rl]
 where step1 rl' k = foldl step2 rl' [1..k - 1]
        where step2 rl'' j = noLR k (subst rl'' k j)

maxIndex rl returns the maximum variable index in a set of rules; subst rl k j performs substitution on the k-indexed rules by the rules whose right hand side starts with a j-indexed variable. After performing gnf, I need to perform a final pass over the grammar in reverse order.

The problem is noLR, which transforms a grammar with left-recursive k-indexed rules. This is a "stateful" function, since a unique variable must be generated for each rule (or k-indexed rule) to which noLR is applied. So I wrote a stateful function

noLR :: Ord a => Int -> Set (Rule a) -> State [Sym a] (Set (Rule a))
noLR rl = do (n:ns) <- get; put ns;
             let rl' = ... remove left recursion rl n ...
              in return rl'

I can sequence together the noLR in order to update the n which noLR takes as a parameter. I'm not sure how to perform noLR inside step2 in the above function, though. I don't seem to be able to use the let ... in schema, because the stateful computation is embedded inside several recursive functions.

What I want to do is have n be some type of global variable, similar to an explicit threading of n, which I can call and update inside step2, which is why I originally wrote the function as a fold with eta-expansion (for n). Does anyone know how I could structure gnf inside the state monad to achieve this kind of effect? Except for the last computation in the fold, nothing else is "stateful," and I'm only comfortable using the state monad with "trivial" examples. I'm rather lost.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

可可 2024-10-11 09:43:29

为了将 noLR 与您给定的类型一起使用，您必须按照以下几行重写您的 gnf 函数：

gnf :: Ord a => Set (Rule a) -> Set (Rule a)
gnf rl = evalState (foldM step1 rl [1..maxIndex rl]) ( ... generate the initial state [Sym a] here ...)
 where step1 rl' k = foldM step2 rl' [1..k - 1]
        where step2 rl'' j = noLR k (subst rl'' k j)

您的状态变量在整个计算过程中都存在，并且必须在代码中明确说明这一事实。

如果您需要的只是新生成的变量名称不会相互冲突，那么您可以通过从索引 k 和 j 生成新的符号名称来使 noLR 纯粹 - 类似于 k == 42 的“foo_42_16”和j == 16。但是，如果输入语法已包含此类符号名称，则可能会遇到麻烦。

如果您需要符号在语法中是唯一的，那么为什么不直接这么说呢？

newSymbol :: Set (Rule a) -> Sym a
newSymbol rl = ... find a symbol name not used in rl ...

不过，这肯定效率不高，除非您用不同的类型替换 Set（规则 a），从而可以更有效地实现 newSymbol 操作。

In order to use noLR with the type you have given, you will have to rewrite your gnf function along the following lines:

gnf :: Ord a => Set (Rule a) -> Set (Rule a)
gnf rl = evalState (foldM step1 rl [1..maxIndex rl]) ( ... generate the initial state [Sym a] here ...)
 where step1 rl' k = foldM step2 rl' [1..k - 1]
        where step2 rl'' j = noLR k (subst rl'' k j)

Your state variable exists during the whole computation, and that fact has to be made explicit in the code.

If all you need is that the newly-generated variable names don't collide with each other, then you can make noLR pure by generating a new symbol name from the indices k and j - something like "foo_42_16" for k == 42 and j == 16. If the input grammar already contains symbol names of that kind, you might be in trouble, however.

If you need your symbols to be unique within the grammar, then why not say just that?

newSymbol :: Set (Rule a) -> Sym a
newSymbol rl = ... find a symbol name not used in rl ...

This is definitely not efficient, though, unless you replace Set (Rule a) by a different type that allows you to implement the newSymbol operation more efficiently.

回复收藏 0 原文

放我走吧 2024-10-11 09:43:29

我会尝试将 noLR 重写为纯粹的。您确定不能重写它来生成仅依赖于规则名称及其索引（或类似内容）的符号吗？

noLR k j = noLR' k j $ newSymbol k j
    where newSymbol k j = ... -- some concatenation of k and j
          noLR' k j sym = ... -- your now pure function

I would try to rewrite noLR to be pure. Are you sure you cannot rewrite it to generate a symbol which depends only on the name of the rule and its index (or something similar)?

noLR k j = noLR' k j $ newSymbol k j
    where newSymbol k j = ... -- some concatenation of k and j
          noLR' k j sym = ... -- your now pure function

回复收藏 0 原文

~没有更多了~