How to detect EOF in ml-lex

Posted 2024-10-15 10:19:35

When writing an ml-lex specification we need to write an eof function, val eof = fn () => EOF. Is this a required part? Also, if I want my lexer to stop when it detects EOF, what should I add to this function? Thanks.

二智少女 2024-10-22 10:19:35

The User’s Guide to ML-Lex and ML-Yacc by Roger Price is great for learning ml-lex and ml-yacc.

The eof function is mandatory in the user declarations part of your lex definition, together with the lexresult type. As the guide puts it:

The function eof is called by the lexer when the end of the input
stream is reached.

Your eof function can either raise an exception, if that is appropriate for your application, or return the EOF token. Either way, its result type has to be lexresult. There is an example in section 7.1.2 of the user guide that prints a message if EOF is reached in the middle of a block comment.
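As a rough illustration of the simplest case, here is a sketch of a standalone ml-lex specification (no ml-yacc, no %arg). The structure name SimpleLex and the ID and NUM tokens are made up for this example; only lexresult, eof, and EOF come from the question.

```sml
(* User declarations: lexresult and eof are both required here. *)
datatype lexresult = ID of string | NUM of int | EOF

(* Called by the generated lexer when the input stream is exhausted. *)
val eof = fn () => EOF

%%
%structure SimpleLex
digit = [0-9];
alpha = [A-Za-z];
ws = [\ \t\n];
%%
{ws}+     => (lex());
{digit}+  => (NUM (valOf (Int.fromString yytext)));
{alpha}+  => (ID yytext);
.         => (raise Fail ("unexpected character: " ^ yytext));
```

Nothing needs to be added to eof itself to make the lexer "stop": the lexer only produces one token per call, so it is the caller that stops asking once it sees EOF. Assuming the specification above has been run through ml-lex, a driver could look like this:

```sml
(* Collect every token of a string, stopping at the first EOF. *)
fun tokens (s : string) =
  let
    val consumed = ref false
    (* makeLexer's reader returns "" to signal end of input *)
    fun read (_ : int) = if !consumed then "" else (consumed := true; s)
    val next = SimpleLex.makeLexer read
    fun loop acc =
      case next () of
        SimpleLex.UserDeclarations.EOF => rev acc
      | tok => loop (tok :: acc)
  in
    loop []
  end
```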

I use a somewhat "simpler" eof function:

structure T = Tokens
structure C = SourceData.Comments

fun eof data =
  if C.depth data = 0 then
    T.EOF (~1, ~1)
  else
    fail (C.start data) "Unclosed comment"

Here the C structure is a "special" comment-handling structure that counts the number of opening and closing comment delimiters. If the current depth is 0, eof returns the EOF token, where (~1, ~1) stands in for the left and right positions. As I don't use the position information for EOF, I just set it to (~1, ~1).
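The SourceData.Comments structure and the fail helper are not shown in this answer, so the following is only a sketch of the same idea with stand-ins: a plain int ref named commentDepth replaces the C structure, raise Fail replaces fail, and the grammar name Example (and therefore Example_TOKENS) is hypothetical. It assumes the lexer is used together with ml-yacc and that the grammar declares %pos int.

```sml
structure T = Tokens

type pos = int
type svalue = T.svalue
type ('a, 'b) token = ('a, 'b) T.token
type lexresult = (svalue, pos) token

(* Hypothetical stand-in for the comment-tracking structure:
   the current nesting depth of block comments. *)
val commentDepth = ref 0

(* EOF is only legal outside of a comment. *)
fun eof () =
  if !commentDepth = 0 then
    T.EOF (~1, ~1)
  else
    raise Fail "Unclosed comment"

%%
%header (functor ExampleLexFun (structure Tokens : Example_TOKENS));
%s COMMENT;
%%
<INITIAL>"(*"      => (commentDepth := 1; YYBEGIN COMMENT; continue ());
<COMMENT>"(*"      => (commentDepth := !commentDepth + 1; continue ());
<COMMENT>"*)"      => (commentDepth := !commentDepth - 1;
                       if !commentDepth = 0 then YYBEGIN INITIAL else ();
                       continue ());
<COMMENT>.|\n      => (continue ());
<INITIAL>[\ \t\n]+ => (continue ());
```

Because commentDepth is a single ref shared by every lexer built from this structure, it would have to be reset before lexing a second input. Passing the state in through %arg, as the data parameter in the answer's version suggests, avoids that.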

Normally you would then set %eop (end of parse) to the EOF token in the yacc file, to indicate that whatever start symbol is used, it may be followed by the EOF token. Also remember to add EOF to %noshift. See section 9.4.5 for %eop and %noshift.

Obviously you have to define EOF in the %term declaration of your yacc file as well.
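Put together, the three declarations might appear in a small ml-yacc grammar file like the sketch below. The grammar name Calc, the terminals NUM and PLUS, and the nonterminal exp are invented for illustration; only %term, %eop, and %noshift with EOF come from the advice above.

```sml
(* The user declarations section may be empty. *)
%%
%name Calc
%pos int

(* EOF must be listed among the terminals. *)
%term NUM of int | PLUS | EOF
%nonterm exp of int
%start exp

(* EOF may follow the start symbol and is never shifted. *)
%eop EOF
%noshift EOF

%left PLUS
%%
exp : NUM            (NUM)
    | exp PLUS exp   (exp1 + exp2)
```

With %pos int, the token constructor generated for EOF takes a pair of int positions, which is why a call such as T.EOF (~1, ~1) typechecks in the lexer.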

Hope this helps. Otherwise, take a look at an MLB parser or an SML parser written in ml-lex and ml-yacc. The MLB parser is the simplest and thus might be easier to understand.
