该语法需要什么类型的解析器？

发布于 2024-12-09 23:23:49 字数 481 浏览 0 评论 0原文

我有一个语法，除了我不相信该语法是 LL(1) 之外，我不知道需要什么类型的解析器来解析它。我想我需要一个带有回溯或 LL(*) 之类的解析器。我想出的语法（可能需要一些重写）是：

S: Rules
Rules: Rule | Rule Rules
Rule: id '=' Ids
Ids: id | Ids id

我试图生成的语言看起来像这样：

abc = def g hi jk lm
xy = aaa bbb ccc ddd eee fff jjj kkk
foo = bar ha ha

零个或多个规则，其中包含左标识符，后跟等号，后跟一个或多个标识符。我认为编写解析器时会遇到问题的部分是，语法允许规则中存在任意数量的 id，并且判断新规则何时开始的唯一方法是它何时找到 id =，这需要回溯。

有谁知道这个语法的分类以及手写解析器的最佳解析方法？

原文

I have a grammar that I do not know what type of parser I need in order to parse it other than I do not believe the grammar is LL(1). I am thinking I need a parser with backtracking or LL(*) of some sort. The grammar I have came up with (which may need some rewriting) is:

S: Rules
Rules: Rule | Rule Rules
Rule: id '=' Ids
Ids: id | Ids id

The language I am trying to generate looks something like this:

abc = def g hi jk lm
xy = aaa bbb ccc ddd eee fff jjj kkk
foo = bar ha ha

Zero or more Rule that contain a left identifier followed by an equal sign followed by one or more identifers. The part that I think I will have a problem writing a parser for is that the grammar allows any amount of id in a Rule and that the only way to tell when a new Rule starts is when it finds id =, which would require backtracking.

Does anyone know the classification of this grammar and the best method of parsing, for a hand written parser?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

·深蓝 2024-12-16 23:23:49

生成标识符后跟等号后跟有限标识符序列的语法是常规。这意味着可以使用 DFA 或正则表达式来解析该语言中的字符串。不需要花哨的非确定性或 LL(*) 解析器。

要查看该语言是否是正则语言，请设 Id = U {a : a ∈ Г}，其中 Г ⊂ Σ 是符号集这可能出现在标识符中。您尝试生成的语言由正则表达式

Id⁺ =( Id⁺)^* Id⁺

设置 Γ = {a, b, ..., z}，正则表达式语言中的字符串示例是：

看 = 我是常规语言
嘿 = 这意味着我可以被 dfa 识别
很酷 = 甚至是正则表达式

不需要使用强大的解析技术来解析您的语言。这是使用正则表达式或 DFA 进行解析既合适又最佳的一种情况。

编辑：

调用上面的正则表达式R。要解析R^*，请生成识别R^*语言的DFA。为此，请使用从 Kleene 定理获得的算法生成一个识别 R^* 语言的 NFA。然后使用子集构造将 NFA 转换为 DFA。生成的 DFA 将识别 R^* 中的所有字符串。给定实现语言中构造的 DFA 的表示，所需的操作 - 例如，