Building a lexer in C
I want to build a lexer in C and I am following the dragon book. I can understand the state transitions, but how do I implement them?
Is there a better book?
The part I am stuck on is that I have to run a string through a number of states to tell whether the string is acceptable or not.
If you're looking for a more modern treatment than the dragon book(s): Andrew W. Appel and Maia Ginsburg, Modern Compiler Implementation in C, Cambridge University Press, 2008.
Chapter 2 is focused on lexical analysis: lexical tokens, regular expressions, finite automata, nondeterministic finite automata, and lexical analyzer generators.
Look at the Table of Contents
The program flex (a clone of lex) will create a lexer for you.
Given an input file with the lexer rules, it will produce a C file with an implementation of a lexer for those rules.
You can thus check the output of flex for how to write a lexer in C. That is, if you don't just want to use flex's lexer...
You can implement simple state transitions with a single state variable. For example, if you want to cycle through the states start->part1->part2->end, you can use an enum to keep track of the current state and a switch statement for the code you want to run in each state.
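As a rough illustration of the enum-plus-switch approach, here is a minimal sketch that decides whether a whole string is an optionally signed integer. The state names and the function `accepts_integer` are made up for this example, and the pattern `[+-]?[0-9]+` is just an assumed target, not something from the question.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical states for recognizing an optionally signed integer. */
enum int_state { S_START, S_SIGN, S_DIGITS, S_REJECT };

/* Returns true if the whole string s matches [+-]?[0-9]+ (assumed pattern). */
bool accepts_integer(const char *s)
{
    enum int_state st = S_START;

    /* Feed one character at a time; stop early once the input is rejected. */
    for (; *s != '\0' && st != S_REJECT; s++) {
        unsigned char c = (unsigned char)*s;

        switch (st) {
        case S_START:
            if (c == '+' || c == '-')
                st = S_SIGN;
            else if (isdigit(c))
                st = S_DIGITS;
            else
                st = S_REJECT;
            break;
        case S_SIGN:    /* a sign must be followed by a digit */
        case S_DIGITS:  /* after a digit, only more digits are allowed */
            st = isdigit(c) ? S_DIGITS : S_REJECT;
            break;
        case S_REJECT:
            break;
        }
    }
    return st == S_DIGITS;  /* S_DIGITS is the only accepting state */
}
```

Calling `accepts_integer("-7")` should return true, while `accepts_integer("+")` and `accepts_integer("12a")` should return false, because the machine does not end in the accepting state.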
For more complex state transitions that depend on several variables, you should use tables/arrays like this:
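The table from the original answer is not preserved here, so what follows is only a sketch of the general idea under assumed names: a two-dimensional array indexed by current state and character class, where each entry is the next state. The states, the classes, and the functions `classify` and `run_dfa` are all hypothetical.

```c
#include <ctype.h>

/* Hypothetical states and character classes for a tiny DFA that
 * tells identifiers apart from unsigned integers. */
enum dfa_state  { ST_START, ST_IDENT, ST_NUMBER, ST_ERROR, ST_COUNT };
enum char_class { CC_LETTER, CC_DIGIT, CC_OTHER, CC_COUNT };

/* next_state[current state][class of input character] */
static const enum dfa_state next_state[ST_COUNT][CC_COUNT] = {
    /*               LETTER     DIGIT      OTHER    */
    [ST_START]  = { ST_IDENT,  ST_NUMBER, ST_ERROR },
    [ST_IDENT]  = { ST_IDENT,  ST_IDENT,  ST_ERROR },
    [ST_NUMBER] = { ST_ERROR,  ST_NUMBER, ST_ERROR },
    [ST_ERROR]  = { ST_ERROR,  ST_ERROR,  ST_ERROR },
};

/* Map a character to its class; underscore counts as a letter here. */
static enum char_class classify(unsigned char c)
{
    if (isalpha(c) || c == '_')
        return CC_LETTER;
    if (isdigit(c))
        return CC_DIGIT;
    return CC_OTHER;
}

/* Run the whole string through the table and return the final state. */
enum dfa_state run_dfa(const char *s)
{
    enum dfa_state st = ST_START;

    for (; *s != '\0'; s++)
        st = next_state[st][classify((unsigned char)*s)];
    return st;
}
```

With this layout the driver loop never changes; adding a token kind means adding rows and columns to the table. A string is accepted when `run_dfa` ends in `ST_IDENT` or `ST_NUMBER`.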
G'day,
Assuming you mean The Dragon book on compiler design, I'd recommend having a look around this page on compiler tools.
The page itself is quite small but has links through to various excellent resources on lexical analysers.
HTH
cheers,
There's more than one way to do it. Every regular expression corresponds directly to a simple structured program. For example, an expression for numbers could be this:
and the corresponding C code would be:
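The expression and code from the original answer are missing here, so the following is only an assumed stand-in. Take the number expression to be `digit+ ('.' digit+)?`. Read as a structured program, a sequence becomes statements in order, a repetition becomes a loop, and an optional part becomes an `if`. The function name `match_number` is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns the length of the longest prefix of s that matches the assumed
 * expression  digit+ ('.' digit+)?  or 0 if s does not start with a number. */
size_t match_number(const char *s)
{
    const char *p = s;

    /* digit+ : at least one digit, then as many as follow */
    if (!isdigit((unsigned char)*p))
        return 0;
    while (isdigit((unsigned char)*p))
        p++;

    /* ('.' digit+)? : take the dot only if a digit follows it */
    if (p[0] == '.' && isdigit((unsigned char)p[1])) {
        p++;
        while (isdigit((unsigned char)*p))
            p++;
    }
    return (size_t)(p - s);
}
```

A lexer built this way would call `match_number` at the current input position and, on a non-zero result, emit a number token of that length and advance past it.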
The transition-table way of building lexers is, in my opinion, needlessly complicated, and obviously runs slower.