在解释器中,词法分析器之后(通常)是什么?
对于编程语言解释器,我想知道解释器经历的事件顺序。例如,我认为事情是这样的:
- 解释器获取一些输入 词法
- 分析器/分词器获取输入并划分标记
- x 获取标记列表
- ???
- 代码被执行
??? 中属于哪些步骤?点,以及替换 x 的位置(即,什么接收词法分析器生成的标记并对其进行操作)?
For a programming language interpreter, I am wondering about the sequence of events that the interpreter goes through. For instance, I think this is how it goes:
- Interpreter gets some input
- The lexer/tokenizer gets the input and demarcates the tokens
- The x gets the list of tokens
- ???
- The code gets executed
What step(s) belongs in the ??? spot, and what goes in the place of x (that is, what recieves and operates on the tokens that the lexer has produced)?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
我首先推荐经典且免费的书籍:计算机的结构与解释计划(视频讲座)
Lisp 是基线解释器,其他一切在某种程度上都是该主题的变体。
一般来说,步骤如下:
最后一步是“代码被执行”。对于编译语言或即时 (JIT) 语言,最后一步是将 AST 翻译回机器指令。注意可能存在的其他两个步骤也很重要。一种是翻译成更简单的语言,如 c--、LLVM、.NET 或 Java bitcode。另一个是解析器和评估器之间可能发生的脱糖和/或优化。例如,Haskell 就因大量脱糖而臭名昭著。
最后,我将鼓励您尝试编写一个Scheme(Lisp 的一种方言)解释器的众多演练中的一个。网上某处可能有适合您最喜欢的语言的语言。
I'll start by recommending the classic and free book: Structure and Interpretation of Computer Programs (video lectures)
Lisp is the baseline interpreter and everything else is a in some ways a variation on the theme.
In general the steps are:
data
(numbers, strings, constants, variables) nodes are either primitive functions (math, data manipulation, control structures) or higher level compound functions. Each node should reduce to something that can be fed directly into the node above it.This last step is "code gets executed." For a compiled or Just-In-Time (JIT) language the last step is instead translating the AST back into machine instructions. It is also important to note two other steps that may be present. One is a translation into a simpler language like c--, LLVM, .NET, or Java bitecode. The other is desugaring and/or optimization that can happen between the parser and the evaluator. Haskell for example is somewhat infamous for the huge amount of desugaring that goes on.
I'll end by encouraging you to try one of the many walk-throughs for writing a Scheme (a dialect of Lisp) interpreter. There is likely one for your favorite language on the net somewhere.
进行解析,将标记流转换为结构化的、经过验证的语法信息。如果你想计算一个算术表达式:
(
x
+
4
)
*
3
您不是通过从左到右扫描令牌来完成的。您需要弄清楚操作顺序。您需要将
if
关键字和{
}
大括号之间的标记转换为描述 if 语句的高级结构,以便您无需处理一堆代币即可对其进行评估。并且您需要检查语法,如果不正确解析它,这基本上是不可能的;请阅读上下文无关语法。上面的表达式将成为一个抽象语法树,如下所示:
评估它非常简单 - 只需遍历树,然后查找
x
和y
> 在环境中。类似地,给定一系列如下语句:
if
(
p
&&
q< /code>
)
{
foo
;
bar
;
}
其他
{
baz
;
}
抽象语法树可能具有以下一般结构:
希望您能够想象如何遍历这棵树来解释代码。
我强烈推荐的一本关于解释器的教科书是编程语言要点 。
Parsing happens, to turn the stream of tokens into structured, validated, syntactic information. If you want to evaluate, say, an arithmetic expression:
(
x
+
4
)
*
3
you don’t do it by scanning the tokens from left to right. You need to figure out order of operations. You need to turn the tokens between an
if
keyword and the{
}
curly braces into a high-level structure describing the if statement, so you can evaluate it without juggling a pile of tokens. And you need to check the syntax, which is essentially impossible without properly parsing it; please read about context-free grammars.The expression above would become an abstract syntax tree like the following:
Evaluating this is pretty simple — just traverse the tree, and look up
x
andy
in the environment.Similarly, given a series of statements like this:
if
(
p
&&
q
)
{
foo
;
bar
;
}
else
{
baz
;
}
the abstract syntax tree might have the following general structure:
Hopefully you can imagine how you would traverse this tree to interpret the code.
A textbook on interpreters which I highly recommend is Essentials of Programming Languages.
对于解释器来说,解析器通常会做两件事
此后,执行器将在符号表中执行 p 代码和查找标识符等。
解析器解析它接收到的标记流,并生成更简单、更高效的 p-code,同时执行解析过程中发现的任何符号,如变量、函数、复杂数据类型结构等。相位被输入到符号表中并在 p 代码中引用。
然后,执行器处理 p 代码流并执行指令,并使用符号表查找它在符号表中遇到的任何标识符。
For an interpreter the parser will generally do two things
After this the
executor
will execute the p-code and lookup identifiers etc. in the symbol table.The parser parses the stream of tokens it receives and generates the simpler more efficient to execute
p-code
at the same time any symbols like variables, functions, complex data type structures etc. that are found during the parsing phase are entered into the symbol table and referenced in the p-code.The executor then processes the stream of p-code and executes the instructions and uses the symbol table to lookup any identifiers it encounters in the symbol table.