假设你有 AST，那么编写解释型语言有多难？

发布于 2024-10-17 16:28:02 字数 233 浏览 9 评论 0原文

我已经有了一个针对我一直在研究的语言的解析器。对其进行解释很难吗？我以为这很简单。解析和语法检查已完成。我只有一棵对象树。每次创建对象时，我都会创建一个分支并存储其类型（字符串、int、float、class/obj）。然后，每次将新成员添加到对象时，我都会创建一个分支并重复。

我试着让它听起来简单。我仍然需要检查对象 A 是否可以添加到对象 B 等。

完成 AST 和语法检查后，它实际上相当简单，还是还有很多工作？

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

用心笑 2024-10-24 16:28:02

通常，您需要构建符号表并进行类型检查。对于某些语言，您可以即时执行此操作；对于其他人，我认为首先必须进行名称解析和类型检查，否则您将无法很好地解释它（我想到了 C++）。

一旦构建了符号表，您几乎可以通过按执行顺序遍历树并执行运算符所说的操作来编写解释器。基本算术非常简单。字符串和动态存储管理比较困难；你要弄清楚如何处理存储分配和解除分配，对于管理存储的语言，你最终将不得不实现某种垃圾收集器。此时生活很快就会变得复杂。

您可能会发现您的语言需要一些您没有考虑到的功能。异常处理？多项任务？本地范围？拉姆达？倒闭？您很快就会发现现代语言有多少有用之处。

当您开始编写更复杂的程序时，您将需要一个调试器。断点？单步？变量检查？更新？从任意位置开始？读取-求值-打印循环？

您仍然需要将您的语言与外部库联系起来；大多数人想要与控制台和文件交谈；您想要缓冲文件还是一次可以使用 1 个字符以及相应的性能影响？您将争论字符表示（7 位 ascii？8 位？带有非单位宽字符的 UTF8？完整 Unicode？）和标准支持库（字符串连接、搜索、数字转换 [包括双向的精确浮点转换] 、大数算术、浮点陷阱……如果您想要一种有用的编程语言，那么问题清单会很长，

您会发现其他东西可能会消耗一两个数量级。在这里的某个地方，如果你希望任何人使用该语言，你需要记录你所做的所有选择，如果你改变解释器，天堂会帮助你。接下来，有人

会抱怨性能，然后开始后悔自己编写了一个解释器而不是编译器

。仅仅触及了表面。如果您确实接受了这一点，您将学会真正欣赏现代语言提供的开箱即用的功能，以及提供它需要付出多少努力。

Typically you need to build up symbol tables and do type checking. For some langauges, you can do this on the fly; for others, I think pretty much have to the name resolution and type checking pretty much first or you won't be able to interpret it well (C++ comes to mind).

Once you have symbol tables constructed, you can pretty much write an interpreter by walking the tree in exeuction order and doing what the operators say. Basic arithmetic is pretty easy. String and dynamic storage management is harder; you to figure out how you are going to handle storage allocation and deallocatoin, and for langauages that manage storage, you'll end up have to implement some kind of garbage collector. Life gets complicated quickly at this point.

You'll likely discover your langauage needs features you didn't consider. Exception handling? Multiple asignments? Local scopes? Lambdas? Closures? You'll find out pretty quickly how much modern languages have that make them useful.

As you start to write more complicated programs, you'll need a debugger. Breakpoints? Single step? Variable inspection? Update? Start at arbitrary places? Read-eval-print loop?

You still need to tie you language to external libraries; most people want to talk to consoles and files; do you want buffered files or are you OK with 1 character at a time and the corresponding performance hit? You'll get to argue with characater representations (7 bit ascii? 8 bit? UTF8 with non-unit wide characters? Full Unicode?) and standard support libraries (string concatentation, search, number conversions [including accurate floating point conversions both ways], large number arithmetic, floating point traps, .... The list of issues gets pretty long if you want a useful programming language.

The interpreter core will likely be pretty small. Youll find the other stuff probably burns one or two orders of magnitude more effort. Somewhere in here, if you want anybody to use the langauge, you need to document all the choices you made. And heaven help you if you change the interpreter a little bit after somebody gets a big application running.

Next, somebody will complain about performance. Now you get to tune your implementation, and start regrettng the fact that you wrote an interpreter instead of a compiler.

Enjoy. If you have an AST, you've barely scratched the surface. If you do take this on, you'll learn to really appreciate what modern languages provide out of the box, and how much effort it to took to provide it.

回复收藏 0 原文

苏璃陌 2024-10-24 16:28:02

这取决于您想要为其编写解释器的语言的复杂程度以及您选择的工具。简单的解释器很简单。

将以下内容视为 Haskell 中支持高阶函数和数字的语言的 AST 定义：

data Exp = Lam String Exp 
         | App Exp Exp 
         | Var String 
         | Num Int

现在您可以为它编写一个解释器作为简单的“eval”函数：

eval (App e1 e2) env = (unF (eval e1 env)) (eval e2 env)
eval (Lam x e) env   = F (\v -> (eval e ((x,v):env)))
eval (Num n) env     = N n
eval (Var x) env     = case (lookup x env) of 
                         Just v -> v
                         Nothing -> error ("Unbound variable " ++ x)

就是这样。一些无聊的支持定义如下。

data Val = F (Val -> Val)  | N Int
unF (F x) = x
instance Show Val where 
    show (F _) = "<procedure>"
    show (N n) = show n

换句话说，如果将上述三个代码块复制粘贴到 Haskell 源文件中，您将拥有一个工作解释器，您可以使用 ghci 对其进行测试，如下所示：

*Main> eval (App (Lam "x" (Var "x")) (Num 1)) []
1
*Main> eval (Var "x") []
*** Exception: Unbound variable x

您可以阅读经典 SICP 或 EOPL 或小书。如果您想构建一种类型化语言，您可能需要阅读更多内容。

也就是说，如果你要构建语言，我强烈建议你先大量阅读。一方面，这是非常有益的。其次，世界上有太多可怕的语言是由那些不知道如何创造语言的人造成的（其中许多语言由于各种历史原因而变得非常流行），而我们却被它们困住了。

It depends on how complex a language you want to write an interpreter for and your choice of tools. Simple interpreters are simple.

Consider the following as the definition on an AST in Haskell for a language that supports higher order functions and numbers:

data Exp = Lam String Exp 
         | App Exp Exp 
         | Var String 
         | Num Int

Now you can write an interpreter for it as the simple "eval" function:

eval (App e1 e2) env = (unF (eval e1 env)) (eval e2 env)
eval (Lam x e) env   = F (\v -> (eval e ((x,v):env)))
eval (Num n) env     = N n
eval (Var x) env     = case (lookup x env) of 
                         Just v -> v
                         Nothing -> error ("Unbound variable " ++ x)

And that's it. The few boring supporting definitions are as follows.

data Val = F (Val -> Val)  | N Int
unF (F x) = x
instance Show Val where 
    show (F _) = "<procedure>"
    show (N n) = show n

In other words, if you copy paste the above three blocks of code into a Haskell source file you will have a working interpreter, which you can test using ghci as follows:

*Main> eval (App (Lam "x" (Var "x")) (Num 1)) []
1
*Main> eval (Var "x") []
*** Exception: Unbound variable x

You could read about creating languages in the classic SICP or EOPL or the little book. If you wish to build a typed language, you may have to read some more.

That said, if you are going to build languages, might I strongly recommend lots of reading first. For one, its very rewarding. And secondly too many hideous languages have been inflicted on the world by people who don't know how to create languages (many of which have become very popular for various historical reasons) and we are stuck with them.

回复收藏 0 原文