我将如何用 C 语言编写解释器?

发布于 2024-11-27 01:41:54 字数 1436 浏览 0 评论 0原文

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

旧城空念 2024-12-04 01:41:54

开始编写解释器的一个好方法是编写一个简单的机器模拟器。这是一种简单的语言,您可以为其编写解释器:

该语言有一个堆栈和 6 条指令:

push # 将数字压入堆栈

pop # pop off the first number on the stack

add # 弹出堆栈顶部的 2 项,并将它们的总和压入堆栈。 (请记住,您可以添加负数,因此您也可以使用减法)。您还可以通过使用其他一些指令与此指令创建循环来获得乘法。

ifeq

# 检查栈顶,如果为0,则继续,否则跳转到

,其中

< /code> 是行号

jump

# 跳转到行号

print # 打印栈顶的值

dup # 推送副本将堆栈顶部的内容放回到堆栈中。

一旦您编写了一个可以接受这些指令并执行它们的程序,您实际上就创建了一个非常简单的基于堆栈的虚拟机。由于这是一种非常低级的语言,因此您不需要了解 AST 是什么、如何将语法解析为 AST、并将其翻译为机器代码等。这对于教程项目来说太复杂了。从这里开始,一旦创建了这个小虚拟机,您就可以开始考虑如何将一些常见的构造转换到这台机器中。例如,您可能想考虑如何将 C if/else 语句或 while 循环翻译成这种语言。

编辑:

从下面的评论来看,听起来您需要更多的 C 经验才能解决此任务。

我建议首先了解以下主题:

  • scanf、printf、putchar、getchar - 基本 C IO 函数
  • struct - C 中的基本数据结构
  • 的区别
  • malloc - 如何分配内存,以及堆栈内存和堆内存链接 列表 - 以及如何实现堆栈,然后可能是二叉树(您需要
    首先了解结构和 malloc)

然后再了解一下 string.h 库也会很好
- strcmp、strdup - 几个有用的字符串函数。

简而言之,与Python相比,C的学习曲线要​​高得多,只是因为它是一种较低级别的语言,并且你必须管理自己的内存,所以在尝试编写解释器之前先学习一些关于C的基本知识是很好的,即使如果你已经知道如何用 python 编写一个。

A great way to get started writing an interpreter is to write a simple machine simulator. Here's a simple language you can write an interpreter for:

The language has a stack and 6 instructions:

push <num> # push a number on to the stack

pop # pop off the first number on the stack

add # pop off the top 2 items on the stack and push their sum on to the stack. (remember you can add negative numbers, so you have subtraction covered too). You can also get multiplication my creating a loop using some of the other instructions with this one.

ifeq <address> # examine the top of the stack, if it's 0, continue, else, jump to <address> where <address> is a line number

jump <address> # jump to a line number

print # print the value at the top of the stack

dup # push a copy of what's at the top of the stack back onto the stack.

Once you've written a program that can take these instructions and execute them, you've essentially created a very simple stack based virtual machine. Since this is a very low level language, you won't need to understand what an AST is, how to parse a grammar into an AST, and translate it to machine code, etc. That's too complicated for a tutorial project. Start with this, and once you've created this little VM, you can start thinking about how you can translate some common constructs into this machine. e.g. you might want to think about how you might translate a C if/else statement or while loop into this language.

Edit:

From the comments below, it sounds like you need a bit more experience with C before you can tackle this task.

What I would suggest is to first learn about the following topics:

  • scanf, printf, putchar, getchar - basic C IO functions
  • struct - the basic data structure in C
  • malloc - how to allocate memory, and the difference between stack memory and heap memory
  • linked lists - and how to implement a stack, then perhaps a binary tree (you'll need to
    understand structs and malloc first)

Then it'll be good to learn a bit more about the string.h library as well
- strcmp, strdup - a couple useful string functions that will be useful.

In short, C has a much higher learning curve compared to python, just because it's a lower level language and you have to manage your own memory, so it's good to learn a few basic things about C first before trying to write an interpreter, even if you already know how to write one in python.

腹黑女流氓 2024-12-04 01:41:54

解释器和编译器之间的唯一区别是,您不是从 AST 生成代码,而是在 VM 中执行它。一旦你理解了这一点,几乎任何编译器书籍,甚至红龙书第一版不是第二版!)就足够了。

The only difference between an interpreter and a compiler is that instead of generating code from the AST, you execute it in a VM instead. Once you understand this, almost any compiler book, even the Red Dragon Book (first edition, not second!), is enough.

无远思近则忧 2024-12-04 01:41:54

我发现这有点晚了,但是,当我搜索编写解释器时,该线程出现在结果列表中的第二位,并且没有人提到任何非常具体的内容,我将提供以下示例:

免责声明:这只是我匆忙编写的一些简单代码,以便为下面的解释奠定基础,因此并不完美,但它可以编译并运行,并且似乎给出了预期的答案。

阅读从下到上遵循 C 代码:

#include <stdio.h>
#include <stdlib.h>

double expression(void);

double vars[26]; // variables

char get(void) { char c = getchar(); return c; } // get one byte
char peek(void) { char c = getchar(); ungetc(c, stdin); return c; } // peek at next byte
double number(void) { double d; scanf("%lf", &d); return d; } // read one double

void expect(char c) { // expect char c from stream
    char d = get();
    if (c != d) {
        fprintf(stderr, "Error: Expected %c but got %c.\n", c, d);
    }
}

double factor(void) { // read a factor
    double f;
    char c = peek();
    if (c == '(') { // an expression inside parantesis?
        expect('(');
        f = expression();
        expect(')');
    } else if (c >= 'A' && c <= 'Z') { // a variable ?
        expect(c);
        f = vars[c - 'A'];
    } else { // or, a number?
        f = number();
    }
    return f;
}

double term(void) { // read a term
    double t = factor();
    while (peek() == '*' || peek() == '/') { // * or / more factors
        char c = get();
        if (c == '*') {
            t = t * factor();
        } else {
            t = t / factor();
        }
    }
    return t;
}

double expression(void) { // read an expression
    double e = term();
    while (peek() == '+' || peek() == '-') { // + or - more terms
        char c = get();
        if (c == '+') {
            e = e + term();
        } else {
            e = e - term();
        }
    }
    return e;
}

double statement(void) { // read a statement
    double ret;
    char c = peek();
    if (c >= 'A' && c <= 'Z') { // variable ?
        expect(c);
        if (peek() == '=') { // assignment ?
            expect('=');
            double val = expression();
            vars[c - 'A'] = val;
            ret = val;
        } else {
            ungetc(c, stdin);
            ret = expression();
        }
    } else {
        ret = expression();
    }
    expect('\n');
    return ret;
}

int main(void) {
    printf("> "); fflush(stdout);

    for (;;) {
        double v = statement();
        printf(" = %lf\n> ", v); fflush(stdout);
    }
    return EXIT_SUCCESS;
}

这是一个简单的递归下降解析器,用于支持单字母变量的基本数学表达式。运行它并输入一些语句会产生以下结果:

> (1+2)*3
 = 9.000000
> A=1
 = 1.000000
> B=2
 = 2.000000
> C=3
 = 3.000000
> (A+B)*C
 = 9.000000

您可以更改 get()、peek() 和 number() 以从文件或代码行列表中读取。您还应该创建一个函数来读取标识符(基本上是单词)。然后,扩展 statements() 函数,以便能够更改接下来运行的行以进行分支。最后你将你想要的分支操作添加到语句函数中,

if "condition" then 
    "statements" 
else 
    "statements" 
endif. 

while "condition" do
    "statements"
endwhile

function fac(x)
   if x = 0 then
      return 1
   else
      return x*fac(x-1) 
   endif
endfunction

显然你可以决定你喜欢的语法。您需要考虑定义函数的方式以及如何处理参数/参数变量、局部变量和全局变量。如果更喜欢数组和数据结构。参考文献∕指针。输入/输出?
为了处理递归函数调用,您可能需要使用堆栈。

在我看来,使用 C++ 和 STL 来完成所有这些工作会更容易。例如,一个 std::map 可用于保存局部变量,另一个映射可用于全局变量...

当然可以编写一个编译器,从代码中构建抽象语法树。然后遍历这棵树以生成机器代码或在虚拟机(如 Java 和 .Net)上执行的某种字节代码。这比天真地逐行解析并执行它们提供了更好的性能,但在我看来,这不是编写解释器。即编写编译器及其目标虚拟机。

如果有人想学习编写解释器,他们应该尝试制作最基本的简单实用的工作解释器。

I see this is a bit of a late reply, however since this thread showed up at second place in the result list when I did a search for writing an interpreter and no one have mentioned anything very concrete I will provide the following example:

Disclaimer: This is just some simple code I wrote in a hurry in order to have a foundation for the explanation below and are therefore not perfect, but it compiles and runs, and seems to give the expected answers.

Read the following C-code from bottom to top:

#include <stdio.h>
#include <stdlib.h>

double expression(void);

double vars[26]; // variables

char get(void) { char c = getchar(); return c; } // get one byte
char peek(void) { char c = getchar(); ungetc(c, stdin); return c; } // peek at next byte
double number(void) { double d; scanf("%lf", &d); return d; } // read one double

void expect(char c) { // expect char c from stream
    char d = get();
    if (c != d) {
        fprintf(stderr, "Error: Expected %c but got %c.\n", c, d);
    }
}

double factor(void) { // read a factor
    double f;
    char c = peek();
    if (c == '(') { // an expression inside parantesis?
        expect('(');
        f = expression();
        expect(')');
    } else if (c >= 'A' && c <= 'Z') { // a variable ?
        expect(c);
        f = vars[c - 'A'];
    } else { // or, a number?
        f = number();
    }
    return f;
}

double term(void) { // read a term
    double t = factor();
    while (peek() == '*' || peek() == '/') { // * or / more factors
        char c = get();
        if (c == '*') {
            t = t * factor();
        } else {
            t = t / factor();
        }
    }
    return t;
}

double expression(void) { // read an expression
    double e = term();
    while (peek() == '+' || peek() == '-') { // + or - more terms
        char c = get();
        if (c == '+') {
            e = e + term();
        } else {
            e = e - term();
        }
    }
    return e;
}

double statement(void) { // read a statement
    double ret;
    char c = peek();
    if (c >= 'A' && c <= 'Z') { // variable ?
        expect(c);
        if (peek() == '=') { // assignment ?
            expect('=');
            double val = expression();
            vars[c - 'A'] = val;
            ret = val;
        } else {
            ungetc(c, stdin);
            ret = expression();
        }
    } else {
        ret = expression();
    }
    expect('\n');
    return ret;
}

int main(void) {
    printf("> "); fflush(stdout);

    for (;;) {
        double v = statement();
        printf(" = %lf\n> ", v); fflush(stdout);
    }
    return EXIT_SUCCESS;
}

This is an simple recursive descend parser for basic mathematical expressions supporting one letter variables. Running it and typing some statements yields the following results:

> (1+2)*3
 = 9.000000
> A=1
 = 1.000000
> B=2
 = 2.000000
> C=3
 = 3.000000
> (A+B)*C
 = 9.000000

You can alter the get(), peek() and number() to read from a file or list of code lines. Also you should make a function to read identifiers (basically words). Then you expand the statement() function to be able to alter which line it runs next in order to do branching. Last you add the branch operations you want to the statement function, like

if "condition" then 
    "statements" 
else 
    "statements" 
endif. 

while "condition" do
    "statements"
endwhile

function fac(x)
   if x = 0 then
      return 1
   else
      return x*fac(x-1) 
   endif
endfunction

Obviously you can decide the syntax to be as you like. You need to think about ways of define functions and how to handle arguments/parameter variables, local variables and global variables. If preferable arrays and data structures. References∕pointers. Input/output?
In order to handle recursive function calls you probably need to use a stack.

In my opinion this would be easier to do all this with C++ and STL. Where for example one std::map could be used to hold local variables, and another map could be used for globals...

It is of course possible to write a compiler that build an abstract syntax tree out of the code. Then travels this tree in order to produce either machine code or some kind of byte code which executed on a virtual machine (like Java and .Net). This gives better performance than naively parse line by line and executing them, but in my opinion that is NOT writing an interpreter. That is writing both a compiler and its targeted virtual machine.

If someone wants to learn to write an interpreter, they should try making the most basic simple and practical working interpreter.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文