That's a great list of goals. Some folks go to college for four years to learn that.
I don't know your background, but I'll assume you've done some basic programming (BASIC programming?) and some assembly language. If you haven't, that's the place to start.
Learning something about grammars and regular expressions, and then using that to develop a parser and interpreter for parts of a simple language with an easy grammar, like Pascal, would be a way to learn the front end. Then move on and add code that generates assembly: the back end.
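To make the front-end idea concrete, here is a minimal sketch in Python, assuming a made-up arithmetic language rather than real Pascal. A regular expression does the lexing, and a small recursive-descent parser follows the grammar in the docstring and evaluates as it goes. The names `tokenize`, `Parser`, and `evaluate` are mine, purely for illustration.

```python
import re

# One regular expression describes every token of the toy language.
TOKEN_RE = re.compile(r"(?P<num>\d+)|(?P<op>[-+*/()])|(?P<ws>\s+)")

def tokenize(text):
    """Turn source text into ('num', int) and ('op', str) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup == "num":
            tokens.append(("num", int(m.group())))
        elif m.lastgroup == "op":
            tokens.append(("op", m.group()))
        pos = m.end()  # whitespace is matched and skipped
    return tokens

class Parser:
    """Recursive-descent parser that evaluates while it parses.

    Grammar:
        expr   := term (('+' | '-') term)*
        term   := factor (('*' | '/') factor)*
        factor := NUMBER | '(' expr ')' | '-' factor
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("eof", None)

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        value = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            _, op = self.advance()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        value = self.factor()
        while self.peek() in (("op", "*"), ("op", "/")):
            _, op = self.advance()
            rhs = self.factor()
            # Integer division, in the spirit of Pascal's 'div'.
            value = value * rhs if op == "*" else value // rhs
        return value

    def factor(self):
        kind, val = self.advance()
        if kind == "num":
            return val
        if (kind, val) == ("op", "-"):
            return -self.factor()
        if (kind, val) == ("op", "("):
            value = self.expr()
            if self.advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return value
        raise SyntaxError(f"unexpected token {val!r}")

def evaluate(text):
    parser = Parser(tokenize(text))
    result = parser.expr()
    if parser.peek()[0] != "eof":
        raise SyntaxError("trailing input")
    return result
```

Calling `evaluate("(1 + 2) * 3")` walks the three grammar rules once each. Swapping the arithmetic in `expr` and `term` for code that emits instructions is one way to turn this interpreter into the start of a back end.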
My first piece of advice is to read a high-level book on the subject. I am assuming you haven't done this yet and were planning to simply work along with some online tutorials. At least for me, I tend to want to dive headfirst into things like this, but then I quickly feel in over my head and give up on the project. Making sure I have a really good high-level understanding of a project before starting helps me tremendously.
One series I might recommend is the Write Great Code books. I can't vouch for the whole series as I haven't read them all, but we have them at my office and I have used them a number of times to get a pretty good grasp of a subject before I dove headfirst into something. For instance, in one case that may relate directly to your plan, I needed to understand how the GCC compiler organizes the ELF binary it generates, what each section is, and what is stored there. (This was for an embedded system and we were expanding our RAM, so I had to reorganize some stuff...)
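If you want a feel for what "how the ELF binary is organized" means in practice, here is a rough sketch (mine, not from the books) that decodes the fixed-size ELF file header in Python. It assumes a 64-bit little-endian ELF file, which may not match an embedded target. The function name and dictionary keys are hypothetical, and the field order follows the ELF64 header layout from the ELF specification.

```python
import struct

# The 64-byte ELF64 file header, little-endian: e_ident[16], e_type, e_machine,
# e_version, e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize,
# e_phnum, e_shentsize, e_shnum, e_shstrndx.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")

def read_elf64_header(data):
    """Decode the header fields that locate the section header table."""
    if len(data) < ELF64_HEADER.size:
        raise ValueError("too short to be an ELF64 header")
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[4] is the class (2 = 64-bit), e_ident[5] the byte order (1 = little).
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch handles only 64-bit little-endian ELF")
    (_ident, e_type, e_machine, _version, e_entry, _phoff, e_shoff,
     _flags, _ehsize, _phentsize, _phnum, e_shentsize, e_shnum,
     e_shstrndx) = ELF64_HEADER.unpack_from(data)
    return {
        "type": e_type,
        "machine": e_machine,
        "entry": e_entry,
        "section_table_offset": e_shoff,
        "section_entry_size": e_shentsize,
        "section_count": e_shnum,
        "section_names_index": e_shstrndx,
    }
```

To try it, read the first 64 bytes of any compiled program in binary mode and pass them in. For real work, `readelf -S` or `objdump -h` from binutils will list the sections for you. The point of the sketch is only that the header is a plain table of offsets and counts.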
You said "Nothing too hard"... In my opinion, your steps are already pretty difficult, especially if your end goal is to learn about compilers and operating systems. I would skip the whole virtual machine, at least for now. In reality, processors are pretty simple, and since you already know that a processor just executes a 'machine language', you probably already have a good starting grasp.
I would instead start with step 3 and just write your own compiler. I took a compilers class in college, and by the end of the semester I had a working Pascal compiler that I built from the ground up using LEX and YACC. It was quite enlightening. You might also look at Bison, GNU's yacc-compatible parser generator, which is commonly used in place of yacc (usually alongside Flex, the lex counterpart). I've never used it, though.
Also, simply doing little exercises in your free time, like figuring out how to get GCC to compile hello world into fewer than X bytes, will teach you a lot more than you think about how that stuff works. (There are quite a few examples of this on the web, by the way.)
Have fun!
How much programming do you already know?
Writing a FORTH interpreter is a good exercise. It's comparatively simple, and the language and semantics are already well-defined, so you don't need to go and design your own system from scratch. FORTH typically also has a compiler (although it is nothing like a C compiler) and may have an assembler built in, so you could investigate those also. It will give you the mental tools for managing memory, handling pointers, resolving references, and such.
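To show how small the core can be, here is a toy sketch of a FORTH-like interpreter in Python. It makes several simplifying assumptions: only a handful of words, integers only, and colon definitions stored as plain token lists rather than compiled the way a real FORTH does it. The function name `run_forth` and the choice to collect printed values in a list are mine.

```python
def run_forth(source):
    """Interpret a tiny FORTH subset; return (data stack, printed values)."""
    stack, output, words = [], [], {}

    def binary(fn):
        # Wrap a two-argument function as a stack operation: ( a b -- result ).
        def op():
            b = stack.pop()
            a = stack.pop()
            stack.append(fn(a, b))
        return op

    def swap():
        stack[-1], stack[-2] = stack[-2], stack[-1]

    primitives = {
        "+": binary(lambda a, b: a + b),
        "-": binary(lambda a, b: a - b),
        "*": binary(lambda a, b: a * b),
        "dup": lambda: stack.append(stack[-1]),
        "drop": lambda: stack.pop(),
        "swap": swap,
        ".": lambda: output.append(stack.pop()),
    }

    def execute(tokens):
        it = iter(tokens)
        for tok in it:
            if tok == ":":
                # Colon definition ": name body ;" stores the body for later.
                name = next(it)
                body = []
                for word in it:
                    if word == ";":
                        break
                    body.append(word)
                words[name] = body
            elif tok in words:
                execute(words[tok])  # user-defined word: run its stored body
            elif tok in primitives:
                primitives[tok]()
            else:
                stack.append(int(tok))  # anything else must be a number

    # Assumption: words are treated as case-insensitive.
    execute(source.lower().split())
    return stack, output
```

For example, `run_forth(": square dup * ; 3 square 4 square + .")` defines a word, uses it twice, and prints the sum. Replacing the stored token lists with something closer to threaded code is where the memory-management and reference-resolving lessons start to show up.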
Looking at an existing simple compiler will also help. Once you have internalized what it is that a compiler does - translate one set of symbols into another - then you might want to start looking at parsing grammars and related topics. There's an awful lot of information out there; take it a little at a time or you'll get overwhelmed very easily.
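As a small illustration of "translate one set of symbols into another", here is a sketch that borrows Python's own parser (the standard `ast` module) so the translation step can be seen on its own. It turns an arithmetic expression into instructions for a made-up stack machine. The mnemonics `PUSH`, `ADD`, `SUB`, and `MUL` and the helper `run` are assumptions for the example, not any real instruction set.

```python
import ast

# Map Python AST operator node types to hypothetical stack-machine mnemonics.
OPCODES = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}

def compile_expr(source):
    """Translate an arithmetic expression into a list of stack instructions."""
    tree = ast.parse(source, mode="eval")
    code = []

    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            code.append(("PUSH", node.value))
        elif isinstance(node, ast.BinOp) and type(node.op) in OPCODES:
            # Post-order walk: operands first, then the operator.
            emit(node.left)
            emit(node.right)
            code.append((OPCODES[type(node.op)],))
        else:
            raise SyntaxError(f"unsupported construct: {type(node).__name__}")

    emit(tree.body)
    return code

def run(code):
    """Execute the generated instructions to check the translation."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            b = stack.pop()
            a = stack.pop()
            stack.append({"ADD": a + b, "SUB": a - b, "MUL": a * b}[instr[0]])
    return stack.pop()
```

`compile_expr("1 + 2 * 3")` produces three pushes followed by `MUL` and `ADD`, and `run` on that list gives 7. Once this feels obvious, writing your own parser to replace `ast.parse` is the natural next step.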