Is "parsing" a subset of "compiling"?

Posted 2024-10-05 05:50:46

When I think of "compiling" I think of turning C++ code into a binary. Or perhaps C# into CLR byte code. But "parsing" could be something like parsing Python, or a web template language, where it doesn't need to produce any binaries, but can either execute the code immediately, statement-by-statement, or output HTML directly.

Would you basically be doing the same task in either case? Ignoring the language syntax, would compiling C++ be as difficult as parsing a website template file (Django, Smarty, whatever), or Python?

What I'm trying to get at is: if I study "compiling" or read a book on "compiling", will I necessarily pick up the skills to parse non-compiled languages?

Comments (4)

倾`听者〃 2024-10-12 05:50:46

Short answer: parsing is not a subset of compiling.

Long answer: generally, there are three steps to converting source code to another format:

  1. Lexing, which converts some form of input to a token stream.
  2. Parsing, which converts the token stream into an abstract syntax tree (AST).
  3. Compiling, which converts the AST into a set of executable instructions (native code, byte code, etc).

(For very simple languages, you may not even need a separate parsing step: you might be able to compile the token stream directly, or your parser could output native code directly.)

So start with a raw string like this:

let x = 0
while x < 10
    print x
    x := x + 1

A lexer is going to convert it into a token stream, probably something like this:

[LET; String("x"); EQ; Int(0); NEWLINE; WHILE; String("x");
 LT; Int(10); ... ]
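
As a rough illustration of the lexing step, here is a minimal Python sketch. The token names, the `KEYWORDS` table and the `lex` function are my own assumptions rather than anything from a real tool, and the sketch simply throws away indentation, which a real lexer for this toy language would have to turn into block tokens.

```python
import re

# Hypothetical token kinds, loosely following the token stream above.
KEYWORDS = {"let": "LET", "while": "WHILE"}

TOKEN_RE = re.compile(r"""
    (?P<INT>\d+)
  | (?P<NAME>[A-Za-z_]\w*)
  | (?P<ASSIGN>:=)
  | (?P<EQ>=)
  | (?P<LT><)
  | (?P<PLUS>\+)
  | (?P<NEWLINE>\n)
  | (?P<SKIP>[ \t]+)
""", re.VERBOSE)


def lex(source):
    """Turn source text into a list of (kind, value) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SKIP":
            continue  # spaces and tabs carry no meaning in this sketch
        if kind == "INT":
            tokens.append(("INT", int(text)))
        elif kind == "NAME":
            # Keywords are matched as names first, then reclassified.
            tokens.append((KEYWORDS[text], None) if text in KEYWORDS else ("NAME", text))
        else:
            tokens.append((kind, None))
    return tokens


SOURCE = "let x = 0\nwhile x < 10\n    print x\n    x := x + 1\n"
TOKENS = lex(SOURCE)
```

Note that nothing here knows what a `while` loop is; the lexer only classifies characters, and structure is left to the parser.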

The parser will convert the stream into a more meaningful data structure, your abstract syntax tree:

// AST definition
type expr =
    | Block of expr list
    | Assign of string * expr
    | While of expr * expr
    | Call of string * expr list
    | Add of expr * expr
    | LessThan of expr * expr
    | Var of string
    | Int of int

// AST instance created from token stream
Block
    [
        Assign("x", Int(0));
        While
        (
            LessThan(Var("x"), Int(10)),
            Block
                [
                    Call("print", [Var("x")]);
                    Assign("x", Add(Var("x"), Int(1)));
                ]
        );
    ]
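
To make the parsing step concrete, here is a small recursive-descent parser in Python for just the expression part of the language. Everything in it is an assumption made for illustration: the grammar in the docstring, the `(kind, value)` token shape, and the choice to represent AST nodes as plain tuples such as `("Add", left, right)`.

```python
class Parser:
    """Recursive-descent parser for a tiny, assumed expression grammar:

        comparison := sum [ '<' sum ]
        sum        := atom { '+' atom }
        atom       := INT | NAME
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # Kind of the next token, or None at end of input.
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def parse_comparison(self):
        left = self.parse_sum()
        if self.peek() == "LT":
            self.advance()
            return ("LessThan", left, self.parse_sum())
        return left

    def parse_sum(self):
        node = self.parse_atom()
        while self.peek() == "PLUS":
            self.advance()
            node = ("Add", node, self.parse_atom())  # left-associative
        return node

    def parse_atom(self):
        kind = self.peek()
        if kind == "INT":
            return ("Int", self.advance()[1])
        if kind == "NAME":
            return ("Var", self.advance()[1])
        raise SyntaxError(f"expected a number or name, got {kind}")


def parse_expression(tokens):
    parser = Parser(tokens)
    tree = parser.parse_comparison()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {parser.peek()}")
    return tree


# Tokens for the source text "x + 1 < 10".
tokens = [("NAME", "x"), ("PLUS", None), ("INT", 1), ("LT", None), ("INT", 10)]
tree = parse_expression(tokens)
```

Each grammar rule becomes one method, which is why this style is a common starting point for hand-written parsers. Statements and indentation-based blocks are left out to keep the sketch short.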

Once you have an AST, you can do whatever you want with it:

  • You could convert the AST to native code (compiling).
  • Or you could interpret the AST on the fly, as a dynamic programming language or a templating engine might.
  • Or you could iterate over the AST to make a syntax highlighter.
  • Or you could walk the AST and output equivalent code in another language.
  • Or you could look for all instances of Var("x") and replace them with Var("y") (similar to a code refactoring tool).
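
Two of these uses can be sketched in a few lines of Python: interpreting the tree directly and renaming a variable. The tuple encoding of the nodes and the `evaluate` and `rename` helpers are hypothetical, and `rename` also updates assignment targets, which goes slightly beyond replacing `Var` nodes so that the renamed program still runs.

```python
def evaluate(node, env, output):
    """Interpret an AST node directly, without generating any code."""
    kind = node[0]
    if kind == "Int":
        return node[1]
    if kind == "Var":
        return env[node[1]]
    if kind == "Add":
        return evaluate(node[1], env, output) + evaluate(node[2], env, output)
    if kind == "LessThan":
        return evaluate(node[1], env, output) < evaluate(node[2], env, output)
    if kind == "Assign":
        env[node[1]] = evaluate(node[2], env, output)
        return None
    if kind == "Block":
        for statement in node[1]:
            evaluate(statement, env, output)
        return None
    if kind == "While":
        while evaluate(node[1], env, output):
            evaluate(node[2], env, output)
        return None
    if kind == "Call" and node[1] == "print":
        # "print" is the only built-in this sketch knows about.
        output.append(" ".join(str(evaluate(arg, env, output)) for arg in node[2]))
        return None
    raise ValueError(f"unknown node: {node!r}")


def rename(node, old, new):
    """Return a copy of the tree with the variable `old` renamed to `new`."""
    kind = node[0]
    if kind == "Int":
        return node
    if kind == "Var":
        return ("Var", new if node[1] == old else node[1])
    if kind == "Assign":
        return ("Assign", new if node[1] == old else node[1], rename(node[2], old, new))
    if kind == "Block":
        return ("Block", [rename(s, old, new) for s in node[1]])
    if kind == "Call":
        return ("Call", node[1], [rename(a, old, new) for a in node[2]])
    # While, Add, LessThan: every remaining field is a child node.
    return (kind,) + tuple(rename(child, old, new) for child in node[1:])


PROGRAM = ("Block", [
    ("Assign", "x", ("Int", 0)),
    ("While", ("LessThan", ("Var", "x"), ("Int", 10)), ("Block", [
        ("Call", "print", [("Var", "x")]),
        ("Assign", "x", ("Add", ("Var", "x"), ("Int", 1))),
    ])),
])

env, output = {}, []
evaluate(PROGRAM, env, output)  # output collects the printed lines "0" through "9"

renamed_env, renamed_output = {}, []
evaluate(rename(PROGRAM, "x", "y"), renamed_env, renamed_output)
```

Both functions consume the same tree and neither emits instructions, which is the point: the AST is a general-purpose data structure, not a compiler-only artifact.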

So, while you usually parse input before compiling, that's not the same as saying that parsing is a subset of compiling.

黄昏下泛黄的笔记 2024-10-12 05:50:46

No, parsing and compiling can be completely independent.

  • A parser may not be emitting any code at all. It could be parsing some data object (JSON, XML, whatever)
  • A compiler may not have source code to start with - it could be presented with an abstract syntax tree, already parsed, and just have to emit the relevant code
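
Python's standard library happens to show both cases, so here is a short sketch (assuming Python 3.8 or later). `json.loads` parses text into data and emits no code, while the built-in `compile` accepts an `ast` tree that was assembled by hand, so no source text is ever parsed. The variable names are just for illustration.

```python
import ast
import json

# Parsing with no code generation: JSON text becomes plain data.
config = json.loads('{"name": "demo", "retries": 3}')

# Compiling with no parsing: build the AST for `result = 2 + 3` by hand.
tree = ast.Module(
    body=[
        ast.Assign(
            targets=[ast.Name(id="result", ctx=ast.Store())],
            value=ast.BinOp(
                left=ast.Constant(value=2),
                op=ast.Add(),
                right=ast.Constant(value=3),
            ),
        )
    ],
    type_ignores=[],
)
ast.fix_missing_locations(tree)  # fill in the line/column fields compile() requires
code = compile(tree, filename="<hand-built-ast>", mode="exec")

# Run the resulting bytecode to confirm it does what the tree describes.
namespace = {}
exec(code, namespace)
```

After this runs, `namespace["result"]` holds 5 even though no Python source for that assignment ever existed as text.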

Most compilers include a parsing step, but I don't think it's necessarily a "subset" of compiling, and parsing certainly doesn't have to have anything to do with compilation.

梦里°也失望 2024-10-12 05:50:46

..."will I pick up skills to parse non-compiled languages?" Yes, you will, but you can study parsing by itself.

What you will find, however, is that much of compiling (name resolution, type inference, pattern matching, compiling to instructions [pcode rather than machine code], high-performance execution, optimizing for special cases) is useful in processing non-compiled languages. So if you intend to do more than just literally parse, you'll want to study compiler technology anyway.
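
As one small example of a compiler-style pass that is just as useful for an interpreted language, here is a hedged Python sketch of a very simple name-resolution check. It assumes AST nodes encoded as tuples such as `("Assign", name, expr)`, and the `undefined_names` helper is hypothetical. It also treats anything assigned inside a loop body as defined afterwards, which a real tool would handle more carefully.

```python
def undefined_names(node, defined=None):
    """Return names that are read before any assignment, in source order."""
    if defined is None:
        defined = set()
    kind = node[0]
    if kind == "Int":
        return []
    if kind == "Var":
        return [] if node[1] in defined else [node[1]]
    if kind == "Assign":
        problems = undefined_names(node[2], defined)
        defined.add(node[1])  # the target is visible only after its value is checked
        return problems
    if kind == "Block":
        children = node[1]
    elif kind == "Call":
        children = node[2]
    else:  # While, Add, LessThan: every remaining field is a child node
        children = node[1:]
    problems = []
    for child in children:
        problems.extend(undefined_names(child, defined))
    return problems


# x := y + 1 followed by print x: "y" is never assigned.
BROKEN = ("Block", [
    ("Assign", "x", ("Add", ("Var", "y"), ("Int", 1))),
    ("Call", "print", [("Var", "x")]),
])
problems = undefined_names(BROKEN)
```

An interpreter or template engine can run a check like this before executing anything, so that a misspelled variable is reported up front instead of halfway through rendering.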

神仙妹妹 2024-10-12 05:50:46

Compiling is actually more difficult than parsing, since parsing is just one of the preliminary steps in compiling.

After parsing, a symbol table is built, and the actual binary code is generated from it.

In interpreted languages such as JavaScript, statements can be executed as each one is parsed.

http://en.wikipedia.org/wiki/Parsing
