Creating a simple domain-specific language
I am curious to learn about creating a domain-specific language. For now the domain is quite basic: it just has some variables and runs some loops and if statements.
Edit: The language will be non-English-based, with a very simple syntax.
I am thinking of targeting the Java Virtual Machine, i.e. compiling to Java bytecode.
Currently I know how to write some simple grammars using ANTLR.
I know that ANTLR creates a lexer and parser but how do I go forward from here?
- About semantic analysis: does it have to be written manually, or are there tools to create it?
- How can the output from the lexer and parser be converted to Java bytecode?
- I know that there are libraries like ASM or BCEL, but what is the exact procedure?
- Are there any frameworks for doing this? And if there are, which is the simplest one?
Comments (3)
You should try Xtext, an Eclipse-based DSL toolkit. Version 2 is quite powerful and stable. Its home page has plenty of resources to get you started, including some video tutorials. Because the Eclipse ecosystem revolves around Java, it seems the best choice for you.
You can also try MPS, but it is a projectional editor, and beginners may find it more difficult. It is, however, no less powerful than Xtext.
If your goal is to learn as much as possible about compilers, then you do have to go the hard way: write an ad hoc parser (no ANTLR or the like), write your own semantic passes, and write your own code generation.
Otherwise, you'd better extend an existing extensible language with your DSL, reusing its parser, its semantics and its code generation functionality. For example, you can easily implement an almost arbitrarily complex DSL on top of Clojure macros (and since Clojure itself compiles to JVM bytecode, you get that for free).
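To give a feel for what the hand-written route involves, here is a minimal sketch, not a definitive implementation: a recursive-descent parser for arithmetic expressions that emits stack-machine instructions while it parses. The class name `TinyCompiler` and the mnemonics `PUSH`, `LOAD`, `ADD`, `SUB` and `MUL` are made up for illustration; they only loosely resemble JVM instructions such as `iload` and `iadd`, and they are not ASM or BCEL calls. Since the sketch relies on `Character.isLetter`, identifiers written in non-Latin scripts should be accepted as well.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

/** Hypothetical toy compiler: parses "1 + x * 3" and emits stack-machine code. */
class TinyCompiler {
    private final String src;
    private int pos = 0;
    private final List<String> code = new ArrayList<>();

    private TinyCompiler(String src) { this.src = src; }

    /** Compiles an expression into instructions such as "PUSH 2" or "ADD". */
    static List<String> compile(String src) {
        TinyCompiler c = new TinyCompiler(src);
        c.expr();
        c.skipSpaces();
        if (c.pos != c.src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + c.pos);
        }
        return c.code;
    }

    // expr := term (('+' | '-') term)*
    private void expr() {
        term();
        while (true) {
            if (accept('+')) { term(); code.add("ADD"); }
            else if (accept('-')) { term(); code.add("SUB"); }
            else return;
        }
    }

    // term := factor ('*' factor)*
    private void term() {
        factor();
        while (accept('*')) { factor(); code.add("MUL"); }
    }

    // factor := NUMBER | IDENTIFIER | '(' expr ')'
    private void factor() {
        if (accept('(')) {
            expr();
            if (!accept(')')) throw new IllegalArgumentException("Expected ')' at position " + pos);
            return;
        }
        int start = pos;
        if (pos < src.length() && Character.isDigit(src.charAt(pos))) {
            while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
            code.add("PUSH " + src.substring(start, pos));
        } else if (pos < src.length() && Character.isLetter(src.charAt(pos))) {
            while (pos < src.length() && Character.isLetterOrDigit(src.charAt(pos))) pos++;
            code.add("LOAD " + src.substring(start, pos));
        } else {
            throw new IllegalArgumentException("Expected a number or name at position " + pos);
        }
    }

    /** Skips whitespace, then consumes the expected character if it is next. */
    private boolean accept(char expected) {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == expected) { pos++; return true; }
        return false;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    /** Runs the emitted code on an operand stack, similar in spirit to how the JVM evaluates expressions. */
    static int run(List<String> code, Map<String, Integer> vars) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String insn : code) {
            String[] parts = insn.split(" ");
            switch (parts[0]) {
                case "PUSH": stack.push(Integer.parseInt(parts[1])); break;
                case "LOAD": stack.push(vars.get(parts[1])); break;
                default: {
                    int right = stack.pop(), left = stack.pop();
                    if (parts[0].equals("ADD")) stack.push(left + right);
                    else if (parts[0].equals("SUB")) stack.push(left - right);
                    else stack.push(left * right);
                }
            }
        }
        return stack.pop();
    }
}
```

For instance, `TinyCompiler.compile("1 + x * 3")` yields `PUSH 1`, `LOAD x`, `PUSH 3`, `MUL`, `ADD`, and running that with `x = 4` gives 13. With a real backend, each emitted string would presumably become a call into a bytecode library instead; the parsing and emission order would stay the same.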
A DSL with simple syntax may or may not mean simple semantics.
Simple semantics may or may not mean easy translation to a target language; such translations are "technically easy" only if the DSL and the target language share a lot of common data types and execution models. (Constraint systems have simple semantics, but translating them to Fortran is really hard!) (You gotta wonder: if translating your DSL is easy, why do you have it?)
If you want to build a DSL (in your case you stick with easy because you are learning), you want DSL compiler infrastructure that has whatever you need in it, including support for difficult translations. "What is needed" to handle translating all DSLs to all possible target languages is clearly an impossibly large set of machinery.
However, a lot of machinery is clearly helpful:
- Strong parsing machinery, so that the structure of your grammar is not forced by the weakness of the parser. (If you don't know what this means, go read about LL(1) grammars as an example.)
- Analyses that bring information from "far away" in the tree to influence the translation at a particular point in the tree.
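As one illustration of information from "far away" in the tree, consider a declaration that determines whether a later use of a name is legal. The following is a minimal sketch of such a semantic pass, assuming a hypothetical hand-made AST (`Node.Declare`, `Node.Assign`, `Node.Block`) and a hypothetical `ScopeChecker` class; none of these names come from ANTLR, DMS, or any other tool. With ANTLR 4, a similar walk would more likely be written as a listener or visitor over the generated parse tree.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical AST for a toy DSL: declarations, assignments, and nested blocks. */
abstract class Node {
    static final class Declare extends Node {
        final String name;
        Declare(String name) { this.name = name; }
    }
    static final class Assign extends Node {
        final String target;
        final List<String> usedNames;   // names read on the right-hand side
        Assign(String target, String... usedNames) {
            this.target = target;
            this.usedNames = Arrays.asList(usedNames);
        }
    }
    static final class Block extends Node {
        final List<Node> body;
        Block(Node... body) { this.body = Arrays.asList(body); }
    }
}

/** Semantic pass: walks the tree with a stack of scopes and collects errors. */
class ScopeChecker {
    private final Deque<Set<String>> scopes = new ArrayDeque<>();
    private final List<String> errors = new ArrayList<>();

    static List<String> check(Node root) {
        ScopeChecker checker = new ScopeChecker();
        checker.scopes.push(new HashSet<>());   // global scope
        checker.visit(root);
        return checker.errors;
    }

    private void visit(Node node) {
        if (node instanceof Node.Declare) {
            String name = ((Node.Declare) node).name;
            // Set.add returns false when the name already exists in the current scope.
            if (!scopes.peek().add(name)) errors.add("duplicate declaration: " + name);
        } else if (node instanceof Node.Assign) {
            Node.Assign assign = (Node.Assign) node;
            requireDeclared(assign.target);
            for (String used : assign.usedNames) requireDeclared(used);
        } else if (node instanceof Node.Block) {
            scopes.push(new HashSet<>());       // names declared inside end with the block
            for (Node child : ((Node.Block) node).body) visit(child);
            scopes.pop();
        }
    }

    /** Searches from the innermost scope outwards; records an error if the name is unknown. */
    private void requireDeclared(String name) {
        for (Set<String> scope : scopes) {
            if (scope.contains(name)) return;
        }
        errors.add("undeclared variable: " + name);
    }
}
```

The stack of scopes is the symbol table here: it is the place where a fact established at one point in the tree (a declaration) is stored so that it can influence the handling of another point (a use). A code generator could consult the same table to decide, for example, which local variable slot a name maps to.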
Most of the tools available for "building DSL generators" provide some kind of parsing, perhaps tree building, and then leave you to fill in all the rest. This puts you in the position of having a small, clean DSL but taking forever to implement it. That's not good. You really want all that infrastructure.
Our DMS Software Reengineering Toolkit has all the infrastructure sketched above and more. (It clearly doesn't have, and can't have, the moon.) You can see a complete, all-in-one-"page", simple DSL example that exercises some interesting parts of this machinery.