Creating a simple domain-specific language

Posted on 2024-12-01 03:00:22

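I would like to learn how to create a domain-specific language. For now the domain is quite basic: just some variables, a few loops, and if statements.

Edit: The language will be non-English, with a very simple syntax.

I am thinking of targeting the Java Virtual Machine, i.e. compiling to Java bytecode.

Currently I know how to write some simple grammars using ANTLR.

I know that ANTLR creates a lexer and a parser, but how do I proceed from there?

About semantic analysis: does it have to be written by hand, or are there tools that generate it?
How can the output of the lexer and parser be converted to Java bytecode?
I know there are libraries such as ASM or BCEL, but what exactly is the process?
Are there any frameworks for doing this? If so, which is the simplest?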

我家小可爱 2024-12-08 03:00:22

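You should try Xtext, an Eclipse-based DSL toolkit. Version 2 is quite powerful and stable. From its home page you can find plenty of resources to get you started, including some video tutorials. Since the Eclipse ecosystem revolves around Java, it seems to be the best choice for you.

You can also try MPS, but it is a projectional editor, and beginners may find it more difficult. Nevertheless, it is no less powerful than Xtext.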

只是在用心讲痛 2024-12-08 03:00:22

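If your goal is to learn as much as possible about compilers, then you do have to go the hard way: write an ad hoc parser (without ANTLR and the like), and write your own semantic passes and your own code generation.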
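
To give a feel for what the hard way involves, here is a minimal sketch in Python (chosen only for brevity; the structure carries over to Java). Everything in it is an assumption made for illustration: the toy language (assignments and a `while` loop that runs until its expression is zero), the tuple-shaped AST, and the instruction names `push`, `load`, `store`, `jz` and `jmp`, which only loosely imitate the stack-oriented style of JVM bytecode and are not real JVM instructions.

```python
import re

# One token per match: a number, a name (Unicode letters allowed, so keywords
# and identifiers need not be English), or any other single character.
TOKEN = re.compile(r"\s*(?:(\d+)|([^\W\d]\w*)|(.))")

def tokenize(src):
    """Split source text into (kind, value) pairs, ending with an eof marker."""
    tokens = []
    for num, name, op in TOKEN.findall(src):
        if num:
            tokens.append(("num", int(num)))
        elif name:
            tokens.append(("name", name))
        elif op.strip():                      # skip stray whitespace matches
            tokens.append(("op", op))
    return tokens + [("eof", None)]

class Parser:
    """Recursive descent: one method per grammar rule, producing tuple ASTs."""
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos]

    def take(self, kind, value=None):
        tok = self.peek()
        if tok[0] != kind or (value is not None and tok[1] != value):
            raise SyntaxError(f"expected {value or kind}, got {tok[1]!r}")
        self.pos += 1
        return tok[1]

    def block(self, closer):                  # block := statement*
        stmts = []
        while self.peek() != closer:
            stmts.append(self.statement())
        return stmts

    def statement(self):    # 'while' expr '{' block '}'  |  NAME '=' expr ';'
        if self.peek() == ("name", "while"):
            self.take("name")
            cond = self.expr()
            self.take("op", "{")
            body = self.block(("op", "}"))
            self.take("op", "}")
            return ("while", cond, body)
        target = self.take("name")
        self.take("op", "=")
        value = self.expr()
        self.take("op", ";")
        return ("assign", target, value)

    def expr(self):                           # expr := atom (('+' | '-') atom)*
        left = self.atom()
        while self.peek() in (("op", "+"), ("op", "-")):
            left = (self.take("op"), left, self.atom())
        return left

    def atom(self):                           # atom := NUM | NAME
        if self.peek()[0] == "num":
            return ("num", self.take("num"))
        return ("var", self.take("name"))

def gen(node, code):
    """Code generation: append stack-machine instructions for one AST node."""
    kind = node[0]
    if kind == "num":
        code.append(("push", node[1]))
    elif kind == "var":
        code.append(("load", node[1]))
    elif kind == "assign":
        gen(node[2], code)
        code.append(("store", node[1]))
    elif kind == "while":
        start = len(code)
        gen(node[1], code)
        exit_jump = len(code)
        code.append(None)                     # patched once the loop end is known
        for stmt in node[2]:
            gen(stmt, code)
        code.append(("jmp", start))
        code[exit_jump] = ("jz", len(code))   # leave the loop when the test is zero
    else:                                     # '+' or '-'
        gen(node[1], code)
        gen(node[2], code)
        code.append((kind,))

def compile_source(src):
    """Parse a whole program and return its flat instruction list."""
    code = []
    for stmt in Parser(tokenize(src)).block(("eof", None)):
        gen(stmt, code)
    return code
```

Calling `compile_source("n = 2; while n { n = n - 1; }")` returns a flat list of instruction tuples. In a real JVM back end, the `gen` step is roughly where a library such as ASM or BCEL would be asked to emit actual bytecode instead of tuples.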

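Otherwise, you are better off extending an existing extensible language with your DSL, reusing its parser, its semantics, and its code generation. For example, you can easily implement an almost arbitrarily complex DSL on top of Clojure macros (and since Clojure itself is compiled to JVM bytecode, you get that for free).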
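
Clojure macros are not reproduced here. As a rough analogue of the same idea (reuse the host language's parser and supply only the meaning yourself), the following Python sketch lets the standard `ast` module parse the DSL's expressions and then interprets a whitelisted subset of the resulting tree. The function name `evaluate` and the choice of allowed operators are assumptions made for this example.

```python
import ast
import operator

# The only operators this toy DSL accepts; anything else is rejected.
ALLOWED_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

def evaluate(source, variables):
    """Evaluate a DSL expression, using Python's own parser for the syntax."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return variables[node.id]         # KeyError for unknown variables
        if isinstance(node, ast.BinOp) and type(node.op) in ALLOWED_OPS:
            return ALLOWED_OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"not part of the DSL: {type(node).__name__}")
    return walk(ast.parse(source, mode="eval"))
```

For instance, `evaluate("price * qty + 2", {"price": 3, "qty": 4})` walks the tree that Python's parser built, while something like a function call is refused because no rule covers it. The trade-off is the one the answer implies: you write no parser at all, but your DSL's syntax must fit inside the host language's syntax.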

无所的.畏惧 2024-12-08 03:00:22

A DSL with simple syntax may or may not mean simple semantics.

Simple semantics may or may not mean easy translation to a target language; such translations are "technically easy" only if the DSL and the target language share a lot of common data types and execution models. (Constraint systems have simple semantics, but translating them to Fortran is really hard!) (You gotta wonder: if translating your DSL is easy, why do you have it?)

If you want to build a DSL (in your case you stick with easy because you are learning), you want DSL compiler infrastructure that has whatever you need in it, including support for difficult translations. "What is needed" to handle translating all DSLs to all possible target languages is clearly an impossibly large set of machinery.

However, a number of things are clearly helpful:

Strong parsing machinery (who wants to diddle with grammars whose structure is forced by the weakness of the parsing machinery? If you don't know what this means, go read about LL(1) grammars as an example)
Automatic construction of a representation (e.g., an abstract syntax tree) of the parsed DSL
Ability to access/modify/build new ASTs
Ability to capture information about symbols and their meaning (symbol tables)
Ability to build analyses of the AST for the DSL, to support translations that require information from "far away" in the tree to influence the translation at a particular point in the tree
Ability to reorganize the AST easily to achieve local optimizations
Ability to construct and analyze control-flow and data-flow information, if the DSL has some procedural aspects and the code generation requires deep reasoning or optimization
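
Two of these items, symbol tables and local AST rewrites, can be sketched in a few lines. This is not how DMS or any particular toolkit does it; it is a hand-rolled Python illustration that assumes a made-up AST of nested tuples: `("num", 3)`, `("var", "x")`, `("+", left, right)`, `("assign", name, expr)` and `("while", cond, body)`.

```python
BINARY_OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}

def check(stmts, symbols=None):
    """Semantic pass: record assigned names, report reads of unassigned ones."""
    symbols = set() if symbols is None else symbols   # a very small symbol table
    errors = []

    def visit_expr(node):
        if node[0] == "var" and node[1] not in symbols:
            errors.append(f"'{node[1]}' is used before assignment")
        elif node[0] in BINARY_OPS:
            visit_expr(node[1])
            visit_expr(node[2])

    for stmt in stmts:
        if stmt[0] == "assign":
            visit_expr(stmt[2])               # check the right-hand side first
            symbols.add(stmt[1])
        elif stmt[0] == "while":
            visit_expr(stmt[1])
            errors.extend(check(stmt[2], symbols))
    return errors

def fold(node):
    """Local optimization: replace operator nodes whose operands are constants."""
    if node[0] in BINARY_OPS:
        left, right = fold(node[1]), fold(node[2])
        if left[0] == "num" and right[0] == "num":
            return ("num", BINARY_OPS[node[0]](left[1], right[1]))
        return (node[0], left, right)
    return node
```

Here `check` is the kind of analysis that needs information from elsewhere in the tree (an earlier assignment decides whether a later read is valid), and `fold` is a tree-to-tree rewrite. Even at this size both had to be written by hand, which is the cost the answer is pointing at.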

Most of the tools available for "building DSL generators" provide some kind of parsing, perhaps tree building, and then leave you to fill in all the rest. This puts you in the position of having a small, clean DSL but taking forever to implement it. That's not good. You really want all that infrastructure.

Our DMS Software Reengineering Toolkit has all the infrastructure sketched above and more. (It clearly doesn't, and can't, have the moon.) You can see a complete, all-in-one-"page", simple DSL example that exercises some interesting parts of this machinery.
