I'd like to know good strategies for deploying a domain-specific language which must run under at least 2 languages (Java, C#) and probably more (Python, and possibly JavaScript).
Some background. We have developed and deployed a domain-specific language currently written in C#. It's deployed through a series of method calls whose arguments are either common language primitives (string, double, etc.), collections (IEnumerable, HashSet, ...) or objects in a domain-specific library (CMLMolecule, Point3, RealSquareMatrix). The library is well tested and the objects have to comply with a stable deployed XML schema, so change will be evolutionary and managed (at least that's the hope).
We hope the language will be used by a wide and partially computer-literate community, used to hacking their own solutions without central control. Ideally the DSL will create a degree of encapsulation and provide the essential functionality they need. The libraries will manage the detailed algorithms, which are many and varied but fairly well known. There's a lot in common with the requirements of the DSL in "Domain-specific languages vs. library of functions".
I'd appreciate ideas on the best architecture (clearly once it's deployed we cannot easily backtrack). The choices include at least:
- Creation of an IDL (e.g. through CORBA). The W3C did this for the XML DOM; I hated it, and it seems to be overkill.
- Manual creation of similar signatures for each platform, with best endeavours to keep them in sync.
- Creation of a parsable language (e.g. CSS).
- Declarative programming in XML (cf. XSLT). This is my preferred solution as it can be searched, manipulated, etc.
Performance is not important. Clarity of purpose is.
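To make the declarative XML option a little more concrete, here is a minimal sketch of what a tiny interpreter for such scripts might look like in Java. Everything in it is an assumption made for illustration: the element names (`script`, `point`, `midpoint`), the attribute names and the `XmlScript` class are invented here and have nothing to do with the deployed XML schema.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical interpreter for a made-up declarative XML vocabulary. */
final class XmlScript {

    /** Runs each child element of the root in document order; results are {x, y, z} arrays keyed by id. */
    static Map<String, double[]> run(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception ex) {
            throw new IllegalArgumentException("script is not well-formed XML", ex);
        }
        Map<String, double[]> env = new HashMap<>();
        NodeList steps = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < steps.getLength(); i++) {
            Node node = steps.item(i);
            if (node.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace and comments between instructions
            }
            Element e = (Element) node;
            String id = e.getAttribute("id");
            switch (e.getTagName()) {
                case "point":
                    env.put(id, new double[] { num(e, "x"), num(e, "y"), num(e, "z") });
                    break;
                case "midpoint": {
                    double[] a = lookup(env, e.getAttribute("a"));
                    double[] b = lookup(env, e.getAttribute("b"));
                    env.put(id, new double[] {
                        (a[0] + b[0]) / 2, (a[1] + b[1]) / 2, (a[2] + b[2]) / 2 });
                    break;
                }
                default:
                    throw new IllegalArgumentException("unknown instruction: " + e.getTagName());
            }
        }
        return env;
    }

    private static double num(Element e, String attribute) {
        return Double.parseDouble(e.getAttribute(attribute));
    }

    private static double[] lookup(Map<String, double[]> env, String id) {
        double[] value = env.get(id);
        if (value == null) {
            throw new IllegalArgumentException("undefined id: " + id);
        }
        return value;
    }
}
```

A script such as `<script><point id="p0" x="0" y="0" z="0"/><point id="p1" x="2" y="4" z="6"/><midpoint id="m" a="p0" b="p1"/></script>` would be passed to `XmlScript.run`. Each platform would need its own small interpreter like this one, but the scripts themselves would stay identical and searchable.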
EDIT: There was discussion as to whether application calls constitute a DSL. I have discovered Martin Fowler's introduction to DSLs (http://martinfowler.com/dslwip/Intro.html) where he argues that simple method calls (or chained calls) can be called a DSL. So a series like:
point0 = line0.intersectWith(plane);
point1 = line1.intersectWith(plane);
midpoint = point0.midpoint(point1);
could be considered a DSL
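For readers who want to run a series like that, here is a minimal Java sketch. The classes below are hypothetical stand-ins written for this illustration, not the real library types; in particular `Plane`, `Line3`, the `dot` helper and the constructors are assumptions, and the plane is assumed to be stored as a normal vector plus an offset.

```java
/** Hypothetical 3-D point, also used as a vector. */
final class Point3 {
    final double x, y, z;

    Point3(double x, double y, double z) {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    double dot(Point3 other) {
        return x * other.x + y * other.y + z * other.z;
    }

    /** Point halfway between this point and the other one. */
    Point3 midpoint(Point3 other) {
        return new Point3((x + other.x) / 2, (y + other.y) / 2, (z + other.z) / 2);
    }

    @Override
    public String toString() {
        return "(" + x + ", " + y + ", " + z + ")";
    }
}

/** Hypothetical plane: all points p with normal . p == offset. */
final class Plane {
    final Point3 normal;
    final double offset;

    Plane(Point3 normal, double offset) {
        this.normal = normal;
        this.offset = offset;
    }
}

/** Hypothetical line: origin + t * direction. */
final class Line3 {
    final Point3 origin;
    final Point3 direction;

    Line3(Point3 origin, Point3 direction) {
        this.origin = origin;
        this.direction = direction;
    }

    /** Solves normal . (origin + t * direction) == offset for t. */
    Point3 intersectWith(Plane plane) {
        double denominator = plane.normal.dot(direction);
        if (denominator == 0.0) {
            throw new ArithmeticException("line is parallel to the plane");
        }
        double t = (plane.offset - plane.normal.dot(origin)) / denominator;
        return new Point3(origin.x + t * direction.x,
                          origin.y + t * direction.y,
                          origin.z + t * direction.z);
    }
}

final class FluentDemo {
    public static void main(String[] args) {
        Plane plane = new Plane(new Point3(0, 0, 1), 0); // the plane z = 0
        Line3 line0 = new Line3(new Point3(0, 0, 1), new Point3(0, 0, -1));
        Line3 line1 = new Line3(new Point3(2, 0, 2), new Point3(0, 0, -1));

        // The three calls from the question, unchanged apart from declarations.
        Point3 point0 = line0.intersectWith(plane);
        Point3 point1 = line1.intersectWith(plane);
        Point3 midpoint = point0.midpoint(point1);
        System.out.println(midpoint);
    }
}
```

The user-facing part is only the last three statements; everything above them is library code that the community would never need to read.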
There seems to be some ambiguity in the question between language and library. The terms "internal DSL" and "external DSL" are useful, and I think are due to Martin Fowler.
An "external" DSL might be a standalone command-line tool. It is passed a string of source, it parses it somehow, and does something with it. There are no real limits on how the syntax and semantics can work. It can also be made available as a library consisting mostly of an eval-like method; a common example would be building a SQL query as a string and calling an execute method in an RDBMS library; not a very pleasant or convenient usage pattern, and horrible if spread around a program on a large scale.
An "internal" DSL is a library that is written in such a way as to take advantage of the quirks of a host (general purpose) language to create the impression that a new language can be embedded inside an existing one. In syntactically-rich languages (C++, C#) this means using operator overloading in ways that seriously stretch (or ignore) the usual meanings of the operator symbols. There are many examples in C++; a few in C# also - the Irony parser toolkit simulates BNF in a fairly restrained way which works well.
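As a rough illustration of the external style described above, where the library surface is little more than an eval-like method, here is a toy Java sketch. The mini-language (one `name = operand` or `name = operand op operand` assignment per line) and the `TinyDsl` class are invented for this example; a real DSL would want a proper grammar and error reporting.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical external DSL whose whole public API is a single eval method. */
final class TinyDsl {

    /** Parses and runs the source text; returns the variables in definition order. */
    static Map<String, Double> eval(String source) {
        Map<String, Double> variables = new LinkedHashMap<>();
        for (String rawLine : source.split("\n")) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] tokens = line.split("\\s+");
            boolean shapeOk = (tokens.length == 3 || tokens.length == 5) && tokens[1].equals("=");
            if (!shapeOk) {
                throw new IllegalArgumentException("cannot parse: " + line);
            }
            double value = operand(tokens[2], variables);
            if (tokens.length == 5) {
                double right = operand(tokens[4], variables);
                switch (tokens[3]) {
                    case "+": value += right; break;
                    case "-": value -= right; break;
                    case "*": value *= right; break;
                    case "/": value /= right; break;
                    default:
                        throw new IllegalArgumentException("unknown operator: " + tokens[3]);
                }
            }
            variables.put(tokens[0], value);
        }
        return variables;
    }

    /** An operand is either a previously defined variable or a numeric literal. */
    private static double operand(String token, Map<String, Double> variables) {
        if (variables.containsKey(token)) {
            return variables.get(token);
        }
        return Double.parseDouble(token); // throws NumberFormatException for undefined names
    }
}
```

Calling `TinyDsl.eval("a = 2\nb = a * 3\nc = b - 1")` yields the three variables `a`, `b` and `c`. The host program sees only strings going in and values coming out, which is why this shape ports easily between platforms, and also why the host language's tooling cannot help with it.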
Finally, there is a plain old library: classes, methods, properties, with well-chosen names.
An external DSL would allow you to completely ignore cross-language integration problems, as the only library-like portion would be an eval method. But inventing your own tool chain is non-trivial. People always forget the huge importance of debugging, intellisense, syntax highlighting etc.
An internal DSL is probably a pointless endeavour if you want to do it well on C# and Java. The problem is that if you take advantage of the quirks of one host language, you won't necessarily be able to repeat the trick on another language, e.g. Java has no operator overloading.
Which leaves a plain old library. If you want to span C# and Java (at least), then you are somewhat stuck in terms of a choice of implementation language. Do you really want to write the library twice? One possibility is to write the library in Java, and then use IKVM to cross-compile it to .NET assemblies. This would guarantee you an identical interface on both of those platforms.
On the downside, the API would be expressed in lowest-common-denominator features - which is to say, Java features :). No properties, just getX/setX methods. Steer clear of generics because the two systems are quite different in that respect. Also even the standard way of naming methods differs between the two (camelCase versus PascalCase), so one set of users would smell a rat.
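To show what such a lowest-common-denominator API might feel like, here is a small Java sketch in that style: getX/setX accessors instead of properties, and a plain array behind indexed accessors instead of a generic collection. `SimpleAtom` and `SimpleMolecule` are hypothetical names chosen for this illustration and are not the real domain classes.

```java
import java.util.Arrays;

/** Hypothetical domain object exposed through getX/setX methods only. */
final class SimpleAtom {
    private String element;
    private double charge;

    public SimpleAtom(String element, double charge) {
        this.element = element;
        this.charge = charge;
    }

    public String getElement() { return element; }
    public void setElement(String element) { this.element = element; }
    public double getCharge() { return charge; }
    public void setCharge(double charge) { this.charge = charge; }
}

/** Hypothetical container that avoids generic collection types in its signatures. */
final class SimpleMolecule {
    private SimpleAtom[] atoms = new SimpleAtom[0];

    public int getAtomCount() {
        return atoms.length;
    }

    public SimpleAtom getAtom(int index) {
        return atoms[index];
    }

    public void addAtom(SimpleAtom atom) {
        // Grow the backing array by one; fine for a sketch, slow for large molecules.
        atoms = Arrays.copyOf(atoms, atoms.length + 1);
        atoms[atoms.length - 1] = atom;
    }

    public double getTotalCharge() {
        double total = 0.0;
        for (int i = 0; i < atoms.length; i++) {
            total += atoms[i].getCharge();
        }
        return total;
    }
}
```

A C# user would see these camelCase get/set methods where they would normally expect properties such as `AtomCount`, which is the "smell a rat" effect mentioned above.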
If you are willing to re-describe your language using ANTLR, you could generate your DSL interpreter in multiple languages without having to maintain each one by hand, including all of the languages you mentioned and more.
ANTLR is a parser/lexer generator and has a large number of target languages. This allows you to describe your language once, without having to maintain multiple copies of it.
See the whole list of target languages here.
Although I do not want to promote my own project too much, I would like to mention PIL, a Platform Independent Language: an intermediate language I have been working on to support multiple software platforms (like Java, Python, ...), specifically for external DSLs. The general idea is that you generate code in PIL (a subset of Java), which the PIL compiler can then translate to one of many other languages, currently just Java or Python, but more will be added in the future.
I presented a paper about this at the Software and Language Engineering conference about 2 days ago. You can find a link to the publication on the PIL website (pil-lang.org), if you're interested.
One thing to plan for: the ability to escape to the implementation language in the event you need to do something that just isn't supported by your DSL, or for performance reasons (though I realize that isn't a priority).
I am researching a DSL for implementing rules in a rule engine in C#. Some of the rules are really complex and may change significantly in the future, so being able to escape out to C# is really useful. Of course this breaks cross-platform compatibility, but it is really just a way of hacking around edge cases without having to change your DSL.
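As a rough sketch of what such an escape hatch could look like, here is a tiny rule API in Java (used only to keep the example runnable; the same shape works in C# with a delegate). The `Rule` class and its factory methods are invented for this illustration and are not taken from any real rule engine.

```java
import java.util.function.DoublePredicate;

/** Hypothetical rule type: a few built-in rule forms plus one escape hatch. */
final class Rule {
    private final String name;
    private final DoublePredicate condition;

    private Rule(String name, DoublePredicate condition) {
        this.name = name;
        this.condition = condition;
    }

    /** Built-in form: matches values strictly above the threshold. */
    static Rule greaterThan(String name, double threshold) {
        return new Rule(name, value -> value > threshold);
    }

    /** Built-in form: matches values in the closed range [low, high]. */
    static Rule between(String name, double low, double high) {
        return new Rule(name, value -> value >= low && value <= high);
    }

    /** Escape hatch: arbitrary host-language code, not portable to other platforms. */
    static Rule custom(String name, DoublePredicate hostCode) {
        return new Rule(name, hostCode);
    }

    String getName() {
        return name;
    }

    boolean matches(double value) {
        return condition.test(value);
    }
}
```

A rule such as `Rule.custom("wholeNumber", v -> v == Math.floor(v))` handles an edge case the built-in forms do not cover. Keeping every escape behind one clearly named method makes the non-portable rules easy to find later.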
You'd be best off writing the library in C (or some language like RPython which will generate C code) and then using SWIG or similar to generate the language-specific bindings for C#, Java, Python, etc.
Note that this approach won't help if you are using JavaScript in the browser - you'll have to write the JavaScript library separately. If you are using JavaScript through Rhino, then you'd be able to just use the Java bindings.
It is possible to interpret JavaScript from inside a Java program directly using the script engine, and apparently also from C#. Python can be run on the JVM and the .NET engine.
I would suggest that you investigate these options, and then write your library in a common subset of the execution paths available to the language you choose. I would not consider writing it in a language which requires post translation and conversion, since you introduce a step which can be very, very difficult to debug in case of problems.
I would like to expand on Darien's answer. I think that ANTLR brings something to the table that few other lexer/parser tools provide (at least to my knowledge). If you would like to create a DSL which ultimately generates Java and C# code, ANTLR really shines.
ANTLR provides four fundamental components:
Your lexer, parser, and tree grammars can remain independent of your final generated language. In fact, the StringTemplate engine supports logical groups of template definitions. It even provides for interface inheritance of template groups. This means you can have third parties use your ANTLR parser to create, say, Python, assembly, C, or Ruby, when all you initially provided was Java and C# output. The output language of your DSL can easily be extended as requirements change over time.
To get the most out of ANTLR you will want to read the following:
The Definitive ANTLR Reference: Building Domain-Specific Languages
Language Implementation Patterns: Create Your Own Domain-Specific and General Programming Languages