Is there a language for specifying automated code modifications?

Posted on 2024-10-31 08:56:39

I'm doing some work where I need to be able to describe modifications to some program code that are to be done automatically.

Is there any language that allows describing this?

The language should have modules or functions that receive the location in the code where the modification is to be done and should allow specifying the possible modifications to be done.
It should allow describing modifications such as removing a given function, adding an if condition around a piece of code, adding a new function declaration that does nothing, etc.
The modifications should be done over the parse tree so it is possible to restore the original code, only with the modifications.
I don't even need the language to have an associated parser or implementation; all I need is a description of the language itself, either as a BNF grammar or even informally.

I know that phc, the PHP ahead-of-time compiler, is able to transform source code into an XML representation and back, making it easier to modify the code and restore it.
What I need is a way to describe the actual modifications to the XML, so that I can run a program that can, for example, remove all instances of a specific function call, or add if(false) around each one.
Also, it would be better if the language were language-agnostic, although it's not a requirement.

Do you think something like this exists?

Comments (2)

稍尽春風 2024-11-07 08:56:39

Check out the DMS Software Reengineering Toolkit from Semantic Designs. It may be usable for what you are looking for.

中性美 2024-11-07 08:56:39

The key idea is program transformations. Ondrej has the right idea with DMS but I'm the author of DMS so I'm likely biased.

The DMS language used to accomplish transformations is called the "(DMS) Rule Specification Language", or RSL, and is used to specify (program transformation) rules. Such a rule has:

  • a name (we tend to have a lot of them, and this is a handy way to refer to them)
  • parameters (defining pattern variables), typed according to the target language grammar of interest,
  • a left hand "match-this" pattern
  • a right hand "replace by this" pattern

The patterns are often written in the surface syntax of the target language, that is, the native syntax of the language being transformed, with extensions for pattern variables. To distinguish the RSL language syntax from the target language, patterns are written inside (meta)quotes "...". The \ character inside patterns is a (meta)escape back into RSL. A pattern variable is written "\x". A (meta)function foobar is written as \foobar\( ... \); note the (meta)escapes on the parentheses around the (meta)function's arguments. Outside of the quotes, the meta-escapes aren't needed and these constructs are written without \, e.g., foobar(...).

DMS rules can be a lot more complex than this, but these are the basics. The surface-syntax patterns do not represent text; rather, they represent the equivalent ASTs of the code in the patterns. DMS rules are used to match and change ASTs. The program transformation system of course has to have parsers to produce ASTs, and anti-parsers ("prettyprinters") to convert ASTs back to text. (DMS has a big library of language front ends for all the widely used languages on the planet and a lot of the uncommon ones; we just added MUMPS).
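
To make the "patterns are ASTs, not text" point concrete, here is a minimal sketch in Python rather than RSL, using the standard `ast` module (Python 3.9+ for `ast.unparse`). It is not how DMS works internally. The `PV_` prefix is a made-up convention standing in for RSL's `\x` pattern variables, and `parse_expr` and `match` are hypothetical helper names.

```python
import ast

PREFIX = "PV_"  # hypothetical marker for pattern variables (stands in for RSL's \x)


def parse_expr(text):
    """Parse a Python expression and return its AST node."""
    return ast.parse(text, mode="eval").body


def match(pattern, node, bindings):
    """Return True if `node` has the shape of `pattern`, recording variable bindings."""
    if isinstance(pattern, ast.Name) and pattern.id.startswith(PREFIX):
        if not isinstance(node, ast.expr):
            return False
        bound = bindings.setdefault(pattern.id, node)
        # A variable used twice must bind to structurally equal subtrees.
        return ast.dump(bound) == ast.dump(node)
    if isinstance(pattern, list):
        return (isinstance(node, list) and len(pattern) == len(node)
                and all(match(p, n, bindings) for p, n in zip(pattern, node)))
    if not isinstance(pattern, ast.AST):
        return pattern == node  # plain field value, e.g. the 0 inside Constant(0)
    if type(pattern) is not type(node):
        return False
    return all(match(getattr(pattern, f, None), getattr(node, f, None), bindings)
               for f in pattern._fields)


bindings = {}
matched = match(parse_expr("PV_x + 0"), parse_expr("foo(1)   +0"), bindings)
print(matched, ast.unparse(bindings["PV_x"]))
```

Because both the pattern and the code are parsed first, the odd spacing in `foo(1)   +0` is irrelevant; the match succeeds on tree shape and binds `PV_x` to the subtree for `foo(1)`.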

For your specific examples, the following rules will do the trick:

"... removing a given function":

rule remove_function(f:IDENTIFIER,p:parameters,b:body): declarations -> declarations
  "  \f \p \b " -> " ; "  -- replace function declaration by empty declaration
  if f==target_function_name();

... adding an if condition around a block of code:

rule wrap_in_if(s:statement): statement -> statement
 " \s " ->  " \if ( \generated_condition\(\) ) \s ";

... adding a new function declaration that does nothing:

rule insert_noop_function(d:declarations): declarations -> declarations
" \d " -> " \target_function_name\(\) ( )  { } ";
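
For readers without access to DMS, the same three modifications can be sketched with Python's standard `ast` module (Python 3.9+). This is only an analogy to the RSL rules above, not a translation of them. The class and function names, and the example names `unused`, `log`, and `placeholder`, are all made up for illustration. New nodes are built by parsing small snippets of surface syntax, in the spirit of the quoted patterns.

```python
import ast


class RemoveFunction(ast.NodeTransformer):
    """Delete every definition of the function called `name`."""

    def __init__(self, name):
        self.name = name

    def visit_FunctionDef(self, node):
        if node.name == self.name:
            return None  # returning None drops the node from its parent
        return self.generic_visit(node)


class WrapCallsInIf(ast.NodeTransformer):
    """Wrap each statement that is just a call to `name(...)` in `if False:`."""

    def __init__(self, name):
        self.name = name

    def visit_Expr(self, node):
        call = node.value
        if (isinstance(call, ast.Call) and isinstance(call.func, ast.Name)
                and call.func.id == self.name):
            wrapper = ast.parse("if False:\n    pass").body[0]
            wrapper.body = [node]  # replace the placeholder `pass`
            return wrapper
        return node


def insert_noop_function(tree, name):
    """Append `def name(): pass` to the end of the module."""
    tree.body.append(ast.parse(f"def {name}():\n    pass").body[0])
    return tree


def transform(source):
    """Parse, apply the three modifications, and print the code back out."""
    tree = ast.parse(source)
    tree = RemoveFunction("unused").visit(tree)
    tree = WrapCallsInIf("log").visit(tree)
    tree = insert_noop_function(tree, "placeholder")
    return ast.unparse(ast.fix_missing_locations(tree))
```

Calling `transform` on a module that defines `unused` and calls `log(...)` returns source text without `unused`, with the `log` call guarded by `if False:`, and with a new empty `placeholder` function. One limitation of this sketch: removing a function that is the only statement in an enclosing body would leave that body empty, which a real tool would have to handle.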

As you have observed, you have to point these somewhere; that's the job of a "metaprogram", which locates where in your AST you want the rules applied, and then applies them. For your rules, you need (with DMS) an explicit procedural method to find the right location. For some DMS rules, you can simply apply them "everywhere"; DMS will essentially walk all over the designated AST and apply the rules for you.
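
The "apply everywhere" idea can be sketched as a tiny driver that walks a tree bottom-up and tries each rule at every node. Again this is Python's `ast` module standing in for DMS, on simple algebraic expressions. The rule functions `add_zero` and `mul_one` and the `ApplyEverywhere` class are illustrative names, not part of any real tool.

```python
import ast


def is_int(node, value):
    """True if `node` is the integer literal `value`."""
    return (isinstance(node, ast.Constant) and type(node.value) is int
            and node.value == value)


def add_zero(node):
    """Rule: x + 0 -> x. Returns the replacement, or None if it does not apply."""
    if (isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add)
            and is_int(node.right, 0)):
        return node.left
    return None


def mul_one(node):
    """Rule: x * 1 -> x."""
    if (isinstance(node, ast.BinOp) and isinstance(node.op, ast.Mult)
            and is_int(node.right, 1)):
        return node.left
    return None


class ApplyEverywhere(ast.NodeTransformer):
    """Metaprogram: visit every node and apply the first rule that fits."""

    def __init__(self, rules):
        self.rules = rules

    def visit(self, node):
        node = self.generic_visit(node)  # rewrite the children first
        for rule in self.rules:
            result = rule(node)
            if result is not None:
                return result
        return node


def simplify(source):
    tree = ApplyEverywhere([add_zero, mul_one]).visit(ast.parse(source))
    return ast.unparse(tree)


print(simplify("(a * 1 + 0) * (b + 0)"))
```

A single bottom-up pass is enough for these two rules because each replacement is a subtree that has already been simplified; a more general driver would likely repeat until no rule applies. Targeting one specific location instead would mean replacing the blanket `visit` with code that first searches for the node of interest.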

A few rules are never very impressive, in the same way that a few lines of code aren't impressive. A few hundred or thousand rules can do pretty spectacular things (like complete language translations), in the same way that a few hundred or thousand lines of code can produce pretty interesting results. The difference is that conventional code works with numbers, strings and structures, whereas program transformation tools compute over program structures (ASTs).

There's a complete worked example showing how one defines a language and rules to DMS, and how those rules are applied to achieve "program modifications" (the example actually modifies "algebraic expressions" but the ideas are exactly the same).

DMS is unabashedly commercial, and it isn't a dimestore tool, so it might not be what you need for your thesis.

If not DMS, you can get free tools that have the same ideas. Consider TXL (www.txl.ca) or Stratego/XT (www.strategoxt.org). DMS, TXL, and Stratego all do program transformations using surface-syntax patterns, but TXL and Stratego can't handle massive changes to code as well as DMS, IMHO. (Read about flow analysis at the DMS website for some of the reasons.) TXL and Stratego are good for learning the basics and building strong demos, though.
