Source-to-source manipulations

Posted on 2024-11-29 16:44:53

I need to do some source-to-source manipulations in the Linux kernel. I tried to use Clang for this purpose, but there is a problem: Clang preprocesses the source code, i.e. it expands macros and includes. As a result, Clang sometimes produces C code that is broken for the Linux kernel. I can't maintain all the changes manually, since I expect thousands of changes per file.

I tried ANTLR, but the publicly available grammars are incomplete and not suitable for projects such as the Linux kernel.

So my question is the following: is there any way to perform source-to-source manipulations on C code without preprocessing it?

Assume the following code.

#define AAA 1
void f1(int a){
    if(a == AAA)
        printf("hello");
}

After applying the source-to-source manipulation, I want to get this:

#define AAA 1
void f1(int a){
    if(functionCall(a == AAA))
        printf("hello");
}

But Clang, for instance, produces the following code, which does not fit my requirements because it expands the macro AAA:

#define AAA 1
void f1(int a){
    if(functionCall(a == 1))
        printf("hello");
}

I hope I was clear enough.

Edit

The above code is only an example. The source-to-source manipulations I want to do are not restricted to substituting if() statements; they also include inserting a unary operator in front of an expression, replacing an arithmetic expression with its positive or negative value, etc.
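For concreteness, the target form compiles as ordinary C once functionCall exists. The question never says what functionCall does, so the sketch below assumes a hypothetical pass-through hook that counts its evaluations and returns its argument unchanged; the counter name is invented for this illustration.

```c
#include <stdio.h>

#define AAA 1

/* Hypothetical hook: the question does not define functionCall, so this
 * version only counts how often it is evaluated and passes the value on. */
static unsigned long functionCall_hits = 0;

static int functionCall(int cond)
{
    functionCall_hits++;
    return cond;
}

/* The transformed function from the question. The macro AAA is still
 * spelled out in the source instead of being replaced by 1. */
void f1(int a)
{
    if (functionCall(a == AAA))
        printf("hello");
}
```

Since the hook returns its argument as-is, f1 behaves exactly as before the transformation; only the counter changes.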

Solution

I found a solution myself. I use gcc to produce the preprocessed source code and then apply Clang to it. That way I don't have any issues with macro expansion and includes, since that job is done by gcc. Thanks for the answers!

七分※倦醒 2024-12-06 16:44:54

An idea would be to replace all occurrences of

if(a == AAA)

with

if(functionCall(a == AAA))

You can do this easily using, e.g., the sed tool.

If you have a finite collection of patterns to be replaced, you can write a sed script to perform the substitution.

Would this solve your problem?
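To stay in the thread's own language, here is a rough C sketch of the same kind of purely textual substitution, generalized slightly: it wraps the parenthesized condition of every if in functionCall(...). The function names are made up for this illustration. It is a pattern rewrite, not a parser: it skips #if directives and identifiers that merely contain "if", but it knows nothing about comments or string literals.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

static const char WRAP_OPEN[] = "functionCall(";

static int is_ident_char(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Return 1 if an `if` keyword starts at src[i]: it must not be part of a
 * longer identifier and must not belong to a `#if` directive. */
static int if_keyword_at(const char *src, size_t i)
{
    if (strncmp(src + i, "if", 2) != 0 || is_ident_char(src[i + 2]))
        return 0;
    if (i > 0 && is_ident_char(src[i - 1]))
        return 0;
    size_t k = i;
    while (k > 0 && (src[k - 1] == ' ' || src[k - 1] == '\t'))
        k--;
    return !(k > 0 && src[k - 1] == '#');
}

/* Return a malloc'd copy of src in which every `if (cond)` has become
 * `if (functionCall(cond))`. Returns NULL if allocation fails. */
char *wrap_if_conditions(const char *src)
{
    size_t len = strlen(src);
    size_t wrap_len = strlen(WRAP_OPEN);
    /* Each rewrite consumes at least the four bytes of "if()" and adds
     * wrap_len + 1 bytes, so this bound is generous. */
    char *out = malloc(len + (len / 3 + 1) * (wrap_len + 1) + 1);
    if (out == NULL)
        return NULL;

    size_t i = 0, o = 0;
    while (i < len) {
        if (if_keyword_at(src, i)) {
            size_t open = i + 2;
            while (src[open] == ' ' || src[open] == '\t')
                open++;
            if (src[open] == '(') {
                /* Find the parenthesis that balances src[open]. */
                size_t close = open + 1;
                int depth = 1;
                while (src[close] != '\0' && depth > 0) {
                    if (src[close] == '(')
                        depth++;
                    else if (src[close] == ')')
                        depth--;
                    if (depth > 0)
                        close++;
                }
                if (depth == 0) {
                    memcpy(out + o, src + i, open + 1 - i);   /* "if(" */
                    o += open + 1 - i;
                    memcpy(out + o, WRAP_OPEN, wrap_len);
                    o += wrap_len;
                    memcpy(out + o, src + open + 1, close - open - 1);
                    o += close - open - 1;
                    out[o++] = ')';   /* closes functionCall( */
                    out[o++] = ')';   /* original closing parenthesis */
                    i = close + 1;
                    continue;
                }
            }
        }
        out[o++] = src[i++];
    }
    out[o] = '\0';
    return out;
}
```

Because the text is never preprocessed, AAA survives untouched. The approach is only as reliable as the pattern, though: a macro that expands to part of an if statement, or an if inside a comment or string, will be handled wrongly, and running it twice wraps the condition twice.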

秋凉 2024-12-06 16:44:54

Handling the preprocessor is one of the most difficult problems in applying transformations to C (and C++) code.

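Our DMS Software Reengineering Toolkit with its C Front End comes relatively close to doing this. DMS can parse C source code while preserving most preprocessor conditionals, macro definitions, and macro uses.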

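It does this by allowing preprocessor actions in "well-structured" places. Examples: #defines can occur wherever declarations or statements can occur; macro calls and conditionals can stand in for many of the nonterminals of the language (e.g., function head, expression, statement, declaration), and can also appear in many unstructured places where people commonly put them (e.g., #if foo if (...) { #endif). It parses the source code and the preprocessor directives as if they were part of one language (they are; it is called "C") and builds corresponding ASTs, which can be transformed and will regenerate correctly with the captured preprocessor directives. [This level of capability handles the OP's example perfectly.]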
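The difference between the two kinds of placement is easier to see in code. The sketch below is only an illustration written for this thread, not DMS input or output; FOO and both function names are hypothetical. The first function uses a conditional that replaces a whole statement, while the second uses the unstructured "#if foo if (...) { #endif" shape mentioned above.

```c
/* FOO is a hypothetical configuration macro used only for this example. */
#ifndef FOO
#define FOO 1
#endif

/* Well-structured: each branch of the conditional is a complete statement,
 * so the conditional as a whole can stand where a statement is expected. */
int count_structured(int x)
{
    int count = 0;
#if FOO
    if (x > 0)
        count++;
#else
    count++;
#endif
    return count;
}

/* Unstructured: the first directive encloses only "if (...) {" and a second
 * one encloses the matching "}". Neither fragment is a complete statement. */
int count_unstructured(int x)
{
    int count = 0;
#if FOO
    if (x > 0) {
#endif
        count++;
#if FOO
    }
#endif
    return count;
}
```

Both functions compile to the same behavior for either value of FOO, yet only the first maps each preprocessor branch onto a single syntax-tree node.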

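Some directives are poorly placed (both in the syntactic sense, e.g., spanning multiple fragments of the language, and in the "you've got to be kidding" understandability sense). DMS handles these by expanding them away, with some guidance from the engineer ("always expand this macro"). A less satisfactory approach is to hand-convert the unstructured preprocessor conditionals and macro calls into structured ones; this is a bit painful, but more workable than one might expect, since the bad cases occur far less often than the good ones.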

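To do better than this, one needs symbol tables and flow analysis that take preprocessor conditions into account, and that capture all the preprocessor conditionals. We have done some experimental work with DMS to capture conditional declarations in the symbol table (this seems to work fine), and we are just starting work on a scheme for the latter.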
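A small example of why the symbol table needs those conditions: in the sketch below, the type of one name depends on a configuration macro. CONFIG_WIDE_COUNTER, counter_t, and counter_width are invented for this illustration, which shows the problem only and says nothing about how DMS represents it.

```c
#include <stddef.h>

/* CONFIG_WIDE_COUNTER is a made-up option for this illustration. */
#ifdef CONFIG_WIDE_COUNTER
typedef long long counter_t;
#else
typedef int counter_t;
#endif

/* The type of `total`, and with it the result of this function, depends on
 * which branch above was taken. A tool that keeps the conditional unexpanded
 * has to record both declarations of counter_t together with the condition
 * that guards each one. */
size_t counter_width(void)
{
    counter_t total = 0;
    return sizeof total;
}
```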

It's not easy being green.
