Custom gcc preprocessor
Could you please give me an example of writing a custom gcc preprocessor?
My goal is to replace macros like SID("foo") with their computed CRC32 values. For any other macro I'd like to use the standard cpp preprocessor.
It looks like this can be achieved with the -no-integrated-cpp and -B options, but I can't find any simple example of their usage.
Comments (2)
Warning: dangerous and ugly hack. Close your eyes now.
You can hook in your own preprocessor by adding the '-no-integrated-cpp' and '-B' switches to the gcc command line. With '-no-integrated-cpp', gcc searches the '-B' path for its preprocessor before it uses its internal search path. An invocation of the preprocessor can be identified by the '-E' option being passed to the 'cc1', 'cc1plus' or 'cc1obj' program (these are the C, C++ and Objective-C compilers). When there is no '-E' option, pass all the parameters on to the original program. When there is one, do your own preprocessing and pass the manipulated file to the original compiler.
It looks like this:
This example calls the original preprocessor, but prints an additional message and the parameters. You can replace the script with your own preprocessor.
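A wrapper of that kind could be sketched roughly as follows, here in Python. This is only a sketch under assumptions: REAL_DIR is a guessed location for the real compiler programs (`gcc -print-prog-name=cc1` shows the actual one on your system), and the helper names are made up for illustration. The script is meant to be saved under the name cc1 (or cc1plus, cc1obj) in the directory given to '-B'.

```python
#!/usr/bin/env python3
"""Wrapper to be installed as cc1, cc1plus or cc1obj in the -B directory."""
import os
import sys

# Assumption: directory that holds the real compiler programs.
# `gcc -print-prog-name=cc1` prints the actual location on a given system.
REAL_DIR = "/usr/lib/gcc/x86_64-linux-gnu/12"

COMPILER_NAMES = ("cc1", "cc1plus", "cc1obj")


def is_preprocess_call(args):
    """The preprocessing pass is recognised by the -E option."""
    return "-E" in args


def real_program(argv0, real_dir=REAL_DIR):
    """Map the name this wrapper was invoked under to the real program."""
    return os.path.join(real_dir, os.path.basename(argv0))


def main(argv):
    if is_preprocess_call(argv[1:]):
        # Hook point: custom preprocessing would go here.
        print("My own preprocessor:", " ".join(argv[1:]), file=sys.stderr)
        sys.stderr.flush()
    real = real_program(argv[0])
    # Replace this process with the real program, arguments unchanged.
    os.execv(real, [real] + argv[1:])


# Only act when the script is installed under one of the compiler names.
if __name__ == "__main__" and os.path.basename(sys.argv[0]) in COMPILER_NAMES:
    main(sys.argv)
```

With the script made executable, it would be picked up by something like `gcc -no-integrated-cpp -B/path/to/hookdir/ -c foo.c`, where /path/to/hookdir/ is a hypothetical directory containing it. The message goes to stderr so that it cannot end up in the preprocessed output.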
The bad hack is over. You can open your eyes now.
One way is to use a program transformation system to "rewrite" just the SID macro invocation to what you want before you do the compilation, leaving the rest of the preprocessor handling to the compiler itself.
Our DMS Software Reengineering Toolkit is such a system. It can be applied to many languages, including C and specifically the dialects of the GCC 2/3/4 series of compilers.
To implement this idea using DMS, you would run DMS with its C front end over your source code before the compilation step. DMS can parse the code without expanding the preprocessor directives, build abstract syntax trees representing it, carry out transformations on the ASTs, and then spit out the result as compilable C text.
The specific transformation rule you would use is:
where ComputeCRC32 is custom code that does what it says. (DMS includes a CRC32 implementation, so the custom code for this is pretty short.)
DMS is kind of a big hammer for this task. You could use Perl to implement something pretty similar. The difference with Perl (or some other string match/replace hack) is the risk that it might a) find the pattern someplace where you don't want a replacement, e.g.
which you can probably fix by coding your pattern match carefully, b) fail to match a SID call found in surprising circumstances:
and c) fail to handle the various kinds of escape characters that show up in the literal string itself:
DMS's C front end handles all the escapes for you; the ComputeCRC32 function above would see the string containing the actual intended characters, not the raw text you see in the source code.
So it's really a matter of whether you care about the dark-corner cases, or whether you think you may have more special processing to do.
Given the way you've described the problem, I'd be sorely tempted to go the Perl route first and simply outlaw the funny cases. If you can't do this, then the big hammer makes sense.
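For what it's worth, the string-replacement route might look something like the sketch below, written in Python rather than Perl. It is a rough illustration under assumptions, not a finished tool: the CRC variant is whatever zlib.crc32 computes, the output format (an unsigned hex literal) is a guess at what you want, and the names SID_CALL, unescape and replace_sids are made up. It "outlaws the funny cases" by rejecting any escape sequence it does not know instead of guessing.

```python
import re
import zlib

# SID("...") as a whole word, tolerating whitespace around the parentheses.
SID_CALL = re.compile(r'\bSID\s*\(\s*"((?:[^"\\\n]|\\.)*)"\s*\)')

# Escapes this sketch accepts; anything else is rejected.
SIMPLE_ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", '"': '"', "'": "'"}


def unescape(raw):
    """Turn the raw literal text into the characters it stands for."""
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\":
            i += 1
            code = raw[i]
            if code not in SIMPLE_ESCAPES:
                raise ValueError("unsupported escape \\%s in SID string" % code)
            ch = SIMPLE_ESCAPES[code]
        out.append(ch)
        i += 1
    return "".join(out)


def replace_sids(source):
    """Replace every SID("...") call with the CRC32 of its string."""
    def crc(match):
        data = unescape(match.group(1)).encode("utf-8")
        return "0x%08Xu" % zlib.crc32(data)
    return SID_CALL.sub(crc, source)
```

The word boundary keeps it from touching names that merely end in SID, and the whitespace handling covers some layout variations. It still knows nothing about comments, string literals that happen to contain SID("..."), or SID calls produced by other macros, which is exactly the dark-corner territory described above.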