How can I match C function prototypes and variable definitions with a regular expression?

Posted on 2024-12-09 09:12:30

This has been asked before but I have a specialised case which I should be able to handle with a regular expression.

I'm trying to read the warning log from Doxygen and the source is in C (so far, I dread to think about C++).

I need to match the function and variable definitions found in that log and pick out the function and variable names.

More specifically the log has lines like

/home/me/blaa.c:10:Warning: Member a_function(int a, int b) (function) of file blaa.c is not documented

and

/home/me/blaa.h:10:Warning: Member a_variable[SOME_CONST(sizeof(SOME_STRUCT), 64)*ANOTHER_CONST] (variable) of file blaa.h is not documented

With all the variations you can have in C...

Can I match those with just one regexp, or should I not even bother? The word in parentheses after the "parameter" list (I use the term loosely to also cover variables) comes from a fixed set of words (function, variable, enum, etc.), so if nothing else helps I could match against those, but I'd rather not in case there are types that I haven't seen yet in the logs.

My current attempt looks like

'(?P<full_path>.+):\d+:\s+Warning:\s+Member\s+(?P<member_name>.+)([\(\[](\**)\s*\w+([,)])[\)\]))*\s+\((?P<member_type>.+)\) of file\s+(?P<filename>.+)\s+is not documented'

(I use Python's re package.)

But it still fails to catch everything.

EDIT: There's a mistake in there that I introduced in my last edit.

Comments (1)

铁轨上的流浪者 2024-12-16 09:12:30

You were allowing zero or more matches between <member_name> and <member_type>. Try this instead:

'(?P<full_path>.+):\d+:\s+Warning:\s+Member\s+(?P<member_name>\w+).*\s+\((?P<member_type>\w+)\) of file\s+(?P<filename>.+)\s+is not documented'
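
To see how this pattern behaves, here is a minimal Python sketch that runs it against the two sample log lines from the question. One deliberate change, which is an assumption on my part: the sample lines as shown have no space between the line number and "Warning:", so the sketch uses `\s*` there instead of `\s+`. If your real log does have a space, either form matches. The names `WARNING_RE`, `parse_warning` and `SAMPLE_LINES` are made up for illustration.

```python
import re

# Variant of the pattern above. Assumption: "\s*" (not "\s+") after the line
# number, because the sample lines show no space before "Warning:".
WARNING_RE = re.compile(
    r'(?P<full_path>.+):\d+:\s*Warning:\s+Member\s+'
    r'(?P<member_name>\w+).*\s+\((?P<member_type>\w+)\) of file\s+'
    r'(?P<filename>.+)\s+is not documented'
)


def parse_warning(line):
    """Return the named groups as a dict, or None if the line does not match."""
    match = WARNING_RE.match(line)
    return match.groupdict() if match else None


# The two sample lines quoted in the question.
SAMPLE_LINES = [
    "/home/me/blaa.c:10:Warning: Member a_function(int a, int b) "
    "(function) of file blaa.c is not documented",
    "/home/me/blaa.h:10:Warning: Member a_variable[SOME_CONST(sizeof("
    "SOME_STRUCT), 64)*ANOTHER_CONST] (variable) of file blaa.h "
    "is not documented",
]

if __name__ == "__main__":
    for line in SAMPLE_LINES:
        print(parse_warning(line))
```

The idea is that `member_name` only takes the leading identifier (`\w+`), and the `.*` that follows swallows the parameter list or array bounds without trying to parse them. The member type is then anchored by the literal `) of file` text, so nested parentheses inside the declarator should not be mistaken for it. This sketch does not attempt to handle C++ signatures or member names that are not plain identifiers.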