Regex to strip comments, multi-line comments and blank lines
I want to parse a file, and I want to use PHP and regex to strip:
- blank or empty lines
- single-line comments
- multi-line comments
Basically, I want to remove any line containing
/* text */
or multi-line comments such as
/***
some
text
*****/
If possible, I would also like another regex that checks whether a line is empty (to remove blank lines).
Is that possible? Can somebody post a regex that does just that?
Thanks a lot.
This should work for replacing everything from /* to */.
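The code that accompanied this answer is not shown, so here is a minimal sketch of the kind of replacement it describes. The function names `strip_block_comments` and `strip_blank_lines` are hypothetical, and the patterns are an assumption about what would fit the question, not the answerer's original regex.

```php
<?php
// Hypothetical helpers: a sketch, not the original answer's code.

/** Remove C-style block comments, including ones that span several lines. */
function strip_block_comments(string $text): string
{
    // "/*", then as little as possible, then "*/".
    // The s modifier lets "." match newlines as well.
    return preg_replace('~/\*.*?\*/~s', '', $text) ?? $text;
}

/** Remove lines that are empty or contain only spaces and tabs. */
function strip_blank_lines(string $text): string
{
    // With the m modifier, ^ matches at the start of every line.
    // \h* is horizontal whitespace, \R is any line break.
    return preg_replace('~^\h*\R~m', '', $text) ?? $text;
}

$input = "a = 1;\n/***\nsome\ntext\n*****/\n\n   \nb = 2; /* text */\n";
echo strip_blank_lines(strip_block_comments($input));
```

Running the comment pass first matters here: removing a multi-line comment leaves an empty line behind, which the second pass then cleans up.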
This is a good function, and it works!
Now use this function, 'strip_comments', on code held in a variable:
This produces the output:
Loading from a PHP file:
Loading a PHP file, stripping comments and saving it back:
Source: http://www.php.net/manual/en/tokenizer.examples.php
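The function body itself did not survive in this copy of the answer. The following is a sketch of what a `strip_comments` built on `token_get_all` could look like, in the spirit of the tokenizer example on the linked manual page. It is a reconstruction under that assumption, not the answerer's exact code.

```php
<?php
/**
 * Return PHP source with comments removed, using the tokenizer extension.
 * Sketch only: the original function from the answer is not available.
 */
function strip_comments(string $source): string
{
    $output = '';
    foreach (token_get_all($source) as $token) {
        if (is_string($token)) {
            // Single-character tokens such as ";" or "{" arrive as plain strings.
            $output .= $token;
            continue;
        }
        [$id, $text] = $token;
        if ($id === T_COMMENT || $id === T_DOC_COMMENT) {
            continue; // drop //, #, /* */ and /** */ comments
        }
        $output .= $text;
    }
    return $output;
}

// Code held in a variable. The comment-like text inside the string is kept.
$code = "<?php\n/* greeting */\necho 'keep /* this */';\n";
echo strip_comments($code);
```

To process a file, the same function can be wrapped as `file_put_contents($path, strip_comments(file_get_contents($path)))`, where `$path` is whatever file you want to rewrite. Because the tokenizer knows where strings begin and end, comment delimiters inside string literals are left alone.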
This is my solution, for anyone who is not used to regexes. The following code removes all comments delimited by # and retrieves the values of variables written in the style NAME=VALUE.
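The answer's code is missing here, so this is a sketch of the approach it describes, using only string functions. The name `parse_settings` is hypothetical, and the sketch assumes that # never appears inside a value.

```php
<?php
/**
 * Parse "NAME=VALUE" lines, ignoring blank lines and anything after a #.
 * Hypothetical helper; assumes # never appears inside a value.
 */
function parse_settings(string $text): array
{
    $settings = [];
    foreach (explode("\n", $text) as $line) {
        // Cut the line at the first #, if there is one.
        $hash = strpos($line, '#');
        if ($hash !== false) {
            $line = substr($line, 0, $hash);
        }
        $line = trim($line);
        if ($line === '') {
            continue; // blank or comment-only line
        }
        // Split on the first "=" only, so values may contain "=".
        $parts = explode('=', $line, 2);
        if (count($parts) !== 2) {
            continue; // not a NAME=VALUE line
        }
        $settings[trim($parts[0])] = trim($parts[1]);
    }
    return $settings;
}

$config = "# database\nHOST=localhost\n\nPORT=3306 # default\n";
print_r(parse_settings($config));
```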
I found this one to suit me better:
(\s+)\/\*([^\/]*)\*/\n*
It removes multi-line comments, whether or not they are indented with tabs, along with the whitespace in front of them and the newlines after them. I'll leave a comment example which this regex would match.
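The promised example is missing from this copy, so the samples below are hypothetical ones chosen to show what the pattern does and does not match. Two properties follow directly from the pattern: `(\s+)` requires at least one whitespace character before the comment, and `[^\/]*` stops at any slash inside the comment body.

```php
<?php
// The pattern from the post, wrapped in ~ delimiters.
$pattern = '~(\s+)\/\*([^\/]*)\*/\n*~';

// Hypothetical samples, not taken from the original answer.
$samples = [
    'indented multi-line'   => "x = 1;\n\t/*\n\t * some\n\t * text\n\t */\ny = 2;",
    'contains a slash'      => "x = 1;\n/* see a/b */\ny = 2;",
    'no leading whitespace' => "/* text */\ny = 2;",
];

foreach ($samples as $label => $sample) {
    $matched = preg_match($pattern, $sample) === 1;
    printf("%s: %s\n", $label, $matched ? 'match' : 'no match');
}
```

Only the first sample matches, so this pattern is best suited to files where comments never contain a slash and never start at the very first character.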
Keep in mind that any regex you use will fail if the file you're parsing has a string containing something that matches these conditions. For example, it would turn this:
Into this:
Which is probably not what you want. But maybe it is, I don't know. Anyway, regexes technically can't parse data in a way that avoids that problem. I say technically because modern PCRE regexes have tacked on a number of hacks that make them capable of doing this and, more importantly, no longer regular expressions, but whatever. If you want to avoid stripping these things inside quotes or in other situations, there is no substitute for a full-blown parser (although it can still be pretty simple).
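The before and after examples from this answer are not preserved, so here is a hypothetical case of the same kind. It assumes a simple lazy block-comment pattern, which is not necessarily the one the answerer had in mind.

```php
<?php
// A line of source where the comment delimiters sit inside a string literal.
$source = '$msg = "Use /* and */ to write a comment";';

// A plain block-comment regex has no idea it is inside a string.
$stripped = preg_replace('~/\*.*?\*/~s', '', $source);

echo $source, "\n";
echo $stripped, "\n";
```

The second line printed has lost part of the string's contents, even though nothing in the input was actually a comment.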
It is possible, but I wouldn't do it. You need to parse the whole PHP file to make sure that you're not removing any necessary whitespace (inside strings, between keywords and identifiers (publicfunctiondoStuff()), etc.). It is better to use PHP's tokenizer extension.