Why does my ack regular expression return extra, unexpected results?
I'm finally learning regexps and practicing with ack, which I believe uses Perl regular expressions.
I want to match all lines where the first non-blank characters are if (<word> !, with any number of spaces between the elements.
This is what I came up with:
^[ \t]*if *\(\w+ *!
It only nearly worked. I think ^[ \t]* is wrong, since it seems to match only one or no [space or tab].
What I want is to match anything that may contain only spaces or tabs (or nothing).
For example these should not match:
// if (asdf != 0)
else if (asdf != 1)
How can I modify my regexp for that?
EDIT: adding the command line
ack -i --group -a '^\s*if *\(\w+ *!' c:/work/proj/proj
Note the single quotes; I'm not so sure about them anymore.
I am searching a larger code base. The results do include correct matches (quite a few), but the above command also returns lines such as:
274: }else if (y != 0)
which should not match.
EDIT: adding the result of mobrule's test
Mobrule, thanks for providing a text to test on. Here is what I get at my prompt:
C:\Temp\regex>more ack.test
# ack.test
if (asdf != 0) # no spaces - ok
if (asdf != 0) # single space - ok
if (asdf != 0) # single tab - ok
if (asdf != 0) # multiple space - ok
if (asdf != 0) # multiple tab - ok
if (asdf != 0) # spaces + tab ok
if (asdf != 0) # tab + space ok
if (asdf != 0) # space + tab + space ok
// if (asdf != 0) # not ok
} else if (asdf != 0) # not ok
C:\Temp\regex>ack '^[ \t]*if *\(\w+ *!' ack.test
C:\Temp\regex>"C:\Program\git\bin\perl.exe" C:\bat\ack.pl '[ \t]*if *\(\w+ *!' ack.test
if (asdf != 0) # no spaces - ok
if (asdf != 0) # single space - ok
if (asdf != 0) # single tab - ok
if (asdf != 0) # multiple space - ok
if (asdf != 0) # multiple tab - ok
if (asdf != 0) # spaces + tab ok
if (asdf != 0) # tab + space ok
if (asdf != 0) # space + tab + space ok
// if (asdf != 0) # not ok
} else if (asdf != 0) # not ok
The problem is in my call to ack.bat!
ack.bat contains:
"C:\Program\git\bin\perl.exe" C:\bat\ack.pl %*
Although I call it with a caret in the pattern, the caret is lost when the bat file is invoked!
Escaping the caret with ^^ does not work.
Quoting the regex with " " instead of ' ' works. My problem was a DOS/Windows problem; sorry for bothering you all with it.
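For anyone who wants to see the effect without ack: cmd.exe treats ^ as its escape character and gives single quotes no special meaning, which fits the echoed command above, where the pattern shows up without its caret. The sketch below uses Python's re module rather than Perl (an assumption that the two behave the same for these simple constructs) and compares the anchored pattern with the unanchored one that ack actually received. The sample lines and their leading whitespace are made up for illustration.

```python
import re

# Sample lines modelled on ack.test; the exact leading whitespace is an assumption.
lines = [
    "if (asdf != 0)",
    "  \t if (asdf != 0)",
    "// if (asdf != 0)",
    "} else if (asdf != 0)",
    "}else if (y != 0)",
]

# The pattern as typed on the command line.
anchored = re.compile(r"^[ \t]*if *\(\w+ *!")
# The pattern as it arrived once the caret had been dropped.
unanchored = re.compile(r"[ \t]*if *\(\w+ *!")


def matching(pattern, candidates):
    """Return the candidates in which the pattern matches anywhere in the line."""
    return [line for line in candidates if pattern.search(line)]


anchored_hits = matching(anchored, lines)
unanchored_hits = matching(unanchored, lines)

print("anchored:  ", anchored_hits)
print("unanchored:", unanchored_hits)
```

With the caret in place only the first two lines are reported; without it every line matches, including the comment and the else if lines.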
\S for non-white-space. \w will not match any special characters, so if ($word will not match. Maybe that's OK with your specs, in which case \w (alphanumeric plus "_") is fine.
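A quick way to check the difference between \w and \S here, sketched in Python's re module on the assumption that both classes behave as they do in Perl for ASCII input. The two sample lines are hypothetical.

```python
import re

# Original pattern: the condition must start with word characters.
word_only = re.compile(r"^[ \t]*if *\(\w+ *!")
# Variant with \S: any run of non-whitespace characters is accepted.
non_space = re.compile(r"^[ \t]*if *\(\S+ *!")

plain = "if (asdf != 0)"   # plain identifier
sigil = "if ($word != 0)"  # identifier with a leading special character

results = {
    "word_only/plain": bool(word_only.search(plain)),
    "word_only/sigil": bool(word_only.search(sigil)),
    "non_space/plain": bool(non_space.search(plain)),
    "non_space/sigil": bool(non_space.search(sigil)),
}

for name, matched in results.items():
    print(name, matched)
```

Only the \w pattern applied to the $word line fails to match; the other three combinations match.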
In both ack and grep, * matches zero or more, not zero or one. So I think you already have the right solution. Which test cases aren't giving you the results you want? Results:
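The zero-or-more versus zero-or-one distinction is easy to confirm. This is a minimal sketch in Python's re module (assumed to agree with Perl and ack on * and ?), using a shortened pattern and made-up sample strings.

```python
import re

# '*' allows zero or more spaces/tabs, in any mix.
star = re.compile(r"^[ \t]*if")
# '?' allows zero or one space/tab.
optional = re.compile(r"^[ \t]?if")

samples = ["if", " if", "\tif", " \t if"]

star_hits = [bool(star.search(s)) for s in samples]
optional_hits = [bool(optional.search(s)) for s in samples]

print("star:    ", star_hits)
print("optional:", optional_hits)
```

The * version accepts all four samples, including the mixed space and tab prefix, while the ? version rejects the last one because it has more than one leading whitespace character.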
You can try:
.
will be zero or more tabs or zero or more spaces not a mix of spaces and tabs.