egrep regex works in PHP but not in a Unix shell - escaping problem?
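I think my problem has to do with differences in escaping between using a regex in PHP and using it on the Bash command line.
Here is the regex that works in PHP:
$emailregex = '^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,6})$';
So I tried the following on the command line, but it doesn't seem to match anything (emails.txt is a long plain-text file with thousands of possibly malformed email addresses, one per line):
[root@host dir]# egrep '^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,6})$' emails.txt
I have tried surrounding the regex with double quotes instead of single quotes, but it made no difference. Do I need to add some backslashes to the regex?
SOLVED! Thank you! My file was created on Windows, and the extra CR in the end-of-line markers did not agree with the dollar sign in the regex: with a CR sitting before each line feed, the $ anchor never lined up with the end of the address.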
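For anyone hitting the same thing, here is a minimal sketch of how the CR problem can be reproduced and confirmed. The sample file and its two addresses are made up for illustration, and grep -E is used as the equivalent of egrep.

```shell
# Same pattern as in the question, single-quoted so the shell leaves it untouched.
re='^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,6})$'

# Hypothetical sample data: one line with a Unix (LF) ending, one with a Windows (CRLF) ending.
sample=$(mktemp)
printf 'unix@example.com\nwindows@example.com\r\n' > "$sample"

# Lines the anchored pattern accepts; the CRLF line has a CR between "com" and the end of line.
matched=$(grep -E "$re" "$sample")

# Count the lines that end in a carriage return to confirm the diagnosis.
cr=$(printf '\r')
crlf_lines=$(grep -c "${cr}\$" "$sample")

echo "matched: $matched"
echo "lines ending in CR: $crlf_lines"
rm -f "$sample"
```

A non-zero count of lines ending in CR is a strong hint that the file has DOS line endings and that the $ anchor is what keeps the pattern from matching.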
Comments (2)
Single quotes should work with bash...
It works for me in a simple case, with a single well-formed address on the line.
In your text file, each line has to contain only the email address. Any additional spaces on the line will throw it off; a line with an extra space in it prints nothing.
Your problem might be that you have a DOS-formatted file. In that case the extra \r keeps the regex from matching, because egrep sees an extra character at the end of the line. You can run dos2unix against the file, or make your regex less restrictive by removing the beginning and end markers (^ and $) from it.
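The example commands from this answer did not survive on this page, so what follows is only a sketch of the cases it describes, not the answerer's original commands. The test address is hypothetical, grep -E stands in for egrep, and tr -d '\r' is used as a stand-in for dos2unix in case that tool is not installed.

```shell
re='^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,6})$'

# Simple case: a clean line is matched and printed back.
clean=$(echo 'test@example.com' | grep -E "$re")

# A leading space defeats the ^ anchor, so grep prints nothing (and exits non-zero).
padded=$(echo ' test@example.com' | grep -E "$re" || true)

# CRLF input: delete the carriage returns first, then match as before.
fixed=$(printf 'test@example.com\r\n' | tr -d '\r' | grep -E "$re")

# Alternative: drop ^ and $ so the pattern may match anywhere in the line.
loose_re='[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,6})'
loose=$(printf 'test@example.com\r\n' | grep -E "$loose_re" | tr -d '\r')

echo "clean:  $clean"
echo "padded: $padded"
echo "fixed:  $fixed"
echo "loose:  $loose"
```

The trade-off with the unanchored pattern is that it also accepts lines with other text around the address, so it no longer filters out badly formed lines as strictly as the anchored version does.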
Works for me.
Beware of trailing whitespace, tabs, and carriage returns; they have a way of biting regexes.
There is a great reference on shell quoting here: http://www.mpi-inf.mpg.de/~uwe/lehre/unixffb/quoting-guide.html
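One way to guard against trailing whitespace is to trim each line before matching. This is a rough sketch with made-up sample lines, assuming a sed that supports the POSIX [[:space:]] class (which covers spaces, tabs, and carriage returns); grep -E is used as the equivalent of egrep.

```shell
re='^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,6})$'

# Hypothetical sample: trailing spaces, a trailing tab, a CRLF ending, and one clean line.
messy=$(mktemp)
printf 'a@example.com  \nb@example.com\t\nc@example.com\r\nd@example.com\n' > "$messy"

# Number of lines the anchored pattern accepts when the file is used as-is.
strict=$(grep -E -c "$re" "$messy")

# Trim trailing whitespace from every line, then count matches again.
trimmed=$(sed 's/[[:space:]]*$//' "$messy" | grep -E -c "$re")

echo "matches as-is: $strict"
echo "matches after trimming: $trimmed"
rm -f "$messy"
```

Trimming in a pipeline leaves the original file untouched, which is handy when the goal is only to find out whether invisible trailing characters are what stops the matches.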