How to use a long regex string in PHP
I have this regex string, which I got from a website, to pull emails from a file:
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
I've tested it in RegexBuddy (regex testing software) and it works!
When I copy and paste the regex from RegexBuddy into my PHP file, I have to escape the 2 " characters to make the regex a valid string in PHP.
In PHP I use it like this:
$file = file_get_contents(/* URL TO GET */);
$email_pattern = "(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|\"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*\")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])";
$matches = array();
if ( preg_match_all ( $email_pattern, $file, $matches ))
{
    echo print_r($matches, true);
}
But I get this warning:
Warning: preg_match_all() [function.preg-match-all]: Unknown modifier '@'
However, this regex works in RegexBuddy.
Where am I going wrong?
Comments (2)
Two things:
Step 1:
You need to put delimiters (the / character) before and after the regex, which also lets you add modifiers:
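To illustrate why the warning names '@': PHP treats the first character of the pattern as the opening delimiter, and an opening ( is paired with a closing ). The pattern therefore appears to end at the first balanced ), and whatever follows is read as modifiers. A minimal sketch, using a shortened made-up pattern rather than the full email regex:

```php
<?php
// Hypothetical, shortened pattern used only to reproduce the warning.
$subject = 'foo@bar';

// No explicit delimiters: "(" ... ")" is taken as the delimiter pair,
// so "@(bar)" is parsed as a list of modifiers.
$bad = @preg_match('(foo)@(bar)', $subject);   // @ hides the warning here
$err = error_get_last();                       // ...but it is still recorded

// Same pattern wrapped in "/" delimiters, with the "i" modifier added.
$good = preg_match('/(foo)@(bar)/i', $subject, $m);

var_dump($bad);              // bool(false)
echo $err['message'], "\n";  // message contains: Unknown modifier '@'
var_dump($good);             // int(1)
var_dump($m[0]);             // string(7) "foo@bar"
```

The email regex in the question begins with (?: and its first group closes right before the @, which matches the warning reported there.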
Step 2:
Since you're in a PHP string, you'll need to escape all the special characters (e.g. \ must become \\, and $ becomes \$, etc.).
So the escaping needed to include the regex in a PHP string should look like this:
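As a minimal sketch of that escaping (a made-up pattern, not the answerer's original code): suppose the regex must match a literal backslash followed by a lowercase letter, so the text PCRE needs to receive is \\[a-z]. In a double-quoted PHP string each of those backslashes has to be doubled again.

```php
<?php
// Hypothetical goal: match a literal backslash followed by a lowercase letter.
// The regex text PCRE must receive is:  \\[a-z]
$subject = 'a\b';   // three characters: a, backslash, b

// Double-quoted PHP string: every backslash meant for PCRE is doubled.
$escaped = "/\\\\[a-z]/";

// Backslashes not doubled: PHP collapses "\\" to "\", so PCRE receives \[a-z],
// i.e. a literal "[" followed by the text "a-z]", which is a different pattern.
$unescaped = "/\\[a-z]/";

echo $escaped, "\n";                         // /\\[a-z]/
echo $unescaped, "\n";                       // /\[a-z]/
var_dump(preg_match($escaped, $subject));    // int(1)
var_dump(preg_match($unescaped, $subject));  // int(0)
```

The email regex in the question contains the same \\[ sequence twice, so pasting it into a double-quoted string without doubling the backslashes silently changes its meaning.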
You also have to escape /, since we use that character as the delimiter in step 1. The regex needs to see \/, but because we express the regex in a PHP string, we replace / with \\/
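A small sketch of the delimiter escaping, assuming a toy pattern that contains a single /. It also shows an alternative the answer does not mention: choosing a delimiter character that does not occur in the pattern.

```php
<?php
// Hypothetical pattern containing the delimiter character "/".
$subject = 'a/b';

$escaped   = '/^a\/b$/';   // "/" inside the pattern escaped as "\/"
$otherChar = '#^a/b$#';    // or pick a delimiter the pattern does not contain

var_dump(preg_match($escaped, $subject));    // int(1)
var_dump(preg_match($otherChar, $subject));  // int(1)

// In a PHP string literal "\/" is not an escape sequence, so writing
// \/ or \\/ produces the same two characters in both quote styles.
var_dump('\\/' === '\/');                    // bool(true)
var_dump("\\/" === "\/");                    // bool(true)
```

The email pattern in the question already contains /, #, ~ and many other punctuation characters, so switching delimiters does not remove the need for escaping there.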
If I'm right, it should give something like this (I usually use RegexBuddy's PHP export tool for this conversion too, but I don't have it right now, so I did it by hand):
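One possible end result, as a sketch rather than the answerer's hand-escaped string: the regex text is copied unchanged from the question into a nowdoc (which performs no escape processing at all), the / characters are escaped programmatically, and the delimiters are added afterwards. The nowdoc approach, the i modifier, and the sample input are assumptions made for illustration.

```php
<?php
// Sketch only: the regex text is copied from the question; the nowdoc,
// the "/" delimiter, the "i" modifier and the sample text are assumptions.
$raw = <<<'REGEX'
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
REGEX;

// Escape the delimiter character inside the pattern, then wrap it.
$email_pattern = '/' . str_replace('/', '\/', $raw) . '/i';

// Stand-in for the file_get_contents() call in the question.
$file = 'Contact: john.doe@example.com, jane@test.org';

$matches = array();
if (preg_match_all($email_pattern, $file, $matches)) {
    print_r($matches[0]);   // lists the two addresses found in $file
}
```

Keeping the pattern in a nowdoc means it can be pasted from the regex tool as-is, so the only transformation left is the one for the delimiter.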
I would also suggest that you put the string inside single quotes.
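A short sketch of what single quotes change, using made-up strings: PHP no longer interprets sequences such as \x41 itself, but \\ and \' are still processed, so a pattern containing an apostrophe or a literal backslash (as the email regex does) still needs some escaping.

```php
<?php
// Double quotes: PHP itself turns "\x41" into the single character "A".
// Single quotes: the four characters \x41 reach PCRE unchanged.
var_dump(strlen("\x41"));   // int(1)
var_dump(strlen('\x41'));   // int(4)

// Single quotes still treat \\ and \' specially. This made-up pattern
// needs both: an apostrophe and a literal backslash.
$pattern = '/it\'s\\\\/';   // PCRE receives: /it's\\/
$subject = 'it\'s\\';       // the text: it's followed by one backslash

var_dump(preg_match($pattern, $subject));  // int(1)
```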
I tried, and...
Single quotes give an error...
Using double quotes with {} as the delimiters (instead of //) also gives an error.