Problem with a negative lookbehind regex match

Posted on 2024-11-25 16:35:37

I am trying to match email addresses, but only when they are not preceded by "mailto:". I tried this regular expression:

"/(?<!mailto:)[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4})/"

against this string:
'<a href="mailto:[email protected]">EMAIL</a> ... [email protected] '

I would expect to catch only '[email protected]', but I also receive '[email protected]' - see missing 's'. I wonder what's wrong here. Can't I have a normal regex after the lookbehind assertion?

My whole example in PHP looks like:

$testString = '<a href="mailto:[email protected]">EMAIL</a>  ...   [email protected] ';
$pattern = "/(?<!mailto:)[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4})/";
preg_match_all($pattern, $testString, $matches);
echo('<pre>');print_r($matches);echo('</pre>');

Thank you!

枫以 2024-12-02 16:35:37

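Because after the s there is a string that matches your regex, [email protected], and because s is hardly mailto:, it matches. The lookbehind is only checked at the position where a match attempt begins: the attempt that starts at s is rejected, but the next attempt starts one character later, where the preceding text is no longer exactly mailto:. Putting a word boundary in there will work for most cases:

Change:

(?<!mailto:)

To:

(?<!mailto:)\b

Side note: use example.com for examples; domain.com is owned by an actual company.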
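To see the difference, here is a minimal sketch that runs the original pattern and the word-boundary variant side by side. The two addresses are hypothetical stand-ins on example.com (the real ones are obscured in the question), and the variable names $email, $original and $fixed are only for illustration.

```php
<?php
// Hypothetical stand-in addresses; the ones in the question are obscured.
$testString = '<a href="mailto:someemail@example.com">EMAIL</a> ... other@example.com ';

// The email part of the pattern, copied unchanged from the question.
$email = '[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4})';

$original = "/(?<!mailto:)$email/";     // lookbehind only
$fixed    = "/(?<!mailto:)\\b$email/";  // lookbehind followed by a word boundary

preg_match_all($original, $testString, $m1);
preg_match_all($fixed, $testString, $m2);

// $m[0] holds the full matches; the other entries are the capture groups.
print_r($m1[0]); // omeemail@example.com and other@example.com
print_r($m2[0]); // other@example.com only
```

With the word boundary, a match can no longer begin in the middle of "someemail", so only the plain address is left. This is still only a "most cases" fix: a local part containing a dot or hyphen, such as mailto:some.email@example.com, has a word boundary right before "email", so email@example.com would still be matched.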
