Help writing a regular expression

Posted on 2024-10-21 17:58:06

I have the following string: user1 fam <[email protected]>, user2 fam <[email protected]>, ...

How can I extract the mail addresses from this string with a regular expression? I need the output to be a list of mail addresses:

[email protected]
[email protected]

I tried:

<.*>

But its output includes the < >:

   <[email protected]>
   <[email protected]>

Thank you.

P.S. Thank you @xanatos for the comment; I use Erlang.

Comments (4)

∝单色的世界 2024-10-28 17:58:06

As the others have said, but to make it faster:

<([^>]*)>

This way the regex won't have to backtrack. A greedy <.*> first matches to the end of the string and then rolls back to find a >, and a lazy <.*?> has to stop and check for a > after every character; [^>]* simply consumes everything up to the next >.
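
To make the difference concrete, here is a minimal sketch in Python (chosen only because it is easy to run; the question itself is about Erlang). The addresses a@example.com and b@example.com are invented placeholders, not the ones from the question.

```python
import re

# Placeholder input: the addresses are invented for illustration.
text = "user1 fam <a@example.com>, user2 fam <b@example.com>, ..."

# Greedy .* runs to the end of the string, then backtracks to the LAST '>',
# so both bracket pairs end up inside one match.
greedy = re.findall(r"<.*>", text)

# [^>]* can never run past a '>', so each bracket pair is matched on its own,
# and the capturing group leaves the angle brackets out of the result.
addresses = re.findall(r"<([^>]*)>", text)

print(greedy)
print(addresses)
```

The capturing group is what drops the < and > from the output; the negated class is what keeps each match inside a single pair of brackets.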

I'll add that, for historical reasons, there is a small difference between . and, for example, [\s\S]. Both match every character, with one exception: by default . does not match \n, while [\s\S] does. [^>] matches \n as well, so by using it you are also catching newlines, but this shouldn't be a problem for what you are doing. http://www.regular-expressions.info/dot.html
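
The newline behaviour can be checked with a small Python sketch. The input is an invented sample in which the bracketed part is split across two lines; re.DOTALL is Python's name for the single-line option.

```python
import re

# Invented sample: the bracketed part contains a newline.
text = "user1 fam <a@\nexample.com>"

# By default . does not match \n, so the pattern cannot reach the '>'.
dot = re.findall(r"<(.*?)>", text)

# With DOTALL, . matches \n as well.
dotall = re.findall(r"<(.*?)>", text, re.DOTALL)

# [\s\S] matches any character, newline included, regardless of flags.
any_char = re.findall(r"<([\s\S]*?)>", text)

# [^>] means "anything but '>'", which also includes \n.
negated = re.findall(r"<([^>]*)>", text)

print(dot, dotall, any_char, negated)
```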

Just to be complete, because this problem comes up often, there is another variant:

<((?:(?!>).)*)>

(You can replace the . with [\s\S] if you want, or use the SingleLine option, if your language supports it, so that . also matches \n.) The point here is that the "stop" expression can be longer than one character: instead of (?!>) you could have written (?!%%), and it would stop at %%. But I'm not sure this variant works in Erlang (I hadn't noticed the new tag; it wasn't there when I originally read the question, and I'm not an Erlang programmer... and it seems at least two Erlang programmers have different opinions on the matter :-) ).
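
As an illustration of the multi-character stop, here is a Python sketch. The input format (a < opener and a %% terminator) is made up for the example, and nothing here is tested against Erlang's re module.

```python
import re

# Invented sample: fields open with '<' and close with the two-character '%%'.
# The second field contains a single '%' that must NOT end the match.
text = "<first%%, <second 50% done%%"

# (?:(?!%%).)* consumes one character at a time, but only while the next
# two characters are not '%%'.
tempered = re.findall(r"<((?:(?!%%).)*)%%", text)

# A negated class can only exclude single characters, so it gives up on the
# second field as soon as it meets the lone '%'.
negated = re.findall(r"<([^%]*)%%", text)

print(tempered)
print(negated)
```

When the terminator is a single character, as with > in the question, the negated class is the simpler choice; the lookahead form is only needed for longer terminators.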

爱冒险 2024-10-28 17:58:06
  • You need the ungreedy option so that each bracket pair is matched individually.

  • global, so that you get all the matches.

  • {capture, all_but_first, list}, so that you get the actual values (list can be binary instead if you prefer binary results). all_but_first tells re not to return the whole match (which would include the < >), only the group.

Result:

1> S.
"user1 fam <[email protected]>, user2 fam <[email protected]>, "
2> re:run(S, "<(.+)>", [ungreedy, global, {capture, all_but_first, list}]).
{match,[["[email protected]"],["[email protected]"]]}
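
For readers more at home in Python, a rough analogue of the three re:run options might look like the sketch below. The mapping is an assumption made for illustration: Python's re has no ungreedy flag, so the lazy quantifier +? stands in for it, finditer plays the role of global, and group(1) the role of all_but_first. The addresses are invented placeholders.

```python
import re

# Placeholder input: the addresses are invented for illustration.
text = "user1 fam <a@example.com>, user2 fam <b@example.com>, "

# +?       : lazy quantifier, standing in for Erlang's ungreedy option
# finditer : yields every match, like global
# group(1) : only the captured group, like {capture, all_but_first, list}
captured = [[m.group(1)] for m in re.finditer(r"<(.+?)>", text)]

print(captured)
```

Each match is wrapped in its own list only to mirror the shape of the Erlang result, one inner list per match.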
吾性傲以野 2024-10-28 17:58:06

Use groups. See your regex engine's documentation for more details.

>>> re.findall('<(.*?)>', 'user1 fam <[email protected]>, user2 fam <[email protected]>, ...')
['[email protected]', '[email protected]']
岁吢 2024-10-28 17:58:06

Keep it simple and use <([^>]*)>, which is about as fast as it can get and works in most regex flavors. It is faster because it never has to backtrack, whereas <(.*?)> does.
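
If you want to check the speed claim on your own engine, a small timing sketch in Python could look like this. The input and the repetition counts are arbitrary assumptions, and the timings depend on the regex engine and the data, so no particular numbers are promised.

```python
import re
import timeit

# Invented input: 1000 repetitions of a placeholder entry.
text = "user fam <a@example.com>, " * 1000

negated = re.compile(r"<([^>]*)>")
lazy = re.compile(r"<(.*?)>")

# Both patterns must extract the same addresses before timing means anything.
same_result = negated.findall(text) == lazy.findall(text)

# Time 50 full scans of the input with each pattern.
t_negated = timeit.timeit(lambda: negated.findall(text), number=50)
t_lazy = timeit.timeit(lambda: lazy.findall(text), number=50)

print(f"same result: {same_result}")
print(f"[^>]*: {t_negated:.4f}s   .*?: {t_lazy:.4f}s")
```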
