Replacing @replies in tweet text with HTML hyperlinks without replacing email addresses
I'm detecting @replies in a Twitter stream with the following PHP code, which uses regexes. The first pattern replaces an @reply at the beginning of the string; the second replaces @replies that follow a space.
$text = preg_replace('!^@([A-Za-z0-9_]+)!', '<a href="http://twitter.com/$1" target="_blank">@$1</a>', $text);
$text = preg_replace('! @([A-Za-z0-9_]+)!', ' <a href="http://twitter.com/$1" target="_blank">@$1</a>', $text);
How can I best combine these two rules without falsely flagging [email protected] as a reply?
Comments (8)
OK, on second thought, not flagging whatever@email means that the element before the @ has to be a "non-word" item, because any element that could be contained in a word could be part of an email address. So the pattern has to capture that preceding non-word element in a group of its own, but then you have to use $2 instead of $1 for the username.
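The pattern this answer originally showed did not survive on this page. A minimal sketch of the idea, assuming the leading group is "start of string or one non-word character" (the function name and the exact pattern are illustrative guesses, not the answerer's code):

```php
<?php
// Hypothetical helper: the name and the exact pattern are assumptions.
function link_replies_nonword(string $text): string
{
    // Group 1: start of string or a single non-word character,
    //          re-inserted through $1 so it is not lost.
    // Group 2: the username, which is why the replacement uses $2.
    return preg_replace(
        '!(^|\W)@([A-Za-z0-9_]+)!',
        '$1<a href="http://twitter.com/$2" target="_blank">@$2</a>',
        $text
    );
}
```

With this sketch, an address such as joe@example.com is left alone because the character before the @ is a word character.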
Since the ^ does not have to stand at the beginning of the RE, you can use grouping and | to combine those REs. If you don't want to re-insert the whitespace you captured, you have to use a "positive lookbehind" or a "negative lookbehind", whichever you find easier to understand.
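The two lookbehind patterns from this answer are missing here, so the following is only a sketch of what they might look like. It assumes a literal space as the separator, as in the question; both function names are hypothetical. Note that the question's `!` delimiter cannot be used once the pattern itself contains `(?<!`, so `~` is used instead.

```php
<?php
// Positive lookbehind: the @ must be preceded by the start of the
// string or by a space. Nothing before the @ is consumed.
function link_replies_lookbehind_pos(string $text): string
{
    return preg_replace(
        '~(?<=^| )@([A-Za-z0-9_]+)~',
        '<a href="http://twitter.com/$1" target="_blank">@$1</a>',
        $text
    );
}

// Negative lookbehind: the @ must NOT be preceded by a non-space
// character, which is also true at the very start of the string.
function link_replies_lookbehind_neg(string $text): string
{
    return preg_replace(
        '~(?<![^ ])@([A-Za-z0-9_]+)~',
        '<a href="http://twitter.com/$1" target="_blank">@$1</a>',
        $text
    );
}
```

Because a lookbehind is zero-width, the replacement string no longer needs the leading space, and the username stays in $1.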
Here's how I'd do the combination:
Use alternation in a non-capturing group and, if the space is matched, forget it using \K. Use (\w+) to capture alphanumeric and underscore characters. The full-string match will retain the @. Capture group 1 will contain the text after the @.
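The answer's code and demo link are not preserved on this page. A sketch of the described technique, assuming a literal space as the alternative to the start of the string (the function name is hypothetical):

```php
<?php
// Hypothetical helper illustrating \K; not the answerer's original code.
function link_replies_k(string $text): string
{
    // (?:^| \K) : either the start of the string, or a space that \K
    //             then drops from the reported match.
    // (\w+)     : the username (letters, digits, underscore) in group 1.
    // $0        : the full match, which still includes the @.
    return preg_replace(
        '~(?:^| \K)@(\w+)~',
        '<a href="http://twitter.com/$1" target="_blank">$0</a>',
        $text
    );
}
```

Since \K resets the start of the match, the space is left in the text and does not have to be written back in the replacement.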
(?<!\S) is loosely translated to "no preceding non-whitespace character". It is sort of a double negation, but it also works at the start of the string/line. This won't consume any preceding character, won't use any capturing group for it, and won't match strings such as "[email protected]", which is a valid e-mail address.
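The test run that followed this answer is missing here. As a stand-in, here is a small sketch that lists the usernames the lookbehind accepts; the function name and the sample input are made up for illustration.

```php
<?php
// Hypothetical helper: returns the usernames of all @replies that are
// not preceded by a non-whitespace character.
function find_replies(string $text): array
{
    preg_match_all('/(?<!\S)@(\w+)/', $text, $matches);
    return $matches[1]; // group 1 holds the names without the @
}
```

For the input "@alice hi @bob\n@carol joe@example.com x!@y" this sketch returns alice, bob and carol: the newline counts as whitespace, while the @ signs preceded by "e" and "!" are skipped.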
Hu, guys, don't push too far... Here it is:
I think you can use alternation, so look for the beginning of a string or a space.
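As a sketch of that suggestion, the two rules from the question can be merged into one call by capturing "start of string or space" and writing it back. The function name is hypothetical, and the pattern is one plausible reading of the answer rather than code it supplied.

```php
<?php
// Hypothetical helper combining the question's two preg_replace calls.
function link_replies_alt(string $text): string
{
    // Group 1 is either empty (start of string) or the matched space,
    // so $1 restores it; the username moves to group 2.
    return preg_replace(
        '!(^| )@([A-Za-z0-9_]+)!',
        '$1<a href="http://twitter.com/$2" target="_blank">@$2</a>',
        $text
    );
}
```

This keeps the question's exact behavior: only an @ at the start of the string or directly after a space is linked, so "(@bob)" and email addresses are left untouched.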