How does whitespace matched with a positive lookahead assertion end up in the final matched string in a Python regular expression?

Posted 2024-11-26 17:05:31

While trying to answer this question, I wrote this Python regular expression to match any egg substring followed by a digit, provided it is not part of a URL that starts with http://:

>>> import re
>>> r = re.compile(r'(?:\s(?!http://\S*))egg\d')

Then I applied it to the following string:

>>> a = "a egg1 http://egg2.com egg3 http://www.egg4.org egg5"

The result is:

>>> r.findall(a)
[' egg1', ' egg3', ' egg5']

The regular expression is wrong in a lot of other ways, but one thing bugged me more: why does the whitespace appear in the result? Since I used a lookahead assertion like (?:\s...), shouldn't it be taken out of the resulting strings?
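For comparison, here is a minimal sketch of the behaviour in question. It assumes that the difference comes from (?:...) being a non-capturing group, which still consumes what it matches, whereas (?!...) and (?<=...) are zero-width assertions. The lookbehind variant and the names consuming and zero_width are illustrative and not part of the original post.

```python
import re

a = "a egg1 http://egg2.com egg3 http://www.egg4.org egg5"

# Pattern from the question: (?:...) is a non-capturing group, so the \s
# inside it is consumed and becomes part of each match. Only the inner
# (?!...) is a zero-width (negative lookahead) assertion.
consuming = re.compile(r'(?:\s(?!http://\S*))egg\d')

# Illustrative variant (an assumption, not from the original post):
# a lookbehind (?<=\s) only checks that whitespace precedes "egg"
# without making it part of the match.
zero_width = re.compile(r'(?<=\s)egg\d')

print(consuming.findall(a))   # [' egg1', ' egg3', ' egg5']
print(zero_width.findall(a))  # ['egg1', 'egg3', 'egg5']
```

With the lookbehind, the leading whitespace is tested but not returned, which may be closer to what the pattern was meant to do.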
