Do regular expressions in the re module support word boundaries (\b)?
While trying to learn a little more about regular expressions, I found a tutorial suggesting that you can use \b to match a word boundary. However, the following snippet in the Python interpreter does not work as expected:
>>> import re
>>> x = 'one two three'
>>> y = re.search("\btwo\b", x)
It should have been a match object if anything was matched, but it is None.
Is the \b expression not supported in Python, or am I using it wrong?
Comments (5)
You should be using raw strings in your code.
Also, why don't you try
Output:
This will work:
re.search(r"\btwo\b", x)
When you write "\b" in Python, it is a single character: "\x08". Either escape the backslash ("\\b") or write a raw string (r"\b").
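To make the difference between the three spellings concrete, here is a small sketch that inspects the strings themselves before any regex is involved. The variable names (plain, escaped, raw) are only illustrative and do not come from the answer.

```python
# A normal string literal turns \b into one character: backspace (U+0008).
plain = "\b"

# Escaping the backslash, or using a raw string, keeps two characters:
# a backslash followed by the letter b. This is what the regex engine
# needs to see in order to treat it as a word boundary.
escaped = "\\b"
raw = r"\b"

print(len(plain), repr(plain))    # one character
print(len(escaped), len(raw))     # two characters each
print(escaped == raw)             # both spellings build the same string
```

The escaped and raw forms are interchangeable; the raw string is usually easier to read in patterns with many backslashes.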
Just to explicitly explain why re.search("\btwo\b", x) doesn't work: \b in a Python string is shorthand for a backspace character. So the pattern "\btwo\b" is looking for a backspace, followed by two, followed by another backspace, which the string you're searching in (x = 'one two three') doesn't have.
To allow re.search (or compile) to interpret the sequence \b as a word boundary, either escape the backslashes ("\\btwo\\b") or use a raw string to create your pattern (r"\btwo\b").
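The explanation above can be checked directly. This is a minimal sketch that reuses x from the question; the other names (no_match, escaped_match, raw_match, with_backspaces) are made up for illustration.

```python
import re

x = 'one two three'

# Normal string literal: the pattern contains two backspace characters,
# which do not occur in x, so nothing matches.
no_match = re.search("\btwo\b", x)

# Escaped backslashes and a raw string both hand the two-character
# sequence \b to the regex engine, which reads it as a word boundary.
escaped_match = re.search("\\btwo\\b", x)
raw_match = re.search(r"\btwo\b", x)

# The first pattern does match a string that really contains backspaces.
with_backspaces = re.search("\btwo\b", "one \x08two\x08 three")

print(no_match)
print(escaped_match.group(), raw_match.group())
print(with_backspaces is not None)
```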
Python documentation
https://docs.python.org/2/library/re.html#regular-expression-syntax
Just a note: for a dynamic variable this will not work. Use r"\b" on the left and right.
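One way to read this advice is sketched below: build the pattern by concatenating r"\b" on each side of the variable. The helper name find_word is hypothetical, and the call to re.escape is an addition not mentioned in the answer; it is there so that any regex metacharacters in the word are matched literally.

```python
import re

def find_word(word, text):
    """Search text for word as a whole word (hypothetical helper)."""
    # r"\b" on the left and right keeps the word-boundary markers intact;
    # re.escape (an assumption added here) neutralises special characters
    # that the dynamic value might contain.
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text)

hit = find_word("two", "one two three")
miss = find_word("two", "one twofold three")

print(hit.group())
print(miss)
```

Note that \b only fires between a word character and a non-word character, so this approach suits values that begin and end with letters, digits, or underscores.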