Why doesn't re.compile(rf'(\b{re.escape('g(x)')}\b)+') match the string 'g(x)'?

Posted on 2025-02-09 03:56:02

I ran the following code in Python 3.8:

import re

a,b,c = 'g(x)g', '(x)g', 'g(x)'
a_re = re.compile(rf"(\b{re.escape(a)}\b)+",re.I)
b_re = re.compile(rf"(\b{re.escape(b)}\b)+",re.I)
c_re = re.compile(rf"(\b{re.escape(c)}\b)+",re.I)

a_re.findall('g(x)g')
b_re.findall('(x)g')
c_re.findall('g(x)')
c_re.findall(' g(x) ')

The result I want is below.

['g(x)g']
['(x)g']
['g(x)']
['g(x)']

But the actual result is below.

['g(x)g']
[]
[]
[]

The following conditions must be observed:

A combination of variables and an f-string must be used.
\b must not be removed.

The reason is that I want to know whether a sentence contains certain characters.

How can I get the results I want?

\b works fine for words made of ordinary characters, but it does not work for words that start with '(' or end with ')'.

Is there an alternative to \b that can be used for such words?

I need something with the same behavior as \b, because I want to make sure that the sentence contains a specific word.

感情洁癖 2025-02-16 03:56:02

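\b matches at the boundary between a \w and a \W character, or between a \w character and the start or end of the string (see the Python re docs). That is why only your first pattern gives a result: 'g(x)g' starts and ends with a word character, while '(x)g' starts with '(' and 'g(x)' ends with ')', so \b cannot match at those ends.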
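To see this concretely, the positions where \b matches can be listed for each of the three strings. This is only an illustrative sketch; the helper name boundary_positions is made up for this example and does not come from the question.

```python
import re

def boundary_positions(text):
    """Return every index in `text` at which \\b matches."""
    # re.finditer yields one empty match per boundary position.
    return [m.start() for m in re.finditer(r"\b", text)]

print(boundary_positions("g(x)g"))  # [0, 1, 2, 3, 4, 5]
print(boundary_positions("(x)g"))   # [1, 2, 3, 4]  (no boundary at index 0)
print(boundary_positions("g(x)"))   # [0, 1, 2, 3]  (no boundary at the end)
```

There is no boundary before a leading '(' or after a trailing ')', so a pattern that demands \b at those positions can never match.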

To get the expected result, your patterns should look like this:

a_re = re.compile(rf"(\b{re.escape(a)}\b)+",re.I)  # No change
b_re = re.compile(rf"({re.escape(b)}\b)+",re.I)  # No '\b' at the beginning
c_re = re.compile(rf"(\b{re.escape(c)})+",re.I)  # No '\b' at the end
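If you would rather keep a single pattern template for every word, one common alternative (not part of the answer above, so treat it as a sketch) is to replace \b with the lookarounds (?<!\w) and (?!\w). They assert only that the match is not glued to a word character, so they also succeed next to '(' and ')'. The helper name whole_word is hypothetical.

```python
import re

def whole_word(word):
    """Compile a pattern that matches `word` only when no word character
    sits directly before or after it."""
    # (?<!\w): not preceded by a word character
    # (?!\w):  not followed by a word character
    return re.compile(rf"((?<!\w){re.escape(word)}(?!\w))+", re.I)

a, b, c = 'g(x)g', '(x)g', 'g(x)'
print(whole_word(a).findall('g(x)g'))    # ['g(x)g']
print(whole_word(b).findall('(x)g'))     # ['(x)g']
print(whole_word(c).findall('g(x)'))     # ['g(x)']
print(whole_word(c).findall(' g(x) '))   # ['g(x)']
print(whole_word(c).findall('g(x)g'))    # []
```

Note that this is stricter than simply dropping \b on one side: '(x)g' will not be found inside 'f(x)g', because the '(' is preceded by a word character. Whether that is what you want depends on how you define a "word".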

爱已欠费 2025-02-16 03:56:02

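You can write your own \b by matching the start of the string, the end of the string, or a separator character, without capturing it:

(^|[ .\"\']) start of string or a separator
($|[ .\"\']) end of string or a separator
(?:...) non-capturing group, so findall returns only the word itself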

>>> a_re = re.compile(rf"(?:^|[ .\"\'])({re.escape(a)})(?:$|[ .\"\'])", re.I)
>>> b_re = re.compile(rf"(?:^|[ .\"\'])({re.escape(b)})(?:$|[ .\"\'])", re.I)
>>> c_re = re.compile(rf"(?:^|[ .\"\'])({re.escape(c)})(?:$|[ .\"\'])", re.I)
>>> a_re.findall('g(x)g')
['g(x)g']
>>> b_re.findall('(x)g')
['(x)g']
>>> c_re.findall('g(x)')
['g(x)']
>>> c_re.findall(' g(x) ')
['g(x)']
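One caveat with this approach is that the separators are consumed as part of the match, so two occurrences that share a single separator cannot both be found. A possible variation, sketched below under the assumption that the same separator set is wanted, expresses the boundaries as negative lookarounds instead. The names NOT_SEP, consuming and lookaround are made up for this illustration.

```python
import re

# Any character that is NOT one of the separators used above: space . " '
NOT_SEP = "[^ .\"']"

def consuming(word):
    # Approach from the answer: the separator is matched (and consumed).
    return re.compile(rf"(?:^|[ .\"'])({re.escape(word)})(?:$|[ .\"'])", re.I)

def lookaround(word):
    # Variation: only assert that no non-separator character is adjacent.
    # This also succeeds at the start and end of the string.
    return re.compile(rf"(?<!{NOT_SEP})({re.escape(word)})(?!{NOT_SEP})", re.I)

text = 'g(x) g(x)'
print(consuming('g(x)').findall(text))   # ['g(x)']
print(lookaround('g(x)').findall(text))  # ['g(x)', 'g(x)']
```

In the consuming version the space after the first 'g(x)' is used up, so the second occurrence has no separator left in front of it. The lookaround version leaves the space in place for the next match.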
