不区分大小写的正则表达式无需重新编译?

发布于 2024-07-13 05:10:12 字数 485 浏览 13 评论 0 原文

在 Python 中,我可以使用 re.compile 将正则表达式编译为不区分大小写:

>>> s = 'TeSt'
>>> casesensitive = re.compile('test')
>>> ignorecase = re.compile('test', re.IGNORECASE)
>>> 
>>> print casesensitive.match(s)
None
>>> print ignorecase.match(s)
<_sre.SRE_Match object at 0x02F0B608>

有没有办法做到同样的事情,但不使用 re.compile。 我在文档中找不到类似 Perl 的 i 后缀(例如 m/test/i)的内容。

In Python, I can compile a regular expression to be case-insensitive using re.compile:

>>> s = 'TeSt'
>>> casesensitive = re.compile('test')
>>> ignorecase = re.compile('test', re.IGNORECASE)
>>> 
>>> print casesensitive.match(s)
None
>>> print ignorecase.match(s)
<_sre.SRE_Match object at 0x02F0B608>

Is there a way to do the same, but without using re.compile. I can't find anything like Perl's i suffix (e.g. m/test/i) in the documentation.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(11

原来分手还会想你 2024-07-20 05:10:12

re.IGNORECASE 传递给 flags 参数"noreferrer">搜索匹配,或sub

re.search('test', 'TeSt', re.IGNORECASE)
re.match('test', 'TeSt', re.IGNORECASE)
re.sub('test', 'xxxx', 'Testing', flags=re.IGNORECASE)

Pass re.IGNORECASE to the flags param of search, match, or sub:

re.search('test', 'TeSt', re.IGNORECASE)
re.match('test', 'TeSt', re.IGNORECASE)
re.sub('test', 'xxxx', 'Testing', flags=re.IGNORECASE)
青丝拂面 2024-07-20 05:10:12

您还可以使用不带 IGNORECASE 标志的搜索/匹配来执行不区分大小写的搜索(在 Python 2.7.3 中测试):

re.search(r'(?i)test', 'TeSt').group()    ## returns 'TeSt'
re.match(r'(?i)test', 'TeSt').group()     ## returns 'TeSt'

You can also perform case insensitive searches using search/match without the IGNORECASE flag (tested in Python 2.7.3):

re.search(r'(?i)test', 'TeSt').group()    ## returns 'TeSt'
re.match(r'(?i)test', 'TeSt').group()     ## returns 'TeSt'
帅气尐潴 2024-07-20 05:10:12

不区分大小写的标记 (?i) 可以直接合并到正则表达式模式中:

>>> import re
>>> s = 'This is one Test, another TEST, and another test.'
>>> re.findall('(?i)test', s)
['Test', 'TEST', 'test']

The case-insensitive marker, (?i) can be incorporated directly into the regex pattern:

>>> import re
>>> s = 'This is one Test, another TEST, and another test.'
>>> re.findall('(?i)test', s)
['Test', 'TEST', 'test']
迷迭香的记忆 2024-07-20 05:10:12

您还可以在模式编译期间定义不区分大小写:

pattern = re.compile('FIle:/+(.*)', re.IGNORECASE)

You can also define case insensitive during the pattern compile:

pattern = re.compile('FIle:/+(.*)', re.IGNORECASE)
妄想挽回 2024-07-20 05:10:12

在导入中

import re

在运行时处理中:

RE_TEST = r'test'
if re.match(RE_TEST, 'TeSt', re.IGNORECASE):

应该提到的是,不使用 re.compile 是一种浪费。 每次调用上面的 match 方法时,都会编译正则表达式。 这在其他编程语言中也是错误的做法。 下面是更好的做法。

在应用程序初始化中:

self.RE_TEST = re.compile('test', re.IGNORECASE)

在运行时处理中:

if self.RE_TEST.match('TeSt'):

In imports

import re

In run time processing:

RE_TEST = r'test'
if re.match(RE_TEST, 'TeSt', re.IGNORECASE):

It should be mentioned that not using re.compile is wasteful. Every time the above match method is called, the regular expression will be compiled. This is also faulty practice in other programming languages. The below is the better practice.

In app initialization:

self.RE_TEST = re.compile('test', re.IGNORECASE)

In run time processing:

if self.RE_TEST.match('TeSt'):
待天淡蓝洁白时 2024-07-20 05:10:12

要执行不区分大小写的操作,请提供 re.IGNORECASE

>>> import re
>>> test = 'UPPER TEXT, lower text, Mixed Text'
>>> re.findall('text', test, flags=re.IGNORECASE)
['TEXT', 'text', 'Text']

,如果我们想替换与大小写匹配的文本...

>>> def matchcase(word):
        def replace(m):
            text = m.group()
            if text.isupper():
                return word.upper()
            elif text.islower():
                return word.lower()
            elif text[0].isupper():
                return word.capitalize()
            else:
                return word
        return replace

>>> re.sub('text', matchcase('word'), test, flags=re.IGNORECASE)
'UPPER WORD, lower word, Mixed Word'

To perform case-insensitive operations, supply re.IGNORECASE

>>> import re
>>> test = 'UPPER TEXT, lower text, Mixed Text'
>>> re.findall('text', test, flags=re.IGNORECASE)
['TEXT', 'text', 'Text']

and if we want to replace text matching the case...

>>> def matchcase(word):
        def replace(m):
            text = m.group()
            if text.isupper():
                return word.upper()
            elif text.islower():
                return word.lower()
            elif text[0].isupper():
                return word.capitalize()
            else:
                return word
        return replace

>>> re.sub('text', matchcase('word'), test, flags=re.IGNORECASE)
'UPPER WORD, lower word, Mixed Word'
左岸枫 2024-07-20 05:10:12

对于不区分大小写的正则表达式(Regex):
有两种方法可以添加代码:

  1. flags=re.IGNORECASE

    Regx3GList = re.search("(WCDMA:)((\d*)(,?))*", txt, re.IGNORECASE) 
      
  2. 不区分大小写的标记(?i)

    Regx3GList = re.search("**(?i)**(WCDMA:)((\d*)(,?))*", txt) 
      

For Case insensitive regular expression(Regex):
There are two ways by adding in your code:

  1. flags=re.IGNORECASE

    Regx3GList = re.search("(WCDMA:)((\d*)(,?))*", txt, re.IGNORECASE)
    
  2. The case-insensitive marker (?i)

    Regx3GList = re.search("**(?i)**(WCDMA:)((\d*)(,?))*", txt)
    
对风讲故事 2024-07-20 05:10:12
#'re.IGNORECASE' for case insensitive results short form re.I
#'re.match' returns the first match located from the start of the string. 
#'re.search' returns location of the where the match is found 
#'re.compile' creates a regex object that can be used for multiple matches

 >>> s = r'TeSt'   
 >>> print (re.match(s, r'test123', re.I))
 <_sre.SRE_Match object; span=(0, 4), match='test'>
 # OR
 >>> pattern = re.compile(s, re.I)
 >>> print(pattern.match(r'test123'))
 <_sre.SRE_Match object; span=(0, 4), match='test'>
#'re.IGNORECASE' for case insensitive results short form re.I
#'re.match' returns the first match located from the start of the string. 
#'re.search' returns location of the where the match is found 
#'re.compile' creates a regex object that can be used for multiple matches

 >>> s = r'TeSt'   
 >>> print (re.match(s, r'test123', re.I))
 <_sre.SRE_Match object; span=(0, 4), match='test'>
 # OR
 >>> pattern = re.compile(s, re.I)
 >>> print(pattern.match(r'test123'))
 <_sre.SRE_Match object; span=(0, 4), match='test'>
勿忘初心 2024-07-20 05:10:12

如果您想替换但仍保留之前 str 的风格。 有可能的。

例如:突出显示字符串“test asdasd TEST asd tEst asdasd”。

sentence = "test asdasd TEST asd tEst asdasd"
result = re.sub(
  '(test)', 
  r'<b>\1</b>',  # \1 here indicates first matching group.
  sentence, 
  flags=re.IGNORECASE)

测试 asdasd 测试 asd tEst asdasd

If you would like to replace but still keeping the style of previous str. It is possible.

For example: highlight the string "test asdasd TEST asd tEst asdasd".

sentence = "test asdasd TEST asd tEst asdasd"
result = re.sub(
  '(test)', 
  r'<b>\1</b>',  # \1 here indicates first matching group.
  sentence, 
  flags=re.IGNORECASE)

test asdasd TEST asd tEst asdasd

天生の放荡 2024-07-20 05:10:12

(?i) 使用以下有效标志匹配模式的其余部分: i 修饰符:不敏感。 不区分大小写的匹配(忽略 [a-zA-Z] 的大小写)

>>> import pandas as pd
>>> s = pd.DataFrame({ 'a': ["TeSt"] })
>>> r = s.replace(to_replace=r'(?i)test', value=r'TEST', regex=True)
>>> print(r)
      a
0  TEST

(?i) match the remainder of the pattern with the following effective flags: i modifier: insensitive. Case insensitive match (ignores case of [a-zA-Z])

>>> import pandas as pd
>>> s = pd.DataFrame({ 'a': ["TeSt"] })
>>> r = s.replace(to_replace=r'(?i)test', value=r'TEST', regex=True)
>>> print(r)
      a
0  TEST
猫弦 2024-07-20 05:10:12

我建议使用 (?i:string_region_to_ignore_case) 而不是 (?i)。 这种方法允许人们以一种更挑剔但更清晰的方式处理区分大小写的问题。 例如:

rex = re.findall (r'J(?i:ohn) S(?i:mith)',
      "John smith ; JOHN SMITH; john Smith; John Smith")
#Result:
['JOHN SMITH', 'John Smith']

I would recommend using (?i:string_region_to_ignore_case) rather than (?i). This method allows one to deal with case sensitivity in a more picky yet clear manner. For instance:

rex = re.findall (r'J(?i:ohn) S(?i:mith)',
      "John smith ; JOHN SMITH; john Smith; John Smith")
#Result:
['JOHN SMITH', 'John Smith']
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文