Using a regular expression in Python to replace repeated uppercase letters with a single lowercase letter
I am trying to replace every uppercase letter that appears twice in a row in a string with a single lowercase instance of that letter. The regular expression below matches the repeated uppercase letters, but I am unsure how to make the replacement letter lowercase.
import re
s = 'start TT end'
re.sub(r'([A-Z]){2}', r"\1", s)
>>> 'start T end'
How can I make the "\1" lowercase? Should I not be using a regular expression for this?
Comments (7)
Pass a function as the repl argument. The MatchObject is passed to this function, and .group(1) gives the first parenthesized subgroup:
EDIT
And yes, you should use ([A-Z])\1 instead of ([A-Z]){2} so that it does not match e.g. AZ. (See @bobince's answer.)
Gives:
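A minimal sketch of the approach this answer describes, assuming the backreference pattern from the EDIT and a lambda as the repl callable (the lambda and the variable name result are illustrative choices, not taken from the original post):

```python
import re

s = 'start TT end'

# ([A-Z])\1 matches an uppercase letter followed by the same letter again.
# The callable receives the match object and returns the replacement text.
result = re.sub(r'([A-Z])\1', lambda m: m.group(1).lower(), s)
print(result)  # start t end
```

Because the callable runs once per match, any string method can be applied to the captured group, which a plain replacement string cannot do.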
You can't change case in a replacement string. You would need a replacement function:
I guess this is what you are looking for.
You can do it with a regular expression; just pass a function as the replacement, as the docs say. The problem is your pattern.
As it is, your pattern matches runs of any two capital letters. I'll leave the actual pattern to you, but it starts with AA|BB|CC|.
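To make the pattern problem concrete, here is a rough comparison of the question's pattern with two stricter ones. The names loose, strict and doubled are made up for this sketch, and building the AA|BB|CC|... alternation from string.ascii_uppercase is just one possible way to spell it out:

```python
import re
import string

loose = re.compile(r'([A-Z]){2}')   # any two uppercase letters in a row
strict = re.compile(r'([A-Z])\1')   # the same uppercase letter twice

# Spelled-out equivalent of the backreference pattern: AA|BB|CC|...|ZZ
doubled = re.compile('|'.join(c * 2 for c in string.ascii_uppercase))

print(bool(loose.search('start AZ end')))    # True
print(bool(strict.search('start AZ end')))   # False
print(bool(doubled.search('start AZ end')))  # False
print(doubled.sub(lambda m: m.group(0)[0].lower(), 'start TT end'))  # start t end
```

The backreference form is shorter, while the alternation needs no capture group; both reject mixed pairs such as AZ.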
The 'repl' parameter that identifies the replacement can be either a string (as you have it here) or a function. This will do what you wish:
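A small sketch contrasting the two kinds of repl, assuming a named helper (lower_first_group is a hypothetical name) and the ([A-Z])\1 pattern rather than the question's ([A-Z]){2}:

```python
import re

def lower_first_group(match):
    """Return the captured letter in lower case (hypothetical helper)."""
    return match.group(1).lower()

s = 'start TT end'
as_string = re.sub(r'([A-Z])\1', r'\1', s)                # string repl keeps the case
as_function = re.sub(r'([A-Z])\1', lower_first_group, s)  # function repl can change it

print(as_string)    # start T end
print(as_function)  # start t end
```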
Try this:
Note that this doesn't replace single uppercase letters. If you want to do that, use r'([A-Z]){1,}'.
WARNING! This post does not use re as requested. Proceed at your own risk!
I don't know how likely corner cases are, but this is how plain Python handles it with my naive coding.
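For illustration only, one way a regex-free version could look. The function name collapse_doubled_uppercase and the left-to-right scan are assumptions of this sketch, not the original poster's code:

```python
def collapse_doubled_uppercase(text):
    """Replace each pair of identical ASCII uppercase letters with one lowercase letter."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        # Restrict to A-Z so the behaviour mirrors the [A-Z] character class.
        if 'A' <= ch <= 'Z' and i + 1 < len(text) and text[i + 1] == ch:
            out.append(ch.lower())
            i += 2  # skip both letters of the pair
        else:
            out.append(ch)
            i += 1
    return ''.join(out)

print(collapse_doubled_uppercase('start TT end'))  # start t end
```

Like a left-to-right regex substitution, this consumes pairs greedily, so a run of three identical letters such as TTT becomes tT.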