How to split a string with a regex at sequences of uppercase words
Given a text like this:
text= "THE TEXT contains uppercase letter, but ALSO LOWER case ones. This is another sentence."
I want output like this:
['THE TEXT contains uppercase letter, but', 'ALSO LOWER case ones. This is another sentence.']
How can I write a regex to obtain that output?
I tried this regex: "(\b[A-Z][A-Z]+(?:\s+[A-Z][A-Z]+)*\b)"
but the output was different:
[ '',
'THE TEXT',
'contains uppercase letter, but',
'ALSO LOWER',
'case ones. This is another sentence.']
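For reference, output of this shape is what a split on a capturing group produces. The sketch below assumes Python's re module and re.split (the question names neither, so both are assumptions), and strips surrounding whitespace from each piece to match the listing above. The variable names are illustrative only.

```python
import re

text = "THE TEXT contains uppercase letter, but ALSO LOWER case ones. This is another sentence."

# The pattern from the question. The outer parentheses form a capturing group.
attempt = r"(\b[A-Z][A-Z]+(?:\s+[A-Z][A-Z]+)*\b)"

# re.split keeps the text matched by capturing groups as separate list items,
# so every uppercase run shows up as its own element. The leading '' appears
# because the first match starts at position 0.
parts = [piece.strip() for piece in re.split(attempt, text)]

for piece in parts:
    print(repr(piece))
```

So the uppercase runs are found correctly; they are just returned as separate items rather than staying attached to the text that follows them.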
You can match and extract them with
\b[A-Z]{2,}(?:\s+[A-Z]{2,})*\b.*?(?=\s*\b[A-Z]{2}|$)
See the regex demo.
Details:
\b[A-Z]{2,}(?:\s+[A-Z]{2,})*\b
- a word boundary, two or more ASCII uppercase letters, then zero or more repetitions of one or more whitespaces followed by two or more ASCII uppercase letters, and a word boundary
.*?
- any zero or more chars other than line break chars, as few as possible
(?=\s*\b[A-Z]{2}|$)
- a positive lookahead that matches a location immediately followed by zero or more whitespaces, a word boundary and two uppercase letters, or by the end of the string.
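The three parts above can be joined into one pattern and used to extract the chunks directly instead of splitting. This is a minimal sketch, assuming Python's re module and re.findall (the answer does not name the language or the function); the names pattern and chunks are illustrative only.

```python
import re

text = "THE TEXT contains uppercase letter, but ALSO LOWER case ones. This is another sentence."

pattern = (
    r"\b[A-Z]{2,}(?:\s+[A-Z]{2,})*\b"  # a run of uppercase words
    r".*?"                             # following text, as little as possible
    r"(?=\s*\b[A-Z]{2}|$)"             # stop before the next uppercase run or at the end
)

# The pattern has no capturing groups, so findall returns the whole matches.
chunks = re.findall(pattern, text)
print(chunks)
# ['THE TEXT contains uppercase letter, but', 'ALSO LOWER case ones. This is another sentence.']
```

Because the lookahead does not consume the whitespace before the next uppercase run, each chunk ends without a trailing space. If the text can span several lines, passing re.DOTALL would let .*? cross line breaks.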