How to split a string at runs of uppercase words with a regex

Posted 2025-01-28 02:24:47 · 518 words · 1 view · 0 comments

Given a text like this:

text= "THE TEXT contains uppercase letter, but ALSO LOWER case ones. This is another sentence."

I want output like this:

['THE TEXT contains uppercase letter, but', 'ALSO LOWER case ones. This is another sentence.']

How can I write a regex to obtain that output?

I tried this regex, "(\b[A-Z][A-Z]+(?:\s+[A-Z][A-Z]+)*\b)", but the output was different:

[ '',
 'THE TEXT',
 'contains uppercase letter, but',
 'ALSO LOWER',
  'case ones. This is another sentence.']
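
The question does not say which function produced that list. A minimal sketch, assuming the pattern was passed to re.split and the pieces were then stripped of surrounding whitespace: because the pattern is wrapped in a capturing group, re.split returns the matched uppercase runs as separate items alongside the text between them, and the leading empty string appears because the first match starts at position 0.

```python
import re

text = ("THE TEXT contains uppercase letter, but ALSO LOWER case ones. "
        "This is another sentence.")

# The pattern from the question, including its outer capturing group.
pattern = r"(\b[A-Z][A-Z]+(?:\s+[A-Z][A-Z]+)*\b)"

# Assumption: re.split was used. With a capturing group in the pattern,
# the captured separators are included in the result list.
parts = re.split(pattern, text)

# Assumption: whitespace around each piece was stripped afterwards.
stripped = [p.strip() for p in parts]
print(stripped)
```

Under these assumptions the uppercase runs act as separators rather than as the start of each chunk, which is why they end up as their own list items.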

Comments (1)

情深如许 2025-02-04 02:24:47

You can match and extract the chunks with:

re.findall(r'\b[A-Z]{2,}(?:\s+[A-Z]{2,})*\b.*?(?=\s*\b[A-Z]{2}|$)', text, re.DOTALL)

Details:

  • \b[A-Z]{2,}(?:\s+[A-Z]{2,})*\b - a word boundary, two or more ASCII uppercase letters, then zero or more repetitions of (one or more whitespace characters followed by two or more ASCII uppercase letters), and a word boundary
  • .*? - zero or more of any character, as few as possible (re.DOTALL lets the dot match newlines too)
  • (?=\s*\b[A-Z]{2}|$) - a positive lookahead that matches a location immediately followed by zero or more whitespace characters, a word boundary and two ASCII uppercase letters, or by the end of the string
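
As a quick check, the pattern above can be run against the sample text from the question. This is only a sketch on that one input; the variable names are illustrative. The pattern has no capturing groups, so re.findall returns the full text of each match.

```python
import re

text = ("THE TEXT contains uppercase letter, but ALSO LOWER case ones. "
        "This is another sentence.")

# Each match starts at a run of uppercase words and lazily extends
# until just before the next word that begins with two uppercase
# letters, or until the end of the string.
pattern = r'\b[A-Z]{2,}(?:\s+[A-Z]{2,})*\b.*?(?=\s*\b[A-Z]{2}|$)'

chunks = re.findall(pattern, text, re.DOTALL)
print(chunks)
# ['THE TEXT contains uppercase letter, but',
#  'ALSO LOWER case ones. This is another sentence.']
```

Note that "This" does not end the second chunk, because the lookahead requires two consecutive uppercase letters.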