Python regex

Split a string without removing the delimiter

Posted on 2025-02-12 21:00:26 · 287 characters · 2 views · 0 comments

I have the following text,

text = "12345678 abcdefg 37394822 gdzdnhqihdzuiew 09089799 78998728 gdjewdwq"

and I want the output to be:

12345678 abcdefg
37394822 gdzdnhqihdzuiew 
09089799 
78998728 gdjewdwq

I tried re.split("\d{8}", text), but the result is incorrect.
How can I get the correct output?

烦人精 2025-02-19 21:00:26

You can use a lookahead, which checks what follows the split point without consuming it:

Regex Tutorial - Lookahead and Lookbehind Zero-Length Assertions

import re
text = "12345678 abcdefg 37394822 gdzdnhqihdzuiew 09089799 78998728 gdjewdwq"
arr = re.split(r"\s+(?=\d)", text)
print(arr)
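
To see why the original attempt falls short, it may help to compare the two calls side by side. The sketch below assumes the same sample text; the names consumed and kept are illustrative only. re.split discards whatever the pattern matches, so splitting on \d{8} throws the numbers away, whereas the lookahead (?=\d) matches a position without consuming any characters, so only the whitespace is dropped.

```python
import re

text = "12345678 abcdefg 37394822 gdzdnhqihdzuiew 09089799 78998728 gdjewdwq"

# The pattern matches the eight digits themselves, so re.split removes them.
consumed = re.split(r"\d{8}", text)
print(consumed)
# ['', ' abcdefg ', ' gdzdnhqihdzuiew ', ' ', ' gdjewdwq']

# Only the whitespace is matched; the lookahead merely checks that a digit follows.
kept = re.split(r"\s+(?=\d)", text)
print("\n".join(kept))
```

Note that (?=\d) splits before any token that starts with a digit, not only before 8-digit numbers, which is fine for this input but may be too loose for others.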

给我一枪 2025-02-19 21:00:26

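IIUC, you are looking to pair each numeric part with the alphanumeric part that follows it, and the numeric part will always come first on each line.

Not an elegant solution, but it addresses the question:

text = "12345678 abcdefg 37394822 gdzdnhqihdzuiew 09089799 78998728 gdjewdwq"
splitted_txt = text.split(' ')
i = 0
while i < len(splitted_txt):
    # Pair a number with the next token only if that token exists and is not a number.
    has_next = i + 1 < len(splitted_txt)
    if splitted_txt[i].isdigit() and has_next and not splitted_txt[i + 1].isdigit():
        print(splitted_txt[i], splitted_txt[i + 1])
        i += 1  # skip the token that was just printed
    else:
        print(splitted_txt[i])
    i += 1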

12345678 abcdefg
37394822 gdzdnhqihdzuiew
09089799
78998728 gdjewdwq
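
The same idea can also be sketched without manual index bookkeeping. The helper below is a hypothetical variation, not the answer's code: the name pair_number_lines is made up, and unlike the loop above it attaches every non-numeric token that follows a number, not just the first one. For the sample text the result is the same.

```python
def pair_number_lines(text):
    """Group each numeric token with the non-numeric tokens that follow it."""
    groups = []
    for token in text.split():
        if token.isdigit() or not groups:
            # A number (or the very first token) starts a new line.
            groups.append([token])
        else:
            # Anything else is attached to the most recent line.
            groups[-1].append(token)
    return [" ".join(parts) for parts in groups]


text = "12345678 abcdefg 37394822 gdzdnhqihdzuiew 09089799 78998728 gdjewdwq"
lines = pair_number_lines(text)
print("\n".join(lines))
```

Returning a list instead of printing inside the loop keeps the grouping reusable, for example when the lines need further processing.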

凑诗 2025-02-19 21:00:26

I prefer @Itagaki's answer, but it's worth noting that re.findall could also be used:

import re
text = "12345678 abcdefg 37394822 gdzdnhqihdzuiew 09089799 78998728 gdjewdwq"

re.findall(r"\d+(?:\s+[a-z]+)?", text)
  #=> ['12345678 abcdefg', '37394822 gdzdnhqihdzuiew', '09089799', '78998728 gdjewdwq']

The regular expression can be broken down as follows.

\d+       # match one or more digits
(?:       # begin a non-capture group
  \s+     # match one or more whitespaces
  [a-z]+  # match one or more lowercase letters
)         # end non-capture group
?         # make non-capture group optional

If exactly 8 digits were required, and the strings of lowercase letters had to be between (say) 7 and 15 characters long (as in the example), the regex would be modified slightly:

r"\d{8}(?:\s+[a-z]{7,15})?"
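
As a rough illustration, the stricter pattern can be written with re.VERBOSE so that the commented breakdown above becomes the pattern itself. This is only a sketch: the names strict and short_case are made up, and the second input is an invented example showing what happens when the letters are too short.

```python
import re

text = "12345678 abcdefg 37394822 gdzdnhqihdzuiew 09089799 78998728 gdjewdwq"

# re.VERBOSE ignores whitespace in the pattern and allows inline comments.
strict = re.compile(r"""
    \d{8}            # exactly eight digits
    (?:              # begin non-capture group
      \s+            # one or more whitespace characters
      [a-z]{7,15}    # 7 to 15 lowercase letters
    )?               # the group is optional
""", re.VERBOSE)

print(strict.findall(text))
# ['12345678 abcdefg', '37394822 gdzdnhqihdzuiew', '09089799', '78998728 gdjewdwq']

# "abc" is shorter than 7 letters, so the optional group does not match
# and only the number is returned.
short_case = strict.findall("12345678 abc")
print(short_case)
# ['12345678']
```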

握住你手 2025-02-19 21:00:26

If you want to match 8 digits, you can use:

\b\d{8}\b.*?(?=\s*(?:\b\d{8}\b|$))

Explanation

\b\d{8}\b Match 8 digits surrounded by word boundaries to prevent partial matches
.*? Match any char except a newline, as few as possible
(?= Positive lookahead
- \s* Match optional whitespace chars
- (?:\b\d{8}\b|$) Match either 8 digits or assert the end of the string
) Close lookahead

Example

import re

pattern = r"\b\d{8}\b.*?(?=\s*(?:\b\d{8}\b|$))"
s = "12345678 abcdefg 37394822 gdzdnhqihdzuiew 09089799 78998728 gdjewdwq"

print(re.findall(pattern, s))

Output

['12345678 abcdefg', '37394822 gdzdnhqihdzuiew', '09089799', '78998728 gdjewdwq']
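
The word boundaries start to matter once the text contains digits that are not 8-digit numbers. The comparison below is a sketch on a made-up sample string (not from the question); the names by_boundary and by_any_digit are illustrative. It contrasts this pattern with a split on whitespace followed by any digit.

```python
import re

# Made-up sample: "123" and "x9" contain digits but are not 8-digit numbers.
sample = "12345678 abc 123 def 87654321 x9"

# Each match starts at a standalone 8-digit number and runs up to the next one.
by_boundary = re.findall(r"\b\d{8}\b.*?(?=\s*(?:\b\d{8}\b|$))", sample)

# Splitting before any digit also breaks the line at "123".
by_any_digit = re.split(r"\s+(?=\d)", sample)

print(by_boundary)   # ['12345678 abc 123 def', '87654321 x9']
print(by_any_digit)  # ['12345678 abc', '123 def', '87654321 x9']
```

Which behaviour is preferable depends on whether shorter numbers can appear inside a record; for the question's input both approaches give the same four lines.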
