Dynamic named groups in Python regular expressions

Posted on 2024-08-17 18:23:33

Is there a way to dynamically update the name of regex groups in Python?

For example, if the text is:

person 1: name1
person 2: name2
person 3: name3
...
person N: nameN

How would you name groups 'person1', 'person2', 'person3', ..., and 'personN' without knowing beforehand how many people there are?

所谓喜欢 2024-08-24 18:23:33

No, but you can do something like this:

>>> import re
>>> p = re.compile('(?m)^(.*?)\\s*:\\s*(.*)$')
>>> text = '''person 1: name1
person 2: name2
person 3: name3
...
person N: nameN'''
>>> p.findall(text)

output:

[('person 1', 'name1'), ('person 2', 'name2'), ('person 3', 'name3'), ('person N', 'nameN')]

A quick explanation:

(?m)     # enable multi-line mode
^        # match the start of a new line
(.*?)    # un-greedily match zero or more chars and store it in match group 1
\s*:\s*  # match a colon possibly surrounded by space chars
(.*)     # match the rest of the line and store it in match group 2
$        # match the end of the line

References

multi-line mode: http://www.regular-expressions.info/modifiers.html
greedy/ungreedy matching: http://www.regular-expressions.info/repeat.html
match groups: http://www.regular-expressions.info/brackets.html
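
If the goal is really to look values up under keys such as 'person1' or 'personN', one option is to keep the pattern fixed and build the names afterwards as dictionary keys. The following is only a sketch under that assumption; the variable names `pattern` and `people` and the stricter `person\s+(\w+)` prefix are illustrative and not part of the answer above.

```python
import re

text = """person 1: name1
person 2: name2
person 3: name3
person N: nameN"""

# A fixed pattern with two groups: the label after "person" and the name.
pattern = re.compile(r"(?m)^person\s+(\w+)\s*:\s*(.*)$")

# The "dynamic group names" become dictionary keys built from each match.
people = {f"person{label}": name for label, name in pattern.findall(text)}

print(people["person2"])
```

The dictionary grows with the number of matching lines, so the number of people does not need to be known in advance.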

蹲墙角沉默 2024-08-24 18:23:33

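Named capture groups and numbered groups (\1, \2, etc.) cannot be dynamic, but you can achieve the same effect with findall:

re.findall(pattern, string[, flags])
Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.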
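
To make the quoted behaviour concrete, here is a small sketch showing how the shape of the `re.findall` result depends on the number of groups in the pattern. The sample text and variable names are made up for illustration.

```python
import re

text = "person 1: name1\nperson 2: name2"

# No groups in the pattern: a list of whole-match strings.
no_groups = re.findall(r"name\d+", text)

# Exactly one group: a list of that group's strings.
one_group = re.findall(r"person (\d+)", text)

# More than one group: a list of tuples, one tuple per match.
two_groups = re.findall(r"person (\d+): (\w+)", text)

print(no_groups, one_group, two_groups, sep="\n")
```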

白云不回头 2024-08-24 18:23:33

Judging from your accepted answer, there's no need for a regex:

p="""
person 1: name1
person 2: name2
person 3: name3
person N: nameN
"""

# Split each non-empty line on the colon (print call uses Python 3 syntax).
ARR = []
for item in p.split("\n"):
    if item:
        s = item.split(":")
        ARR.append(s)
print(ARR)

output

$ ./python.py
[['person 1', ' name1'], ['person 2', ' name2'], ['person 3', ' name3'], ['person N', ' nameN']]
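
Note that the split above leaves a leading space in each name (' name1'). A possible variant, sketched here with illustrative variable names, uses `str.partition` and `str.strip` so that lines without a colon are skipped and the surrounding whitespace is removed.

```python
text = """
person 1: name1
person 2: name2
person 3: name3
person N: nameN
"""

pairs = []
for line in text.splitlines():
    # partition returns (before, separator, after); separator is '' if no colon.
    key, sep, value = line.partition(":")
    if sep:  # skip blank lines and lines without a colon
        pairs.append((key.strip(), value.strip()))

print(pairs)
```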

前事休说 2024-08-24 18:23:33

Regexes in Python (and I'm pretty certain this is true of regexes in general) don't allow for an arbitrary number of capture groups: the set of groups is fixed by the pattern itself. You can either capture a repeated match in its entirety (by placing capturing parentheses around a repeated group) or capture only the last match in a series of matches (by repeating a capturing group). This holds whether the capturing groups are named or numbered.

You need to do this programmatically by iterating over all matches in a string, for example:

for match in re.findall(pattern, string):
    do_something(match)
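
The difference between repeating a capturing group and iterating over matches can be sketched as follows. The sample text, the group names `index` and `name`, and the variable names are assumptions chosen for illustration; `re.finditer` is used here in place of `re.findall` so that the named groups stay accessible on each match object.

```python
import re

text = "person 1: name1\nperson 2: name2\nperson 3: name3"

# Repeating a capturing group: the group keeps only its last repetition.
repeated = re.match(r"(?:person \d+: (?P<name>\w+)\n?)+", text)
last_only = repeated.group("name")

# Iterating instead: one match object per line, all sharing the same
# fixed group names, collected into a list of dictionaries.
line_pattern = re.compile(r"^person (?P<index>\d+): (?P<name>\w+)$", re.MULTILINE)
records = [m.groupdict() for m in line_pattern.finditer(text)]

print(last_only)
print(records)
```

The list of records can be as long as the input requires, while the pattern still declares only two named groups.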
