List filtering and transformation

Posted on 2024-08-09 23:37:26

I have a list of library filenames that I need to filter against a regular expression, and then extract the version number from those that match. This is the obvious way to do that:

import re

libs = ['libIce.so.33', 'libIce.so.3.3.1', 'libIce.so.32', 'libIce.so.3.2.0']
versions = []
# Raw string with the dot in "libIce.so" escaped so it matches a literal dot
regex = re.compile(r'libIce\.so\.([0-9]+\.[0-9]+\.[0-9]+)')
for l in libs:
    m = regex.match(l)
    if m:
        versions.append(m.group(1))

That produces the following list:

['3.3.1', '3.2.0']

Yet I feel that the loop is not very Pythonic, and that it should be possible to replace the 'for' loop above with some smart one-liner.
Suggestions?

Comments (8)

絕版丫頭 2024-08-16 23:37:26

How about a list comprehension?

In [5]: versions = [m.group(1) for m in [regex.match(lib) for lib in libs] if m] 
In [6]: versions
Out[6]: ['3.3.1', '3.2.0']

酒几许 2024-08-16 23:37:26

One more one-liner, just to show another way (I've also cleaned up the regexp a bit):

regex = re.compile(r'^libIce\.so\.([0-9]+\.[0-9]+\.[0-9]+)$')
sum(map(regex.findall, libs), [])

But note that your original version is more readable than any of the suggestions. Is it worth changing?
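
The `sum(..., [])` idiom works because `findall` returns a list of the captured groups for each filename (an empty list when nothing matches), and `sum` concatenates those lists. A minimal sketch of the same idea using `itertools.chain.from_iterable` to do the flattening; the `libs` data comes from the question, and the name `per_name` is only illustrative:

```python
import re
from itertools import chain

libs = ['libIce.so.33', 'libIce.so.3.3.1', 'libIce.so.32', 'libIce.so.3.2.0']
regex = re.compile(r'^libIce\.so\.([0-9]+\.[0-9]+\.[0-9]+)$')

# findall gives [] for a non-matching name and ['x.y.z'] for a matching one,
# so flattening the per-name results leaves only the versions.
per_name = [regex.findall(lib) for lib in libs]
versions = list(chain.from_iterable(per_name))
print(versions)
```

For a list this small the difference is cosmetic; `chain.from_iterable` mainly avoids the repeated list copying that `sum` performs on long inputs.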

涙—继续流 2024-08-16 23:37:26

You could do this:

versions = [m.group(1) for m in [regex.match(l) for l in libs] if m]

I don't think it's very readable, though...

Maybe it's clearer done in two steps:

matches = [regex.match(l) for l in libs]
versions = [m.group(1) for m in matches if m]

你没皮卡萌 2024-08-16 23:37:26

There's nothing unpythonic about using a standard for loop. However, you can use the map() function to produce new values by applying a function to each item in the list. In Python 3, map() returns a lazy iterator, so wrap it in list() if you need an actual list.
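
A minimal sketch of the map() approach applied to the question's data, assuming Python 3. The pairing with `filter(None, ...)` is one way to drop the non-matching entries, not something the answer prescribes:

```python
import re

libs = ['libIce.so.33', 'libIce.so.3.3.1', 'libIce.so.32', 'libIce.so.3.2.0']
regex = re.compile(r'libIce\.so\.([0-9]+\.[0-9]+\.[0-9]+)')

# map() applies regex.match to every filename;
# filter(None, ...) discards the None results from names that did not match.
matches = filter(None, map(regex.match, libs))
versions = [m.group(1) for m in matches]
print(versions)
```

Both `map` and `filter` are lazy here, so nothing is matched until the list comprehension consumes them.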

南渊 2024-08-16 23:37:26

You don't really need to bother with a regex for a case this simple:

>>> libs = ['libIce.so.33', 'libIce.so.3.3.1', 'libIce.so.32', 'libIce.so.3.2.0']
>>> libs
['libIce.so.33', 'libIce.so.3.3.1', 'libIce.so.32', 'libIce.so.3.2.0']
>>> for i in libs:
...   print(i.split("so."))
...
['libIce.', '33']
['libIce.', '3.3.1']
['libIce.', '32']
['libIce.', '3.2.0']
>>> for i in libs:
...   print(i.split("so.")[-1])
...
33
3.3.1
32
3.2.0
>>>

Then do a further check to keep only the entries that contain dots.
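
One possible form of that further check, as a sketch only. The helper `looks_like_version` is a hypothetical name, and the rule "exactly three numeric parts" is an assumption chosen to mirror the regex in the question. Note that this split-based approach does not verify the `libIce` prefix.

```python
libs = ['libIce.so.33', 'libIce.so.3.3.1', 'libIce.so.32', 'libIce.so.3.2.0']

def looks_like_version(text):
    # Assumed rule: three dot-separated parts, each made only of digits.
    parts = text.split(".")
    return len(parts) == 3 and all(p.isdigit() for p in parts)

# Take whatever follows "so." in each filename, then keep the dotted versions.
suffixes = [lib.split("so.")[-1] for lib in libs]
versions = [s for s in suffixes if looks_like_version(s)]
print(versions)
```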

遮了一弯 2024-08-16 23:37:26

How about this one:

import re

def matches(regexp, strings):
    'Regexp, [str] -> Iterable(Match or None)'
    # Lazily yield one match result (or None) per input string.
    return (regexp.match(s) for s in strings)

libs = ['libIce.so.33', 'libIce.so.3.3.1', 'libIce.so.32', 'libIce.so.3.2.0']
regexp = re.compile(r'libIce\.so\.([0-9]+\.[0-9]+\.[0-9]+)')
versions = [m.group(1) for m in matches(regexp, libs) if m is not None]

>>> print(versions)
['3.3.1', '3.2.0']

盗梦空间 2024-08-16 23:37:26

One way I could think of is to combine 'map' with a list comprehension.
The solution looks like this:

import re

libs = ['libIce.so.33', 'libIce.so.3.3.1', 'libIce.so.32', 'libIce.so.3.2.0']

regex = re.compile(r'libIce\.so\.([0-9]+\.[0-9]+\.[0-9]+)')

def match(s):
    # Return the captured version string, or None when s does not match.
    m = regex.match(s)
    if m:
        return m.group(1)
    return None

versions = [x for x in map(match, libs) if x]

暮年 2024-08-16 23:37:26

Starting with Python 3.8 and the introduction of assignment expressions (PEP 572, the := operator), it's possible to use a local variable within a list comprehension to avoid evaluating the regex match twice:

# libs = ['libIce.so.33', 'libIce.so.3.3.1', 'libIce.so.32', 'libIce.so.3.2.0']
# pattern = re.compile(r'libIce\.so\.([0-9]+\.[0-9]+\.[0-9]+)')
[match.group(1) for lib in libs if (match := pattern.match(lib))]
# ['3.3.1', '3.2.0']

This:

  • Binds the result of pattern.match(lib) to the variable match (which is either None or a re.Match object)
  • Uses that named expression in place as the filter condition, so non-matching elements (None) are dropped
  • And reuses match in the mapped value by extracting the first group (match.group(1)).