List comprehension vs map

Published 2024-08-02 03:27:40

Is there a reason to prefer using map() over list comprehension or vice versa? Is either of them generally more efficient or considered generally more Pythonic than the other?

Comments (14)

ヅ她的身影、若隐若现 2024-08-09 03:27:40

map may be microscopically faster in some cases (when you're not making a lambda for the purpose, but using the same function in map and a list comprehension). List comprehensions may be faster in other cases and most (not all) Pythonistas consider them more direct and clearer.

An example of the tiny speed advantage of map when using exactly the same function:

$ python -m timeit -s'xs=range(10)' 'map(hex, xs)'
100000 loops, best of 3: 4.86 usec per loop

$ python -m timeit -s'xs=range(10)' '[hex(x) for x in xs]'
100000 loops, best of 3: 5.58 usec per loop

An example of how performance comparison gets completely reversed when map needs a lambda:

$ python -m timeit -s'xs=range(10)' 'map(lambda x: x+2, xs)'
100000 loops, best of 3: 4.24 usec per loop

$ python -m timeit -s'xs=range(10)' '[x+2 for x in xs]'
100000 loops, best of 3: 2.32 usec per loop
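
Note that these timings are from Python 2, where map returns a list; under Python 3, map is lazy, so a like-for-like comparison has to force the result with list(...). A sketch of the equivalent commands (numbers omitted, since they depend on the interpreter and machine):

$ python3 -m timeit -s'xs=range(10)' 'list(map(hex, xs))'
$ python3 -m timeit -s'xs=range(10)' '[hex(x) for x in xs]'

$ python3 -m timeit -s'xs=range(10)' 'list(map(lambda x: x+2, xs))'
$ python3 -m timeit -s'xs=range(10)' '[x+2 for x in xs]'
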
緦唸λ蓇 2024-08-09 03:27:40

Cases

  • Common case: Almost always, you will want to use a list comprehension in python because it will be more obvious what you're doing to novice programmers reading your code. (This does not apply to other languages, where other idioms may apply.) It will even be more obvious what you're doing to python programmers, since list comprehensions are the de-facto standard in python for iteration; they are expected.
  • Less-common case: However if you already have a function defined, it is often reasonable to use map, though it is considered 'unpythonic'. For example, map(sum, myLists) is more elegant/terse than [sum(x) for x in myLists]. You gain the elegance of not having to make up a dummy variable (e.g. sum(x) for x... or sum(_) for _... or sum(readableName) for readableName...) which you have to type twice, just to iterate. The same argument holds for filter and reduce and anything from the itertools module: if you already have a function handy, you could go ahead and do some functional programming. This gains readability in some situations, and loses it in others (e.g. novice programmers, multiple arguments)... but the readability of your code highly depends on your comments anyway.
  • Almost never: You may want to use the map function as a pure abstract function while doing functional programming, where you're mapping map, or currying map, or otherwise benefit from talking about map as a function. In Haskell for example, a functor interface called fmap generalizes mapping over any data structure. This is very uncommon in python because the python grammar compels you to use generator-style to talk about iteration; you can't generalize it easily. (This is sometimes good and sometimes bad.) You can probably come up with rare python examples where map(f, *lists) is a reasonable thing to do. The closest example I can come up with would be sumEach = partial(map,sum), which is a one-liner that is very roughly equivalent to:
def sumEach(myLists):
    return [sum(_) for _ in myLists]
  • Just using a for-loop: You can also of course just use a for-loop. While not as elegant from a functional-programming viewpoint, sometimes non-local variables make code clearer in imperative programming languages such as python, because people are very used to reading code that way. For-loops are also, generally, the most efficient when you are merely doing any complex operation that is not building a list like list-comprehensions and map are optimized for (e.g. summing, or making a tree, etc.) -- at least efficient in terms of memory (not necessarily in terms of time, where I'd expect at worst a constant factor, barring some rare pathological garbage-collection hiccuping).

"Pythonism"

I dislike the word "pythonic" because I don't find that pythonic is always elegant in my eyes. Nevertheless, map and filter and similar functions (like the very useful itertools module) are probably considered unpythonic in terms of style.

Laziness

In terms of efficiency, like most functional programming constructs, MAP CAN BE LAZY, and in fact is lazy in python. That means you can do this (in python3) and your computer will not run out of memory and lose all your unsaved data:

>>> map(str, range(10**100))
<map object at 0x2201d50>

Try doing that with a list comprehension:

>>> [str(n) for n in range(10**100)]
# DO NOT TRY THIS AT HOME OR YOU WILL BE SAD #

Do note that list comprehensions are also inherently lazy, but python has chosen to implement them as non-lazy. Nevertheless, python does support lazy list comprehensions in the form of generator expressions, as follows:

>>> (str(n) for n in range(10**100))
<generator object <genexpr> at 0xacbdef>

You can basically think of the [...] syntax as passing in a generator expression to the list constructor, like list(x for x in range(5)).
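
A one-line sanity check of that mental model (the results match, even though CPython compiles the two forms differently, as the dis output further down shows):

>>> [x for x in range(5)] == list(x for x in range(5)) == [0, 1, 2, 3, 4]
True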

Brief contrived example

from operator import neg
print({x:x**2 for x in map(neg,range(5))})

print({x:x**2 for x in [-y for y in range(5)]})

print({x:x**2 for x in (-y for y in range(5))})
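
Each of the three lines above should print the same dictionary; a sketch of the expected output (insertion-ordered, as in Python 3.7+):

{0: 0, -1: 1, -2: 4, -3: 9, -4: 16}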

List comprehensions are non-lazy, so may require more memory (unless you use generator comprehensions). The square brackets [...] often make things obvious, especially when in a mess of parentheses. On the other hand, sometimes you end up being verbose like typing [x for x in.... As long as you keep your iterator variables short, list comprehensions are usually clearer if you don't indent your code. But you could always indent your code.

print(
    {x:x**2 for x in (-y for y in range(5))}
)

or break things up:

rangeNeg5 = (-y for y in range(5))
print(
    {x:x**2 for x in rangeNeg5}
)

Efficiency comparison for python3

map is now lazy:

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=map(f,xs)'
1000000 loops, best of 3: 0.336 usec per loop            ^^^^^^^^^

Therefore if you will not be using all your data, or do not know ahead of time how much data you need, map in python3 (and generator expressions in python2 or python3) will avoid calculating their values until the last moment necessary. This will usually outweigh any overhead from using map. The downside is that this is very limited in python as opposed to most functional languages: you only get this benefit if you access your data left-to-right "in order", because python generator expressions can only be evaluated in the order x[0], x[1], x[2], ....
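
A minimal sketch of that benefit, using itertools.islice to pull just a few values out of a lazy map without ever computing the rest:

>>> from itertools import islice
>>> lazy = map(str, range(10**100))  # nothing has been computed yet
>>> list(islice(lazy, 3))            # only three values are ever produced
['0', '1', '2']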

However let's say that we have a pre-made function f we'd like to map, and we ignore the laziness of map by immediately forcing evaluation with list(...). We get some very interesting results:

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(map(f,xs))'                                                                                                                                                
10000 loops, best of 3: 165/124/135 usec per loop        ^^^^^^^^^^^^^^^
                    for list(<map object>)

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=[f(x) for x in xs]'                                                                                                                                      
10000 loops, best of 3: 181/118/123 usec per loop        ^^^^^^^^^^^^^^^^^^
                    for list(<generator>), probably optimized

% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(f(x) for x in xs)'                                                                                                                                    
1000 loops, best of 3: 215/150/150 usec per loop         ^^^^^^^^^^^^^^^^^^^^^^
                    for list(<generator>)

The results are in the form AAA/BBB/CCC, where A was performed on a circa-2010 Intel workstation with python 3.?.?, and B and C were performed on a circa-2013 AMD workstation with python 3.2.1, with extremely different hardware. The result seems to be that map and list comprehensions are comparable in performance, and that the comparison is most strongly affected by other random factors. The only thing we can tell seems to be that, oddly, while we expect list comprehensions [...] to perform better than generator expressions (...), map is ALSO more efficient than generator expressions (again assuming that all values are evaluated/used).

It is important to realize that these tests assume a very simple function (the identity function); however this is fine because if the function were complicated, then performance overhead would be negligible compared to other factors in the program. (It may still be interesting to test with other simple things like f=lambda x:x+x)
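
A sketch of the commands for that follow-up test (results omitted; here f is moved into the setup string so that only the mapping itself is timed):

% python3 -mtimeit -s 'xs=range(1000); f=lambda x:x+x' 'z=list(map(f,xs))'
% python3 -mtimeit -s 'xs=range(1000); f=lambda x:x+x' 'z=[f(x) for x in xs]'
% python3 -mtimeit -s 'xs=range(1000); f=lambda x:x+x' 'z=list(f(x) for x in xs)'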

If you're skilled at reading python assembly, you can use the dis module to see if that's actually what's going on behind the scenes:

>>> listComp = compile('[f(x) for x in xs]', 'listComp', 'eval')
>>> dis.dis(listComp)
  1           0 LOAD_CONST               0 (<code object <listcomp> at 0x2511a48, file "listComp", line 1>) 
              3 MAKE_FUNCTION            0 
              6 LOAD_NAME                0 (xs) 
              9 GET_ITER             
             10 CALL_FUNCTION            1 
             13 RETURN_VALUE         
>>> listComp.co_consts
(<code object <listcomp> at 0x2511a48, file "listComp", line 1>,)
>>> dis.dis(listComp.co_consts[0])
  1           0 BUILD_LIST               0 
              3 LOAD_FAST                0 (.0) 
        >>    6 FOR_ITER                18 (to 27) 
              9 STORE_FAST               1 (x) 
             12 LOAD_GLOBAL              0 (f) 
             15 LOAD_FAST                1 (x) 
             18 CALL_FUNCTION            1 
             21 LIST_APPEND              2 
             24 JUMP_ABSOLUTE            6 
        >>   27 RETURN_VALUE

 

>>> listComp2 = compile('list(f(x) for x in xs)', 'listComp2', 'eval')
>>> dis.dis(listComp2)
  1           0 LOAD_NAME                0 (list) 
              3 LOAD_CONST               0 (<code object <genexpr> at 0x255bc68, file "listComp2", line 1>) 
              6 MAKE_FUNCTION            0 
              9 LOAD_NAME                1 (xs) 
             12 GET_ITER             
             13 CALL_FUNCTION            1 
             16 CALL_FUNCTION            1 
             19 RETURN_VALUE         
>>> listComp2.co_consts
(<code object <genexpr> at 0x255bc68, file "listComp2", line 1>,)
>>> dis.dis(listComp2.co_consts[0])
  1           0 LOAD_FAST                0 (.0) 
        >>    3 FOR_ITER                17 (to 23) 
              6 STORE_FAST               1 (x) 
              9 LOAD_GLOBAL              0 (f) 
             12 LOAD_FAST                1 (x) 
             15 CALL_FUNCTION            1 
             18 YIELD_VALUE          
             19 POP_TOP              
             20 JUMP_ABSOLUTE            3 
        >>   23 LOAD_CONST               0 (None) 
             26 RETURN_VALUE

 

>>> evalledMap = compile('list(map(f,xs))', 'evalledMap', 'eval')
>>> dis.dis(evalledMap)
  1           0 LOAD_NAME                0 (list) 
              3 LOAD_NAME                1 (map) 
              6 LOAD_NAME                2 (f) 
              9 LOAD_NAME                3 (xs) 
             12 CALL_FUNCTION            2 
             15 CALL_FUNCTION            1 
             18 RETURN_VALUE 

It seems it is better to use [...] syntax than list(...). Sadly the map class is a bit opaque to disassembly, but we can make do with our speed test.

攀登最高峰 2024-08-09 03:27:40

Python 2: You should use map and filter instead of list comprehensions.

An objective reason why you should prefer them even though they're not "Pythonic" is this:
They require functions/lambdas as arguments, which introduce a new scope.

I've gotten bitten by this more than once:

for x, y in somePoints:
    # (several lines of code here)
    squared = [x ** 2 for x in numbers]
    # Oops, x was silently overwritten!

but if instead I had said:

for x, y in somePoints:
    # (several lines of code here)
    squared = map(lambda x: x ** 2, numbers)

then everything would've been fine.

You could say I was being silly for using the same variable name in the same scope.

I wasn't. The code was fine originally -- the two xs weren't in the same scope.
It was only after I moved the inner block to a different section of the code that the problem came up (read: problem during maintenance, not development), and I didn't expect it.

Yes, if you never make this mistake then list comprehensions are more elegant.
But from personal experience (and from seeing others make the same mistake) I've seen it happen enough times that I think it's not worth the pain you have to go through when these bugs creep into your code.

Conclusion:

Use map and filter. They prevent subtle hard-to-diagnose scope-related bugs.

Side note:

Don't forget to consider using imap and ifilter (in itertools) if they are appropriate for your situation!
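
A small sketch of how this plays out across versions (in Python 3 a comprehension gets its own scope, so the silent overwrite described above is specific to Python 2):

x = 5
squared = [x ** 2 for x in range(3)]
print(x)  # Python 2 prints 2 (x was overwritten); Python 3 still prints 5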

晨曦慕雪 2024-08-09 03:27:40

Actually, map and list comprehensions behave quite differently in the Python 3 language. Take a look at the following Python 3 program:

def square(x):
    return x*x
squares = map(square, [1, 2, 3])
print(list(squares))
print(list(squares))

You might expect it to print the line "[1, 4, 9]" twice, but instead it prints "[1, 4, 9]" followed by "[]". The first time you look at squares it seems to behave as a sequence of three elements, but the second time as an empty one.

In the Python 2 language map returns a plain old list, just like list comprehensions do in both languages. The crux is that the return value of map in Python 3 (and imap in Python 2) is not a list - it's an iterator!

The elements are consumed when you iterate over an iterator unlike when you iterate over a list. This is why squares looks empty in the last print(list(squares)) line.

To summarize:

  • When dealing with iterators you have to remember that they are stateful and that they mutate as you traverse them.
  • Lists are more predictable since they only change when you explicitly mutate them; they are containers.
  • And a bonus: numbers, strings, and tuples are even more predictable since they cannot change at all; they are values.
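
If the values need to be traversed more than once, one option (a small sketch, reusing the square function defined above) is to materialize the map into a list up front:

squares = list(map(square, [1, 2, 3]))
print(squares)  # [1, 4, 9]
print(squares)  # [1, 4, 9] -- a list can be traversed repeatedly
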
[旋木] 2024-08-09 03:27:40

Here is one possible case:

map(lambda op1,op2: op1*op2, list1, list2)

versus:

[op1*op2 for op1,op2 in zip(list1,list2)]

I am guessing the zip() is an unfortunate and unnecessary overhead you need to indulge in if you insist on using list comprehensions instead of map. It would be great if someone could clarify this, either affirmatively or negatively.
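
A small runnable sketch of the two forms side by side (Python 3, so map is wrapped in list(); whether zip's overhead matters in practice is left open, as above):

list1 = [1, 2, 3]
list2 = [4, 5, 6]

via_map = list(map(lambda op1, op2: op1 * op2, list1, list2))  # map accepts several iterables directly
via_comp = [op1 * op2 for op1, op2 in zip(list1, list2)]       # the comprehension pairs them with zip

assert via_map == via_comp == [4, 10, 18]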

π浅易 2024-08-09 03:27:40

If you plan on writing any asynchronous, parallel, or distributed code, you will probably prefer map over a list comprehension -- as most asynchronous, parallel, or distributed packages provide a map function to overload python's map. Then by passing the appropriate map function to the rest of your code, you may not have to modify your original serial code to have it run in parallel (etc).
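
A minimal sketch of that drop-in pattern with the standard library's multiprocessing.Pool (the names work and xs are made up for illustration; the worker must be a top-level, picklable function):

from multiprocessing import Pool

def work(x):
    return x * x

if __name__ == "__main__":
    xs = range(10)

    serial = list(map(work, xs))      # built-in map

    with Pool() as pool:              # same call shape, but runs across processes
        parallel = pool.map(work, xs)

    assert serial == parallel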

守护在此方 2024-08-09 03:27:40

I find list comprehensions are generally more expressive of what I'm trying to do than map - they both get it done, but the former saves the mental load of trying to understand what could be a complex lambda expression.

There's also an interview out there somewhere (I can't find it offhand) where Guido lists lambdas and the functional functions as the thing he most regrets about accepting into Python, so you could make the argument that they're un-Pythonic by virtue of that.

腻橙味 2024-08-09 03:27:40

Since Python 3, map() returns an iterator, so you need to keep in mind what you need: an iterator or a list object.

As @AlexMartelli already mentioned, map() is faster than a list comprehension only if you don't use a lambda function.

I will present some time comparisons.

Python 3.5.2 and CPython.
I've used a Jupyter notebook and especially the %timeit built-in magic command.

Measurements: 1 s == 1000 ms == 1000 * 1000 µs == 1000 * 1000 * 1000 ns

Setup:

x_list = [(i, i+1, i+2, i*2, i-9) for i in range(1000)]
i_list = list(range(1000))

Built-in function:

%timeit map(sum, x_list)  # creating iterator object
# Output: The slowest run took 9.91 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 277 ns per loop

%timeit list(map(sum, x_list))  # creating list with map
# Output: 1000 loops, best of 3: 214 µs per loop

%timeit [sum(x) for x in x_list]  # creating list with list comprehension
# Output: 1000 loops, best of 3: 290 µs per loop

lambda function:

%timeit map(lambda i: i+1, i_list)
# Output: The slowest run took 8.64 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 325 ns per loop

%timeit list(map(lambda i: i+1, i_list))
# Output: 1000 loops, best of 3: 183 µs per loop

%timeit [i+1 for i in i_list]
# Output: 10000 loops, best of 3: 84.2 µs per loop

There is also such a thing as a generator expression, see PEP-0289. So I thought it would be useful to add it to the comparison.

%timeit (sum(i) for i in x_list)
# Output: The slowest run took 6.66 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 495 ns per loop

%timeit list((sum(x) for x in x_list))
# Output: 1000 loops, best of 3: 319 µs per loop

%timeit (i+1 for i in i_list)
# Output: The slowest run took 6.83 times longer than the fastest. 
# This could mean that an intermediate result is being cached.
# 1000000 loops, best of 3: 506 ns per loop

%timeit list((i+1 for i in i_list))
# Output: 10000 loops, best of 3: 125 µs per loop

You need a list object:

Use a list comprehension if it's a custom function; use list(map()) if there is a builtin function.

You don't need a list object, you just need an iterable one:

Always use map()!

断念 2024-08-09 03:27:40

I ran a quick test comparing three methods for invoking the method of an object. The time difference, in this case, is negligible and is a matter of the function in question (see @Alex Martelli's response). Here, I looked at the following methods:

# map_lambda
list(map(lambda x: x.add(), vals))

# map_operator
from operator import methodcaller
list(map(methodcaller("add"), vals))

# map_comprehension
[x.add() for x in vals]

I looked at lists (stored in the variable vals) of both integers (Python int) and floating point numbers (Python float) for increasing list sizes. The following dummy class DummyNum is considered:

class DummyNum(object):
    """Dummy class"""
    __slots__ = 'n',

    def __init__(self, n):
        self.n = n

    def add(self):
        self.n += 5

Specifically, the add method. The __slots__ attribute is a simple optimization in Python to define the total memory needed by the class (attributes), reducing memory size.
Here are the resulting plots.

[Plot: performance of mapping Python object methods]

As stated previously, the technique used makes a minimal difference and you should code in a way that is most readable to you, or in the particular circumstance. In this case, the list comprehension (map_comprehension technique) is fastest for both types of additions in an object, especially with shorter lists.

Visit this pastebin for the source used to generate the plot and data.
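
A sketch of how the timing loop itself might be driven with timeit (the list size and repeat counts are assumptions, not the exact harness from the pastebin; DummyNum is the class defined above):

import timeit
from operator import methodcaller

setup = "vals = [DummyNum(i) for i in range(10_000)]"

for label, stmt in [
    ("map_lambda",        "list(map(lambda x: x.add(), vals))"),
    ("map_operator",      'list(map(methodcaller("add"), vals))'),
    ("map_comprehension", "[x.add() for x in vals]"),
]:
    best = min(timeit.repeat(stmt, setup=setup, globals=globals(),
                             number=100, repeat=3))
    print(f"{label}: {best:.4f} s per 100 runs")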

霓裳挽歌倾城醉 2024-08-09 03:27:40

I timed some of the results with perfplot (a project of mine).

As others have noted, map really only returns an iterator so it's a constant-time operation. When realizing the iterator by list(), it's on par with list comprehensions. Depending on the expression, either one might have a slight edge but it's hardly significant.

Note that arithmetic operations like x ** 2 are much faster in NumPy, especially if the input data is already a NumPy array.

hex:

[Plot: timings for hex]

x ** 2:

[Plot: timings for x ** 2]


Code to reproduce the plots:

import perfplot


def standalone_map(data):
    return map(hex, data)


def list_map(data):
    return list(map(hex, data))


def comprehension(data):
    return [hex(x) for x in data]


b = perfplot.bench(
    setup=lambda n: list(range(n)),
    kernels=[standalone_map, list_map, comprehension],
    n_range=[2 ** k for k in range(20)],
    equality_check=None,
)
b.save("out.png")
b.show()
import perfplot
import numpy as np


def standalone_map(data):
    return map(lambda x: x ** 2, data[0])


def list_map(data):
    return list(map(lambda x: x ** 2, data[0]))


def comprehension(data):
    return [x ** 2 for x in data[0]]


def numpy_asarray(data):
    return np.asarray(data[0]) ** 2


def numpy_direct(data):
    return data[1] ** 2


b = perfplot.bench(
    setup=lambda n: (list(range(n)), np.arange(n)),
    kernels=[standalone_map, list_map, comprehension, numpy_direct, numpy_asarray],
    n_range=[2 ** k for k in range(20)],
    equality_check=None,
)
b.save("out2.png")
b.show()
泼猴你往哪里跑 2024-08-09 03:27:40

I tried the code by @alex-martelli but found some discrepancies

python -mtimeit -s "xs=range(123456)" "map(hex, xs)"
1000000 loops, best of 5: 218 nsec per loop

python -mtimeit -s "xs=range(123456)" "[hex(x) for x in xs]"
10 loops, best of 5: 19.4 msec per loop

map takes the same amount of time even for very large ranges, while using a list comprehension takes a lot of time, as is evident from my code. So apart from being considered "unpythonic", I have not faced any performance issues relating to the usage of map.

时光磨忆 2024-08-09 03:27:40

Performance measurement

[Plot: list comprehension vs. map(), processing 1 million records]

Image Source: Experfy

You can see for yourself which is better between a list comprehension and the map function.

(A list comprehension takes less time to process 1 million records when compared to the map function.)

め七分饶幸 2024-08-09 03:27:40

I consider that the most Pythonic way is to use a list comprehension instead of map and filter. The reason is that list comprehensions are clearer than map and filter.

In [1]: odd_cubes = [x ** 3 for x in range(10) if x % 2 == 1] # using a list comprehension

In [2]: odd_cubes_alt = list(map(lambda x: x ** 3, filter(lambda x: x % 2 == 1, range(10)))) # using map and filter

In [3]: odd_cubes == odd_cubes_alt
Out[3]: True

As you can see, a comprehension does not require the extra lambda expressions that map needs. Furthermore, a comprehension also allows filtering easily, while map requires filter to allow filtering.

在你怀里撒娇 2024-08-09 03:27:40

My use case:

def sum_items(*args):
    return sum(args)


list_a = [1, 2, 3]
list_b = [1, 2, 3]

list_of_sums = list(map(sum_items,
                        list_a, list_b))
>>> [3, 6, 9]

comprehension = [sum(items) for items in iter(zip(list_a, list_b))]

I found myself starting to use map more. I thought map could be slower than a comprehension due to passing and returning arguments, which is why I found this post.

I believe using map could be much more readable and flexible, especially when I need to construct the values of the list.

If you use map, you actually understand what the code does when you read it.

def pair_list_items(*args):
    return args

packed_list = list(map(pair_list_items,
                       lista, *listb, listc.....listn))

Plus the flexibility bonus.
And thanks for all the other answers, plus the performance bonus.
