List comprehension vs. map
Is there a reason to prefer using map() over list comprehension or vice versa? Is either of them generally more efficient or considered generally more Pythonic than the other?
map may be microscopically faster in some cases (when you're not making a lambda for the purpose, but using the same function in map and a list comprehension). List comprehensions may be faster in other cases and most (not all) Pythonistas consider them more direct and clearer.
An example of the tiny speed advantage of map when using exactly the same function:
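A rough way to check this yourself, sketched with timeit. The choice of hex and a ten-element range is an assumption for illustration, absolute numbers depend on the machine and Python version, and list() is added because map is lazy in Python 3:

```python
import timeit

xs = range(10)

# Same built-in function in both spellings, so only the looping machinery differs.
t_map = timeit.timeit(lambda: list(map(hex, xs)), number=50_000)
t_comp = timeit.timeit(lambda: [hex(x) for x in xs], number=50_000)

print(f"list(map(hex, xs)):   {t_map:.4f} s")
print(f"[hex(x) for x in xs]: {t_comp:.4f} s")
```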
An example of how performance comparison gets completely reversed when map needs a lambda:
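A similar sketch for the lambda case, where x + 2 is an arbitrary expression chosen for illustration. Here map has to call a Python-level lambda for every element, while the comprehension evaluates the expression inline:

```python
import timeit

xs = range(10)

# map needs a lambda to express "x + 2"; the comprehension does not.
t_map_lambda = timeit.timeit(lambda: list(map(lambda x: x + 2, xs)), number=50_000)
t_comp_inline = timeit.timeit(lambda: [x + 2 for x in xs], number=50_000)

print(f"list(map(lambda x: x + 2, xs)): {t_map_lambda:.4f} s")
print(f"[x + 2 for x in xs]:            {t_comp_inline:.4f} s")
```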
Cases
Using map is often reasonable, though it is considered 'unpythonic'. For example, map(sum, myLists) is more elegant/terse than [sum(x) for x in myLists]. You gain the elegance of not having to make up a dummy variable (e.g. sum(x) for x... or sum(_) for _... or sum(readableName) for readableName...) which you have to type twice, just to iterate. The same argument holds for filter and reduce and anything from the itertools module: if you already have a function handy, you could go ahead and do some functional programming. This gains readability in some situations, and loses it in others (e.g. novice programmers, multiple arguments)... but the readability of your code highly depends on your comments anyway.
You may also want to use the map function as a pure abstract function while doing functional programming, where you're mapping map, or currying map, or otherwise benefit from talking about map as a function. In Haskell for example, a functor interface called fmap generalizes mapping over any data structure. This is very uncommon in python because the python grammar compels you to use generator-style to talk about iteration; you can't generalize it easily. (This is sometimes good and sometimes bad.) You can probably come up with rare python examples where map(f, *lists) is a reasonable thing to do. The closest example I can come up with would be sumEach = partial(map, sum), which is a one-liner that is very roughly equivalent to a function that returns [sum(x) for x in myLists].
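A minimal sketch of that equivalence, assuming Python 3 (where the partial returns a lazy map object rather than a list). The comprehension-based helper and the sample data are made up for illustration:

```python
from functools import partial

# Curried map: a function that sums each inner iterable.
sumEach = partial(map, sum)

def sum_each_comprehension(myLists):
    """Hypothetical comprehension-based counterpart."""
    return [sum(x) for x in myLists]

data = [[1, 2], [3, 4, 5], []]
print(list(sumEach(data)))           # [3, 12, 0]
print(sum_each_comprehension(data))  # [3, 12, 0]
```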
You can also of course just use a for-loop. While not as elegant from a functional-programming viewpoint, sometimes non-local variables make code clearer in imperative programming languages such as python, because people are very used to reading code that way. For-loops are also, generally, the most efficient when you are merely doing any complex operation that is not building a list like list-comprehensions and map are optimized for (e.g. summing, or making a tree, etc.) -- at least efficient in terms of memory (not necessarily in terms of time, where I'd expect at worst a constant factor, barring some rare pathological garbage-collection hiccuping).
"Pythonism"
I dislike the word "pythonic" because I don't find that pythonic is always elegant in my eyes. Nevertheless, map and filter and similar functions (like the very useful itertools module) are probably considered unpythonic in terms of style.
Laziness
In terms of efficiency, like most functional programming constructs, map can be lazy, and in fact is lazy in python. That means you can do this (in python3) and your computer will not run out of memory and lose all your unsaved data:
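As an illustration, assuming a deliberately huge range (the variable names are hypothetical), creating the map object does no work, and values appear only when requested:

```python
from itertools import islice

# Building the map object is cheap even over an astronomically long range.
lazy_strings = map(str, range(10**100))

# Only the values that are actually requested get computed.
first_three = list(islice(lazy_strings, 3))
print(first_three)  # ['0', '1', '2']
```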
Try doing that with a list comprehension:
Do note that list comprehensions are also inherently lazy, but python has chosen to implement them as non-lazy. Nevertheless, python does support lazy list comprehensions in the form of generator expressions, as follows:
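For instance, a generator expression over the same kind of huge range stays lazy. This is a sketch with illustrative names:

```python
# Round brackets create a generator: nothing is computed at this point.
squares_gen = (n * n for n in range(10**100))

# Each next() call computes exactly one more value.
first = next(squares_gen)
second = next(squares_gen)
third = next(squares_gen)
print(first, second, third)  # 0 1 4
```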
You can basically think of the [...] syntax as passing in a generator expression to the list constructor, like list(x for x in range(5)).
Brief contrived example
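One way to put the three spellings side by side, as a small made-up illustration (operator.neg and the dict comprehension are choices made for this sketch):

```python
from operator import neg

# The same mapping built from map, a list comprehension, and a generator expression.
via_map = {x: x**2 for x in map(neg, range(5))}
via_list = {x: x**2 for x in [-y for y in range(5)]}
via_gen = {x: x**2 for x in (-y for y in range(5))}

print(via_map)  # {0: 0, -1: 1, -2: 4, -3: 9, -4: 16}
```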
List comprehensions are non-lazy, so may require more memory (unless you use generator comprehensions). The square brackets [...] often make things obvious, especially when in a mess of parentheses. On the other hand, sometimes you end up being verbose like typing [x for x in.... As long as you keep your iterator variables short, list comprehensions are usually clearer if you don't indent your code. But you could always indent your code or break things up.
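Breaking a nested expression into named, indented steps might look like this (a sketch; the names are made up for illustration):

```python
# Inner generator, pulled out and given a name.
negated = (-y for y in range(5))

# Outer comprehension, indented on its own lines.
squares_by_value = {
    x: x**2
    for x in negated
}
print(squares_by_value)
```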
Efficiency comparison for python3
map is now lazy. Therefore, if you will not be using all your data, or do not know ahead of time how much data you need, map in python3 (and generator expressions in python2 or python3) will avoid calculating their values until the last moment necessary. Usually this will outweigh any overhead from using map. The downside is that this is very limited in python as opposed to most functional languages: you only get this benefit if you access your data left-to-right "in order", because python generator expressions can only be evaluated in the order x[0], x[1], x[2], ....
However let's say that we have a pre-made function f we'd like to map, and we ignore the laziness of map by immediately forcing evaluation with list(...). We get some very interesting results.
The results are in the form AAA/BBB/CCC where A was performed on a circa-2010 Intel workstation with python 3.?.?, and B and C were performed on a circa-2013 AMD workstation with python 3.2.1, with extremely different hardware. The result seems to be that map and list comprehensions are comparable in performance, which is most strongly affected by other random factors. The only thing we can tell seems to be that, oddly, while we expect list comprehensions [...] to perform better than generator expressions (...), map is ALSO more efficient than generator expressions (again assuming that all values are evaluated/used).
It is important to realize that these tests assume a very simple function (the identity function); however this is fine because if the function were complicated, then performance overhead would be negligible compared to other factors in the program. (It may still be interesting to test with other simple things like f=lambda x:x+x.)
If you're skilled at reading python assembly, you can use the dis module to see if that's actually what's going on behind the scenes. It seems it is better to use [...] syntax than list(...). Sadly the map class is a bit opaque to disassembly, but we can make do with our speed test.
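A sketch of how such a comparison could be rerun today, assuming an identity function and a 1000-element range (both illustrative choices). It prints timings rather than asserting any ranking, since results vary by machine and Python version, and it ends with a dis call for inspecting the comprehension's bytecode:

```python
import dis
import timeit

def f(x):
    return x  # identity function, as assumed in the text

xs = range(1000)

# Three ways of producing the full list, all forcing evaluation.
candidates = {
    "list(map(f, xs))": lambda: list(map(f, xs)),
    "[f(x) for x in xs]": lambda: [f(x) for x in xs],
    "list(f(x) for x in xs)": lambda: list(f(x) for x in xs),
}

timings = {label: timeit.timeit(stmt, number=200) for label, stmt in candidates.items()}
for label, seconds in timings.items():
    print(f"{label:24s} {seconds:.4f} s for 200 runs")

# Disassemble the comprehension spelling to inspect its bytecode.
dis.dis(lambda: [f(x) for x in xs])
```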
Python 2: You should use map and filter instead of list comprehensions.
An objective reason why you should prefer them even though they're not "Pythonic" is this:
They require functions/lambdas as arguments, which introduce a new scope.
I've gotten bitten by this more than once:
but if instead I had said:
then everything would've been fine.
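The situation described can be sketched as follows; some_points and numbers are hypothetical names. Note that the overwrite is specific to Python 2: in Python 3 a list comprehension has its own scope, so running this sketch under Python 3 leaves the outer x intact.

```python
some_points = [(1, 2), (3, 4)]  # hypothetical data
numbers = [10, 20, 30]

outer_x_values = []
for x, y in some_points:
    # The comprehension reuses the name x. Python 2 would rebind the loop's x here;
    # Python 3 keeps the comprehension variable local to the comprehension.
    squared = [x ** 2 for x in numbers]
    outer_x_values.append(x)

print(outer_x_values)  # [1, 3] under Python 3
```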
You could say I was being silly for using the same variable name in the same scope.
I wasn't. The code was fine originally -- the two xs weren't in the same scope.
It was only after I moved the inner block to a different section of the code that the problem came up (read: problem during maintenance, not development), and I didn't expect it.
Yes, if you never make this mistake then list comprehensions are more elegant.
But from personal experience (and from seeing others make the same mistake) I've seen it happen enough times that I think it's not worth the pain you have to go through when these bugs creep into your code.
Conclusion:
Use map and filter. They prevent subtle hard-to-diagnose scope-related bugs.
Side note:
Don't forget to consider using imap and ifilter (in itertools) if they are appropriate for your situation!
Actually, map and list comprehensions behave quite differently in the Python 3 language. Consider a Python 3 program that builds squares by mapping a squaring function over [1, 2, 3] and then calls print(list(squares)) twice. You might expect it to print the line "[1, 4, 9]" twice, but instead it prints "[1, 4, 9]" followed by "[]". The first time you look at squares it seems to behave as a sequence of three elements, but the second time as an empty one.
In the Python 2 language map returns a plain old list, just like list comprehensions do in both languages. The crux is that the return value of map in Python 3 (and imap in Python 2) is not a list - it's an iterator!
The elements are consumed when you iterate over an iterator, unlike when you iterate over a list. This is why squares looks empty in the last print(list(squares)) line.
To summarize: an iterator returned by map is consumed as you traverse it, whereas a list is not.
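A program of the kind described can be sketched like this (the function name square is an assumption), with a list comprehension alongside for contrast:

```python
def square(x):
    return x * x

squares = map(square, [1, 2, 3])
first_pass = list(squares)   # consumes the iterator
second_pass = list(squares)  # nothing is left to consume
print(first_pass)   # [1, 4, 9]
print(second_pass)  # []

# A list comprehension yields a real list, which can be traversed repeatedly.
squares_list = [square(x) for x in [1, 2, 3]]
print(list(squares_list), list(squares_list))  # [1, 4, 9] [1, 4, 9]
```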
Here is one possible case:
versus:
I am guessing the zip() is an unfortunate and unnecessary overhead you need to indulge in if you insist on using list comprehensions instead of the map. Would be great if someone clarifies this whether affirmatively or negatively.
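The two spellings being contrasted might look like this (list1, list2 and the multiplication are assumptions for illustration). map accepts several iterables directly, while the comprehension pairs them up with zip:

```python
list1 = [1, 2, 3]
list2 = [4, 5, 6]

# map takes one iterable per argument of the function.
products_map = list(map(lambda op1, op2: op1 * op2, list1, list2))

# The comprehension needs zip to walk both lists in step.
products_comp = [op1 * op2 for op1, op2 in zip(list1, list2)]

print(products_map)   # [4, 10, 18]
print(products_comp)  # [4, 10, 18]
```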
If you plan on writing any asynchronous, parallel, or distributed code, you will probably prefer map over a list comprehension -- as most asynchronous, parallel, or distributed packages provide a map function to overload python's map. Then by passing the appropriate map function to the rest of your code, you may not have to modify your original serial code to have it run in parallel (etc).
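As one illustration of passing a map function around, here is a sketch using the standard library's concurrent.futures. The helper square_all and its map_func parameter are hypothetical names, and a thread pool is just one of many possible back ends:

```python
from concurrent.futures import ThreadPoolExecutor

def square(v):
    return v * v

def square_all(values, map_func=map):
    """Serial by default; any map-compatible callable can be swapped in."""
    return list(map_func(square, values))

serial = square_all(range(5))

# Executor.map has the same (function, iterable) shape as the built-in map.
with ThreadPoolExecutor(max_workers=2) as pool:
    parallel = square_all(range(5), map_func=pool.map)

print(serial)    # [0, 1, 4, 9, 16]
print(parallel)  # [0, 1, 4, 9, 16]
```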
I find list comprehensions are generally more expressive of what I'm trying to do than map - they both get it done, but the former saves the mental load of trying to understand what could be a complex lambda expression.
There's also an interview out there somewhere (I can't find it offhand) where Guido lists lambdas and the functional functions as the thing he most regrets about accepting into Python, so you could make the argument that they're un-Pythonic by virtue of that.
So since Python 3, map() returns an iterator, and you need to keep in mind what you need: an iterator or a list object.
As @AlexMartelli already mentioned, map() is faster than a list comprehension only if you don't use a lambda function.
I will present some time comparisons.
Python 3.5.2 and CPython
I've used Jupyter notebook and especially the %timeit built-in magic command.
Measurements: s == 1000 ms == 1000 * 1000 µs == 1000 * 1000 * 1000 ns
The comparisons cover a setup step, a built-in function, and a lambda function. There is also such a thing as a generator expression, see PEP-0289, so I thought it would be useful to add it to the comparison.
You need a list object: use a list comprehension if it's a custom function, use list(map()) if there is a builtin function.
You don't need a list object, you just need an iterable one: always use map()!
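The rule of thumb above might be sketched like this (the values and expressions are hypothetical; the sketch shows the spellings, not their timings):

```python
values = [-3, 1, -2]

# A list is needed and a built-in does the work: list(map(...)).
absolute = list(map(abs, values))

# A list is needed and the logic is a custom expression: list comprehension.
scaled = [v * 2 + 1 for v in values]

# Only iteration is needed: hand the map object straight to the consumer.
total = sum(map(abs, values))

print(absolute, scaled, total)  # [3, 1, 2] [-5, 3, -3] 6
```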
I ran a quick test comparing three methods for invoking the method of an object. The time difference, in this case, is negligible and is a matter of the function in question (see @Alex Martelli's response).
I looked at lists (stored in the variable vals) of both integers (Python int) and floating point numbers (Python float) for increasing list sizes. A dummy class DummyNum is considered, specifically its add method. The __slots__ attribute is a simple optimization in Python that declares the attributes instances may have, which avoids a per-instance __dict__ and reduces memory use.
Here are the resulting plots.
As stated previously, the technique used makes a minimal difference and you should code in a way that is most readable to you, or in the particular circumstance. In this case, the list comprehension (map_comprehension technique) is fastest for both types of additions in an object, especially with shorter lists.
Visit this pastebin for the source used to generate the plot and data.
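A sketch of what such a setup could look like. The class body, the three invocation styles (lambda, operator.methodcaller, comprehension) and the helper make_vals are assumptions for illustration, not the code behind the plots:

```python
from operator import methodcaller

class DummyNum:
    """Hypothetical stand-in for the dummy class described in the text."""
    __slots__ = ("n",)  # fixed attribute set, so no per-instance __dict__

    def __init__(self, n):
        self.n = n

    def add(self):
        self.n += 5
        return self.n

def make_vals():
    # Fresh objects for each technique, because add() mutates them.
    return [DummyNum(i) for i in range(3)]

via_lambda = list(map(lambda v: v.add(), make_vals()))
via_methodcaller = list(map(methodcaller("add"), make_vals()))
via_comprehension = [v.add() for v in make_vals()]

print(via_lambda, via_methodcaller, via_comprehension)
```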
I timed some of the results with perfplot (a project of mine).
As others have noted, map really only returns an iterator so it's a constant-time operation. When realizing the iterator by list(), it's on par with list comprehensions. Depending on the expression, either one might have a slight edge but it's hardly significant.
Note that arithmetic operations like x ** 2 are much faster in NumPy, especially if the input data is already a NumPy array.
[Plots: timings for hex and for x ** 2, followed by the code to reproduce the plots.]
I tried the code by @alex-martelli but found some discrepancies
map takes the same amount of time even for very large ranges while using list comprehension takes a lot of time as is evident from my code. So apart from being considered "unpythonic", I have not faced any performance issues relating to usage of map.
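One possible reading of this observation, consistent with the laziness discussed in other answers: if the map object is never consumed, no work is done regardless of the range size. A sketch that counts calls (the counter and function names are made up for illustration):

```python
counter = {"calls": 0}

def tracked_hex(x):
    counter["calls"] += 1
    return hex(x)

# Creating the map object returns immediately and calls nothing.
lazy = map(tracked_hex, range(1_000_000))
calls_after_map = counter["calls"]

# A list comprehension does all of its work up front.
eager = [tracked_hex(x) for x in range(1000)]
calls_after_comprehension = counter["calls"]

# map does the same amount of work once it is consumed.
consumed = list(map(tracked_hex, range(1000)))
calls_after_list_map = counter["calls"]

print(calls_after_map, calls_after_comprehension, calls_after_list_map)  # 0 1000 2000
```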
Performance measurement
Image Source: Experfy
You can see for yourself which is better: list comprehension or the map function.
(List comprehension takes less time to process 1 million records when compared to a map function.)
I consider that the most Pythonic way is to use a list comprehension instead of map and filter. The reason is that list comprehensions are clearer than map and filter.
A comprehension does not require the extra lambda expressions that map needs. Furthermore, a comprehension also allows filtering easily, while map requires filter to allow filtering.
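A small side-by-side sketch of that point, using cubes of odd numbers as an arbitrary example:

```python
# Comprehension: transformation and filter in one expression, no lambdas.
odd_cubes = [x ** 3 for x in range(10) if x % 2 == 1]

# map + filter: each step needs its own lambda.
odd_cubes_alt = list(map(lambda x: x ** 3, filter(lambda x: x % 2 == 1, range(10))))

print(odd_cubes)      # [1, 27, 125, 343, 729]
print(odd_cubes_alt)  # [1, 27, 125, 343, 729]
```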
My use case:
I found myself starting to use map more. I thought map could be slower than a comprehension because of passing and returning arguments, which is why I found this post.
I believe using map can be much more readable and flexible, especially when I need to construct the values of the list.
If you use map, you actually understand the code when you read it.
Plus the flexibility bonus.
And thanks for all the other answers, plus the performance bonus.