运行总计的列表理解
我想从数字列表中获取运行总计。
的数字顺序列表开始
a = range(20)
runningTotal = []
for n in range(len(a)):
new = runningTotal[n-1] + a[n] if n > 0 else a[n]
runningTotal.append(new)
# This one is a syntax error
# runningTotal = [a[n] for n in range(len(a)) if n == 0 else runningTotal[n-1] + a[n]]
for i in zip(a, runningTotal):
print "{0:>3}{1:>5}".format(*i)
出于演示目的,我从使用 range
产量
0 0
1 1
2 3
3 6
4 10
5 15
6 21
7 28
8 36
9 45
10 55
11 66
12 78
13 91
14 105
15 120
16 136
17 153
18 171
19 190
如您所见,我初始化一个空列表 []
,然后 append()< /code> 在每个循环迭代中。有没有更优雅的方法,比如列表理解?
I want to get a running total from a list of numbers.
For demo purposes, I start with a sequential list of numbers using range
a = range(20)
runningTotal = []
for n in range(len(a)):
new = runningTotal[n-1] + a[n] if n > 0 else a[n]
runningTotal.append(new)
# This one is a syntax error
# runningTotal = [a[n] for n in range(len(a)) if n == 0 else runningTotal[n-1] + a[n]]
for i in zip(a, runningTotal):
print "{0:>3}{1:>5}".format(*i)
yields
0 0
1 1
2 3
3 6
4 10
5 15
6 21
7 28
8 36
9 45
10 55
11 66
12 78
13 91
14 105
15 120
16 136
17 153
18 171
19 190
As you can see, I initialize an empty list []
, then append()
in each loop iteration. Is there a more elegant way to this, like a list comprehension?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(14)
列表理解没有好的(干净的、可移植的)方式来引用它正在构建的列表。一种好的而优雅的方法可能是在生成器中完成这项工作:
当然,要使用列表来获取它作为列表,请使用
list(running_sum(a))
。A list comprehension has no good (clean, portable) way to refer to the very list it's building. One good and elegant approach might be to do the job in a generator:
to get this as a list instead, of course, use
list(running_sum(a))
.如果您可以使用 numpy,它有一个名为
cumsum
的内置函数可以执行此操作。If you can use numpy, it has a built-in function named
cumsum
that does this.我不确定“优雅”,但我认为以下内容更简单、更直观(以额外变量为代价):
做同样事情的功能方法是:
……但这可读性要差得多/可维护等。
@Omnifarous 建议这应该改进为:
...但我仍然发现这不如我最初的建议那么容易理解。
请记住 Kernighan 的话:“调试的难度是最初编写代码的两倍。因此,如果您尽可能巧妙地编写代码,那么根据定义,您就不够聪明,无法调试它。”
I'm not sure about 'elegant', but I think the following is much simpler and more intuitive (at the cost of an extra variable):
The functional way to do the same thing is:
...but that's much less readable/maintainable, etc.
@Omnifarous suggests this should be improved to:
...but I still find that less immediately comprehensible than my initial suggestion.
Remember the words of Kernighan: "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."
使用
itertools.accumulate()
。这是一个示例:这只适用于 Python 3。在 Python 2 上,您可以使用 more 中的反向移植-itertools 包。
Use
itertools.accumulate()
. Here is an example:This only works on Python 3. On Python 2 you can use the backport in the more-itertools package.
这可以在 Python 中用两行代码实现。
使用默认参数就无需在外部维护 aux 变量,然后我们只需对列表进行
映射
即可。This can be implemented in 2 lines in Python.
Using a default parameter eliminates the need to maintain an aux variable outside, and then we just do a
map
to the list.当我们对列表求和时,我们指定一个累加器(
memo
),然后遍历列表,将二元函数“x+y”应用于每个元素和累加器。从程序上看,这看起来像:这是一种常见的模式,除了求和之外,对于其他事情也很有用——我们可以将其推广到任何二进制函数,我们将其作为参数提供,并且还让调用者指定一个初始值。这为我们提供了一个名为
reduce
、foldl
或inject
[1] 的函数:在 Python 2 中,< code>reduce 是一个内置函数,但在 Python 3 中它已被移至
functools
模块:我们可以使用
reduce
做各种很酷的事情取决于我们作为第一个参数提供的函数。如果我们用“列表串联”替换“sum”,用“空列表”替换“零”,我们就得到了(浅)copy
函数:如果我们添加一个
transform
函数作为copy
的另一个参数,并在连接之前应用它,我们得到map
:如果我们添加一个采用
e 的
作为参数并返回一个布尔值,并用它来决定是否连接,我们得到predicate
函数filter
:map
和filter
是编写列表推导式的一种奇特方式 - 我们也可以说[x*2 for x in range(10)]
或[x for x in range(10)如果x%2==0]
。reduce
没有相应的列表理解语法,因为reduce
根本不需要返回列表(正如我们之前在sum
中看到的那样) ,Python 也恰好将其作为内置函数提供)。事实证明,对于计算运行总和,
reduce
的列表构建能力正是我们想要的,并且可能是解决此问题的最优雅的方法,尽管它享有盛誉(与lambda
)作为一种非Pythonic的陈词滥调。在运行时留下旧值副本的reduce
版本称为reductions
或scanl
[1],看起来像这样:装备齐全,我们现在可以定义:
虽然概念上很优雅,但这种精确的方法在 Python 实践中表现不佳。因为 Python 的
list.append()
会就地改变列表但不会返回它,所以我们无法在 lambda 中有效地使用它,而必须使用+
改为运算符。这会构造一个全新的列表,所花费的时间与到目前为止累积的列表的长度成正比(即 O(n) 操作)。由于执行此操作时我们已经处于reduce
的 O(n)for
循环中,因此整体时间复杂度复合为 O(n2)。在 Ruby[2] 这样的语言中,
array.push e
返回变异的array
,等效的运行时间为 O(n) 时间:在 JavaScript[2] 中也是如此,其
array.push(e)
返回e
(不是array
),但是其匿名函数允许我们包含多个语句,我们可以使用这些语句来单独指定返回值:那么,我们如何解决这个问题,同时保留我们只需传递的
reductions
函数的概念简单性>lambda x, y: x + y 为了创建运行求和函数?让我们按程序重写reductions
。我们可以解决意外二次问题,并且在我们解决这个问题时,将结果列表预先分配给避免堆抖动[3]:这对我来说是最佳点:O(n) 性能,优化的过程代码隐藏在一个有意义的名称下,下次可以重复使用您需要编写一个将中间值累积到列表中的函数。
reduce
/reductions
来自 LISP 传统,foldl
/scanl
来自 ML 传统,而>inject
来自 Smalltalk 传统。List
和 Ruby 的Array
都是自动调整大小的数据结构的实现,称为“动态数组”(或 C++ 中的std::vector
) )。 JavaScript 的Array
有点巴洛克风格,但只要您不分配给越界索引或改变Array.length
,其行为就相同。When we take the sum of a list, we designate an accumulator (
memo
) and then walk through the list, applying the binary function "x+y" to each element and the accumulator. Procedurally, this looks like:This is a common pattern, and useful for things other than taking sums — we can generalize it to any binary function, which we'll supply as a parameter, and also let the caller specify an initial value. This gives us a function known as
reduce
,foldl
, orinject
[1]:In Python 2,
reduce
was a built-in function, but in Python 3 it's been moved to thefunctools
module:We can do all kinds of cool stuff with
reduce
depending on the function we supply as its the first argument. If we replace "sum" with "list concatenation", and "zero" with "empty list", we get the (shallow)copy
function:If we add a
transform
function as another parameter tocopy
, and apply it before concatenating, we getmap
:If we add a
predicate
function that takese
as a parameter and returns a boolean, and use it to decide whether or not to concatenate, we getfilter
:map
andfilter
are sort of unfancy ways of writing list comprehensions — we could also have said[x*2 for x in range(10)]
or[x for x in range(10) if x%2==0]
. There's no corresponding list comprehension syntax forreduce
, becausereduce
isn't required to return a list at all (as we saw withsum
, earlier, which Python also happens to offer as a built-in function).It turns out that for computing a running sum, the list-building abilities of
reduce
are exactly what we want, and probably the most elegant way to solve this problem, despite its reputation (along withlambda
) as something of an un-pythonic shibboleth. The version ofreduce
that leaves behind copies of its old values as it runs is calledreductions
orscanl
[1], and it looks like this:So equipped, we can now define:
While conceptually elegant, this precise approach fares poorly in practice with Python. Because Python's
list.append()
mutates a list in place but doesn't return it, we can't use it effectively in a lambda, and have to use the+
operator instead. This constructs a whole new list, which takes time proportional to the length of the accumulated list so far (that is, an O(n) operation). Since we're already inside the O(n)for
loop ofreduce
when we do this, the overall time complexity compounds to O(n2).In a language like Ruby[2], where
array.push e
returns the mutatedarray
, the equivalent runs in O(n) time:same in JavaScript[2], whose
array.push(e)
returnse
(notarray
), but whose anonymous functions allow us to include multiple statements, which we can use to separately specify a return value:So, how can we solve this while retaining the conceptual simplicity of a
reductions
function that we just passlambda x, y: x + y
to in order to create the running sum function? Let's rewritereductions
procedurally. We can fix the accidentally quadratic problem, and while we're at it, pre-allocate the result list to avoid heap thrashing[3]:This is the sweet spot for me: O(n) performance, and the optimized procedural code is tucked away under a meaningful name where it can be re-used the next time you need to write a function that accumulates intermediate values into a list.
reduce
/reductions
come from the LISP tradition,foldl
/scanl
from the ML tradition, andinject
from the Smalltalk tradition.List
and Ruby'sArray
are both implementations of an automatically resizing data structure known as a "dynamic array" (orstd::vector
in C++). JavaScript'sArray
is a little more baroque, but behaves identically provided you don't assign to out of bounds indices or mutateArray.length
.从
Python 3.8
开始,并引入赋值表达式 (PEP 572)(:=
运算符),我们可以在列表理解中使用并递增变量:此:
total
初始化为0< /code> 代表
total
增加当前循环项(total :=total + x
)total
的新值作为生成的映射元组的一部分Starting
Python 3.8
, and the introduction of assignment expressions (PEP 572) (:=
operator), we can use and increment a variable within a list comprehension:This:
total
to0
which symbolizes the running sumtotal
by the current looped item (total := total + x
) via an assignment expressiontotal
as part of the produced mapped tuple我想做同样的事情来生成可以使用 bisect_left 的累积频率 - 这就是我生成列表的方式;
I wanted to do the same thing to generate cumulative frequencies that I could use bisect_left over - this is the way I've generated the list;
这是一个线性时间解决方案:
示例:
简而言之,reduce 遍历列表累加总和并构造一个列表。最终的x[0] 返回列表,
x[1]
将是运行总值。Here's a linear time solution one liner:
Example:
In short, the reduce goes over the list accumulating sum and constructing an list. The final
x[0]
returns the list,x[1]
would be the running total value.又是一句单行话,线性时空。
我在这里强调线性空间,因为我在其他建议的答案中看到的大多数单行代码 --- 那些基于模式
list + [sum]
或使用chain
code> 迭代器 --- 生成 O(n) 列表或生成器,并对垃圾收集器施加很大压力,以至于与此相比,它们的性能非常差。Another one-liner, in linear time and space.
I'm stressing linear space here, because most of the one-liners I saw in the other proposed answers --- those based on the pattern
list + [sum]
or usingchain
iterators --- generate O(n) lists or generators and stress the garbage collector so much that they perform very poorly, in comparison to this.我会为此使用协程:
I would use a coroutine for this:
您正在寻找两件事:折叠(减少)和一个有趣的函数,该函数保留另一个函数的结果列表,我将其称为运行。我制作了带有和不带有初始参数的版本;不管怎样,这些都需要用初始的 [] 来减少。
由于 + 运算符,在非常大的列表上这些将花费很长时间。在函数式语言中,如果正确完成,此列表构造将是 O(n)。
以下是输出的前几行:
You are looking for two things: fold (reduce) and a funny function that keeps a list of the results of another function, which I have called running. I made versions both with and without an initial parameter; either way these need to go to reduce with an initial [].
These will take a long time on really large lists due to the + operator. In a functional language, if done correctly, this list construction would be O(n).
Here are the first few lines of output:
这是低效的,因为它从一开始就每次都这样做,但可能是:
This is inefficient as it does it every time from beginning but possible it is:
在 Python 3.8 及更高版本中,您现在可以使用 walrus 运算符
with Python 3.8 and above you can now use walrus operator