Python histogram one-liner
There are many ways to write a Python program that computes a histogram.
By histogram, I mean a function that counts the occurrences of objects in an iterable and outputs the counts in a dictionary. For example:
>>> L = 'abracadabra'
>>> histogram(L)
{'a': 5, 'b': 2, 'c': 1, 'd': 1, 'r': 2}
One way to write this function is:
def histogram(L):
    d = {}
    for x in L:
        if x in d:
            d[x] += 1
        else:
            d[x] = 1
    return d
Are there more concise ways of writing this function?
If we had dictionary comprehensions in Python, we could write:
>>> { x: L.count(x) for x in set(L) }
but since Python 2.6 doesn't have them, we have to write:
>>> dict([(x, L.count(x)) for x in set(L)])
Although this approach may be readable, it is not efficient: L is traversed once per distinct element. Furthermore, it won't work for single-use generators; the function should work equally well for iterators and generators such as:
def gen(L):
    for x in L:
        yield x
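To make the single-use point concrete, here is a minimal sketch (the variable names `g`, `first` and `second` are only illustrative). It shows that a one-pass histogram works on a generator, and that the generator is empty afterwards, which is why any approach that needs to walk L more than once cannot work here:

```python
def gen(L):
    # A single-use iterator: it can be consumed only once.
    for x in L:
        yield x

def histogram(iterable):
    # One pass over the input, no count() calls: works for any iterable.
    d = {}
    for x in iterable:
        d[x] = d.get(x, 0) + 1
    return d

g = gen('abracadabra')
first = histogram(g)    # consumes the generator
second = histogram(g)   # the generator is already exhausted
print(first)
print(second)
```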
We might try to use the reduce function (R.I.P.):
>>> reduce(lambda d,x: dict(d, x=d.get(x,0)+1), L, {}) # wrong!
Oops, this does not work: the key name is 'x', not x. :(
I ended up with:
>>> reduce(lambda d,x: dict(d.items() + [(x, d.get(x, 0)+1)]), L, {})
(In Python 3, we would have to write list(d.items()) instead of d.items(), but it's hypothetical, since there is no reduce there.)
Please beat me with a better, more readable one-liner! ;)
Comments (9)
Python 3.x does have reduce, you just have to do a from functools import reduce. It also has "dict comprehensions", which have exactly the syntax in your example.
Python 2.7 and 3.x also have a Counter class which does exactly what you want:
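The answer's own snippet did not survive on this page. As a minimal sketch of what using `collections.Counter` for this looks like (the variable name `counts` is only illustrative):

```python
from collections import Counter

L = 'abracadabra'
counts = Counter(L)      # Counter is a dict subclass mapping item -> count
print(dict(counts))
```

`Counter` consumes its argument in a single pass, so it also accepts a generator.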
In Python 2.6 or earlier, I'd personally use a defaultdict and do it in 2 lines:
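The two lines themselves are missing from this page. A sketch of the usual `defaultdict` idiom, which is presumably close to what was meant, looks like this:

```python
from collections import defaultdict

L = 'abracadabra'
d = defaultdict(int)     # missing keys start at int() == 0
for x in L: d[x] += 1
print(dict(d))
```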
That's clean, efficient, Pythonic, and much easier for most people to understand than anything involving reduce.
It's kinda cheaty to import modules for oneliners, so here's a oneliner that is O(n) and works at least as far back as Python 2.4:
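The one-liner was lost in extraction. The following is a hypothetical reconstruction in the same spirit, not the answer's original code: it updates a dict in place through a `__` method inside a list comprehension and then returns the dict. It is written for Python 3 and has not been checked against Python 2.4.

```python
# Hypothetical sketch: the throwaway list only drives the side effects;
# the tuple's last element, the dict d, is what gets returned.
histogram = lambda it: (lambda d: ([d.__setitem__(x, d.get(x, 0) + 1) for x in it], d)[-1])({})

print(histogram('abracadabra'))
```

Each element costs one `get` and one `__setitem__`, so the whole thing is a single O(n) pass and also works on generators.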
And if you think __ methods are hacky, you can always do this :)
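This second snippet is missing as well. One way to avoid calling a `__` method directly, offered here only as an assumption about what the alternative might have looked like, is `dict.update`:

```python
# Hypothetical sketch: same single-pass idea, using dict.update
# instead of a double-underscore method.
histogram = lambda it: (lambda d: ([d.update({x: d.get(x, 0) + 1}) for x in it], d)[-1])({})

print(histogram('abracadabra'))
```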
For Python 2.7, you can use this small list comprehension:
One that works back to 2.3 (slightly shorter than Timmerman's, and I think more readable):
For a while there, anything using itertools was by definition Pythonic. Still, this is a bit on the opaque side:
I'm currently running Python 2.5.4.
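The itertools snippet is not preserved here. One common itertools-based way to build a histogram, shown only as an assumption about the kind of code meant, combines `sorted` with `itertools.groupby`:

```python
from itertools import groupby

def histogram(iterable):
    # groupby only merges adjacent equal items, so the input is sorted first.
    # That makes this O(n log n) and requires the items to be orderable.
    return dict((key, sum(1 for _ in group))
                for key, group in groupby(sorted(iterable)))

print(histogram('abracadabra'))
```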
Your one-liner using reduce was almost OK; you only needed to tweak it a little bit:
Of course, this won't beat in-place solutions (neither in speed nor in pythonicity), but in exchange you've got yourself a nice purely functional snippet. BTW, this would be somewhat prettier if Python had a method dict.merge().
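The tweaked reduce one-liner itself is missing from the page. A likely form of the tweak, given here as a sketch rather than as the answer's exact code, replaces the literal keyword `x=` with keyword unpacking so that the value of `x` becomes the key:

```python
from functools import reduce  # needed on Python 3; built in on Python 2

L = 'abracadabra'
# dict(d, **{x: n}) builds a new dict each step, so nothing is mutated.
# Caveat: on Python 3, ** unpacking into dict() requires string keys.
hist = reduce(lambda d, x: dict(d, **{x: d.get(x, 0) + 1}), L, {})
print(hist)
```

Because a fresh dict is copied for every element, this is slower than the in-place versions, which matches the trade-off the answer describes.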
I needed a histogram implementation to work in python 2.2 up to 2.7, and came up with this:
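The implementation is not preserved here. Based on the answer's mention of `dict.setdefault(key, default)`, a sketch of such a function might look like the following (the exact original may differ):

```python
def histogram(L):
    d = {}
    for x in L:
        # setdefault returns the current count, inserting 0 for a new key.
        d[x] = d.setdefault(x, 0) + 1
    return d

print(histogram('abracadabra'))
```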
I was inspired by Eli Courtwright's post about defaultdict. It was introduced in Python 2.5, so it can't be used here, but it can be emulated with dict.setdefault(key, default).
This is basically the same thing gnibbler is doing, but I had to write this first before I could completely understand his lambda function.