Python histogram one-liner

Posted on 2024-09-02 08:10:27

There are many ways to write a Python program that computes a histogram.

By histogram, I mean a function that counts the occurrences of objects in an iterable and returns the counts in a dictionary. For example:

>>> L = 'abracadabra'
>>> histogram(L)
{'a': 5, 'b': 2, 'c': 1, 'd': 1, 'r': 2}

One way to write this function is:

def histogram(L):
    d = {}
    for x in L:
        if x in d:
            d[x] += 1
        else:
            d[x] = 1
    return d

Are there more concise ways of writing this function?

If we had dictionary comprehensions in Python, we could write:

>>> { x: L.count(x) for x in set(L) }

but since Python 2.6 doesn't have them, we have to write:

>>> dict([(x, L.count(x)) for x in set(L)])

Although this approach may be readable, it is not efficient: L is traversed once per distinct element. Furthermore, it won't work for single-use generators; the function should work equally well for iterators and generators such as:

def gen(L):
    for x in L:
        yield x
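To make the single-pass requirement concrete, here is a minimal sketch in Python 3 syntax (an assumption; the question targets Python 2.6). It reuses the explicit loop from above, shortened with dict.get, and shows that a generator is exhausted after one traversal, which is why anything that scans the input again, such as count, cannot work on it.

```python
def gen(L):
    for x in L:
        yield x

def histogram(iterable):
    # Single pass: each element is consumed exactly once.
    d = {}
    for x in iterable:
        d[x] = d.get(x, 0) + 1
    return d

g = gen('abracadabra')
counts = histogram(g)
leftover = list(g)  # the generator is already exhausted, so this is empty

print(counts)
print(leftover)
```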

We might try to use the reduce function (R.I.P.):

>>> reduce(lambda d,x: dict(d, x=d.get(x,0)+1), L, {}) # wrong!

Oops, this does not work: the key name is 'x', not x. :(
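The reason is that keyword-argument syntax takes the name to the left of = literally. A small sketch of the difference (the variable names here are only illustrative):

```python
x = 'a'
d = {}

# Keyword syntax: the key is the literal string 'x', not the value of x.
by_keyword = dict(d, x=1)

# Subscript syntax: the key is whatever x currently refers to.
by_subscript = dict(d)
by_subscript[x] = 1

print(by_keyword)
print(by_subscript)
```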

I ended up with:

>>> reduce(lambda d,x: dict(d.items() + [(x, d.get(x, 0)+1)]), L, {})

(In Python 3, we would have to write list(d.items()) instead of d.items(), but it's hypothetical, since there is no reduce there.)
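For reference, a sketch of how the same expression might look in Python 3, assuming reduce is imported from functools (where it lives in Python 3). Note that it rebuilds the whole dictionary for every element, so it is far from linear time; it is shown only to illustrate the list(d.items()) change.

```python
from functools import reduce

L = 'abracadabra'

# Each step builds a new dict from the old items plus the updated pair;
# the later pair wins when the key already exists.
hist = reduce(lambda d, x: dict(list(d.items()) + [(x, d.get(x, 0) + 1)]), L, {})

print(hist)
```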

Please beat me with a better, more readable one-liner! ;)

世界等同你 2024-09-09 08:10:27

Python 3.x does have reduce; you just have to import it with from functools import reduce. It also has "dict comprehensions", which have exactly the syntax in your example.

Python 2.7 and 3.x also have a Counter class which does exactly what you want:

from collections import Counter
cnt = Counter("abracadabra")
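A short usage sketch of Counter (Python 3 syntax assumed). It behaves like a dictionary, reports 0 for missing keys, and accepts any iterable, including a single-use generator:

```python
from collections import Counter

cnt = Counter("abracadabra")

a_count = cnt['a']          # count for a key that is present
z_count = cnt['z']          # missing keys report 0 instead of raising KeyError
top = cnt.most_common(1)    # list of (element, count) pairs, highest count first
as_dict = dict(cnt)         # plain dict, if one is required

# A generator is consumed in a single pass.
from_gen = Counter(c for c in "abracadabra")

print(a_count, z_count, top)
```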

In Python 2.6 or earlier, I'd personally use a defaultdict (available since Python 2.5) and do it in 2 lines, plus the import:

from collections import defaultdict
d = defaultdict(int)
for x in xs: d[x] += 1

That's clean, efficient, Pythonic, and much easier for most people to understand than anything involving reduce.

︶￣淡然 2024-09-09 08:10:27

It's kinda cheaty to import modules for one-liners, so here's a one-liner that is O(n) and works at least as far back as Python 2.4:

>>> f=lambda s,d={}:([d.__setitem__(i,d.get(i,0)+1) for i in s],d)[-1]
>>> f("ABRACADABRA")
{'A': 5, 'R': 2, 'B': 2, 'C': 1, 'D': 1}

And if you think __ methods are hacky, you can always do this:

>>> f=lambda s,d=lambda:0:vars(([setattr(d,i,getattr(d,i,0)+1) for i in s],d)[-1])
>>> f("ABRACADABRA")
{'A': 5, 'R': 2, 'B': 2, 'C': 1, 'D': 1}
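One caveat worth checking before reusing these lambdas (an observation about Python default arguments, not something the answer states): the default d is created once, so counts carry over between calls. The sketch below demonstrates this for the first lambda and shows one possible workaround, in which the hypothetical name g creates a fresh dict per call through an inner lambda.

```python
f = lambda s, d={}: ([d.__setitem__(i, d.get(i, 0) + 1) for i in s], d)[-1]

first = dict(f("ABRACADABRA"))   # copy the result, since f returns the shared dict
second = dict(f("ABRACADABRA"))  # the shared default dict keeps accumulating

# Possible workaround: pass a fresh dict to an inner lambda on every call.
g = lambda s: (lambda d: ([d.__setitem__(i, d.get(i, 0) + 1) for i in s], d)[-1])({})

g_first = g("ABRACADABRA")
g_second = g("ABRACADABRA")

print(first)
print(second)
print(g_first == g_second)
```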

独自←快乐 2024-09-09 08:10:27

import pandas as pd

pd.Series(list(L)).value_counts()

海之角 2024-09-09 08:10:27

$d{$_} += 1 for split //, 'abracadabra';

绾颜 2024-09-09 08:10:27

For Python 2.7, you can use this small dict comprehension:

v = list('abracadabra')
print {x: v.count(x) for x in set(v)}
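In Python 3 the comprehension itself is unchanged, but print is a function. A sketch, with the caveat from the question still applying: v.count(x) rescans the list once per distinct value, and the input must be a real sequence rather than a generator.

```python
v = list('abracadabra')

# One full scan of v for every distinct element in set(v).
hist = {x: v.count(x) for x in set(v)}

print(hist)
```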

深居我梦 2024-09-09 08:10:27

One that works back to 2.3 (slightly shorter than Timmerman's and, I think, more readable):

L = 'abracadabra'
hist = {}
for x in L: hist[x] = hist.pop(x,0) + 1
print hist
{'a': 5, 'r': 2, 'b': 2, 'c': 1, 'd': 1}

深居我梦 2024-09-09 08:10:27

For a while there, anything using itertools was by definition Pythonic. Still, this is a bit on the opaque side:

>>> from itertools import groupby
>>> grouplen = lambda grp : sum(1 for i in grp)
>>> hist = dict((a[0], grouplen(a[1])) for a in groupby(sorted("ABRACADABRA")))
>>> print hist
{'A': 5, 'R': 2, 'C': 1, 'B': 2, 'D': 1}

I'm currently running Python 2.5.4.
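On a version with dict comprehensions, the same groupby idea can be sketched a little more directly (Python 3 syntax assumed). It still sorts the whole input first, so it costs O(n log n), needs the elements to be orderable, and materializes a generator into a list.

```python
from itertools import groupby

# groupby only merges adjacent equal items, hence the sort.
hist = {key: sum(1 for _ in group) for key, group in groupby(sorted("ABRACADABRA"))}

print(hist)
```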

虫児飞 2024-09-09 08:10:27

Your one-liner using reduce was almost ok, you only needed to tweak it a little bit:

>>> reduce(lambda d, x: dict(d, **{x: d.get(x, 0) + 1}), L, {})
{'a': 5, 'b': 2, 'c': 1, 'd': 1, 'r': 2}

Of course, this won't beat in-place solutions (neither in speed nor in Pythonicity), but in exchange you get a nice, purely functional snippet. BTW, this would be somewhat prettier if Python had a dict.merge() method.
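A note for Python 3 readers, sketched under the assumption of Python 3.5 or later: the ** trick passes keys as keyword names, so it only accepts string keys there, while dict unpacking inside a literal works for any hashable key. The helper names hist_kwargs and hist_unpack are made up for this illustration.

```python
from functools import reduce

def hist_kwargs(items):
    # The answer's form: keys travel as keyword arguments.
    return reduce(lambda d, x: dict(d, **{x: d.get(x, 0) + 1}), items, {})

def hist_unpack(items):
    # Dict-literal unpacking: still builds a new dict per step, any hashable key.
    return reduce(lambda d, x: {**d, x: d.get(x, 0) + 1}, items, {})

print(hist_kwargs('abracadabra'))
print(hist_unpack([1, 2, 1]))

try:
    hist_kwargs([1, 2, 1])
except TypeError as exc:
    print('non-string keys rejected:', exc)
```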

|煩躁 2024-09-09 08:10:27

I needed a histogram implementation that works in Python 2.2 up to 2.7, and came up with this:

>>> L = 'abracadabra'
>>> hist = {}
>>> for x in L: hist[x] = hist.setdefault(x,0)+1
>>> print hist
{'a': 5, 'r': 2, 'b': 2, 'c': 1, 'd': 1}

I was inspired by Eli Courtwright's post about defaultdict. defaultdict was introduced in Python 2.5, so I couldn't use it, but it can be emulated with dict.setdefault(key, default).

This is basically the same thing gnibbler is doing, but I had to write this first before I could completely understand his lambda function.
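To illustrate the emulation, here is a small side-by-side sketch in Python 3 syntax (an assumption, since the answer targets Python 2.2 to 2.7). Both loops make a single pass and end up with the same counts.

```python
from collections import defaultdict

L = 'abracadabra'

# setdefault inserts 0 the first time a key is seen, then returns the stored value.
hist = {}
for x in L:
    hist[x] = hist.setdefault(x, 0) + 1

# defaultdict(int) supplies the 0 automatically on first access.
dd = defaultdict(int)
for x in L:
    dd[x] += 1

print(hist == dict(dd))
```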
