Pandas value counts of elements in a column of lists

Posted 2025-01-21 16:25:36

I have a column that contains lists of varying size but a limited number of items.

print(df['channels'].value_counts(), '\n')

Output:

[web, email, mobile, social]    77733
[web, email, mobile]            43730
[email, mobile, social]         32367
[web, email]                    13751

So I want the total number of times that web, email, mobile and social each occur.

These should be:

web =    77733 + 43730 + 13751            135,214
email =  77733 + 43730 + 13751 + 32367    167,581
mobile = 77733 + 43730 + 32367            153,830
social = 77733 + 32367                    110,100

I have tried the following two methods:

sum_channels_items = pd.Series([x for item in df['channels'] for x in item]).value_counts()
print(sum_channels_items)

from itertools import chain
test = pd.Series(list(chain.from_iterable(df['channels']))).value_counts()
print(test)

Both fail with the same error (just the second one shown).

Traceback (most recent call last):
  File "C:/Users/Mark/PycharmProjects/main/main.py", line 416, in <module>
    test = pd.Series(list(chain.from_iterable(df['channels']))).value_counts()
TypeError: 'float' object is not iterable

Comments (1)

木格 2025-01-28 16:25:37

One option is to explode, then count values:

out = df['channels'].explode().value_counts()
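
As a rough illustration of the explode approach, here is a minimal sketch on a made-up five-row frame. The data and the missing row are assumptions chosen to mimic the question, not the asker's real column. It relies on explode() leaving a missing entry as NaN and on value_counts() skipping NaN by default.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the 'channels' column, with one missing value.
df = pd.DataFrame({
    "channels": [
        ["web", "email", "mobile", "social"],
        ["web", "email", "mobile"],
        ["email", "mobile", "social"],
        ["web", "email"],
        np.nan,
    ]
})

# explode() gives every list element its own row; the NaN row stays NaN.
# value_counts() then ignores NaN because dropna=True is its default.
out = df["channels"].explode().value_counts()
print(out)
```

With this approach the missing rows do not need to be dropped explicitly, which is probably why it runs where the list comprehension did not.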

Another could be to use collections.Counter. Note that your error suggests you have missing values in the column, so you could drop them first:

from itertools import chain
from collections import Counter
out = pd.Series(Counter(chain.from_iterable(df['channels'].dropna())))
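
To make the cause of the error concrete, the sketch below uses a small invented Series (an assumption, not the asker's data) with one NaN in it. It first reproduces the TypeError, since NaN is a float and cannot be iterated, and then counts the items after dropna().

```python
from collections import Counter
from itertools import chain

import numpy as np
import pandas as pd

# Hypothetical sample: two lists and one missing value.
channels = pd.Series([
    ["web", "email", "mobile", "social"],
    ["web", "email"],
    np.nan,
])

# Without dropna(), chaining breaks as soon as it reaches the NaN entry.
try:
    list(chain.from_iterable(channels))
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Dropping the missing rows first lets Counter tally every list element.
counts = pd.Series(Counter(chain.from_iterable(channels.dropna())))
print(counts.sort_values(ascending=False))
```

Unlike value_counts(), a Series built from a Counter is not sorted by frequency, so the sort_values() call is only there to make the output comparable.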