How do I select the most frequent category/text over a rolling interval?
I have a data frame with a column that shows a certain category with a letter:
date gradient category
0 2022-04-15 10:00:00 0.135626 S
1 2022-04-15 11:00:00 0.017990 A
2 2022-04-15 12:00:00 0.026333 S-A
3 2022-04-15 13:00:00 0.028347 S-A
4 2022-04-15 14:00:00 0.147611 S
.. ... ... ...
411 2022-05-02 13:00:00 0.006906 D
412 2022-05-02 14:00:00 0.003823 D
413 2022-05-02 15:00:00 0.145872 S
414 2022-05-02 16:00:00 0.186694 S
415 2022-05-02 17:00:00 0.955833 NaN
The categories change too frequently, so I would like a function that puts the most frequent category over a rolling interval into a new column. For example:
(S, A, A, A, D, S-A, A)
would result in 'A', because that is the most frequent one.
Comments (1)
You can count the occurrences with a collections.Counter object and then take the most frequent one:
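A minimal sketch of this idea, not necessarily the exact code originally posted. It assumes a window measured in rows (7 here, matching the example in the question) rather than in time. The helper name `rolling_mode` and the output column `most_frequent` are made up for illustration; only the `category` column comes from the question. Missing values are skipped, and ties go to the value that appears first in the window.

```python
from collections import Counter

import pandas as pd


def rolling_mode(values, window):
    """Return the most frequent non-missing value in each trailing window.

    `rolling_mode` is a hypothetical helper, not a pandas function.
    """
    result = []
    for end in range(1, len(values) + 1):
        # Trailing window of at most `window` rows, with NaN/None dropped
        chunk = [v for v in values[max(0, end - window):end] if pd.notna(v)]
        # most_common(1) returns [(value, count)]; ties keep first-seen order
        result.append(Counter(chunk).most_common(1)[0][0] if chunk else None)
    return result


# Small stand-in for the data frame in the question
df = pd.DataFrame({"category": ["S", "A", "A", "A", "D", "S-A", "A", None]})
df["most_frequent"] = rolling_mode(df["category"].tolist(), window=7)
print(df)
```

pandas' own `rolling` is built for numeric data, which is why this sketch loops over plain Python lists instead. For a time-based interval you would have to select each window by the `date` column rather than by row position.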
output: