How to select the most frequent category/text over a rolling interval?

Posted on 2025-01-29 19:16:27

I have a data frame with a column that shows a certain category with a letter:

                    date  gradient category
0   2022-04-15 10:00:00  0.135626     S
1   2022-04-15 11:00:00  0.017990     A
2   2022-04-15 12:00:00  0.026333   S-A
3   2022-04-15 13:00:00  0.028347   S-A
4   2022-04-15 14:00:00  0.147611     S
..                  ...       ...   ...
411 2022-05-02 13:00:00  0.006906     D
412 2022-05-02 14:00:00  0.003823     D
413 2022-05-02 15:00:00  0.145872     S
414 2022-05-02 16:00:00  0.186694     S
415 2022-05-02 17:00:00  0.955833   NaN

The category changes too frequently, so I would like to write a function that puts the most frequent category over a rolling interval into a new column. For example:

(S, A, A, A, D, S-A, A)

should result in 'A', because that is the most frequent value.

青衫负雪 2025-02-05 19:16:27

You can count the occurrences with a Counter object and then take the most frequent one:

from collections import Counter

column = ("S", "A", "A", "A", "D", "S-A", "A")

counter = Counter(column)
# most_common() returns (element, count) pairs, sorted from most to least frequent
frequencies = counter.most_common()

# grab the first (most common) element and its frequency
most_frequent_element, frequency = frequencies[0]

print(most_frequent_element, frequency)

Output:

A 4
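
The snippet above counts a single window. To cover the rolling part of the question, one possible sketch is to slide a fixed-size window over the column and apply the same Counter idea at every step. The function name `rolling_most_common` and the `window` parameter are assumptions made for illustration, not part of the original answer. Missing values (None or NaN) are skipped, and ties go to the value that appears first in the window.

```python
from collections import Counter, deque


def rolling_most_common(values, window):
    """For each position, return the most frequent non-missing value among
    the last `window` items (fewer items at the start of the sequence)."""
    recent = deque(maxlen=window)  # the oldest item is dropped automatically
    result = []
    for value in values:
        recent.append(value)
        # Skip missing entries: None, or NaN (the only value not equal to itself).
        counts = Counter(v for v in recent if v is not None and v == v)
        # Ties are resolved in favour of the value seen first in the window.
        result.append(counts.most_common(1)[0][0] if counts else None)
    return result


column = ["S", "A", "A", "A", "D", "S-A", "A"]
print(rolling_most_common(column, window=7))
# ['S', 'S', 'A', 'A', 'A', 'A', 'A']
```

With the data frame from the question, the result could probably be assigned to a new column along the lines of df["smoothed"] = rolling_most_common(df["category"], window=24), where the column name "smoothed" and the window of 24 rows are only illustrative.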
