List the categorical columns that have more than a given number of unique values in pandas
I have a DataFrame with categorical, numeric and date columns. I want to make a list of all categorical columns that have more than 2 unique values. My df looks something like this:
date_time1               date_time2               cat_col1  cat_col_2  num_col1  num_col2  cat_col3
2020-10-08 19:09:21.884  2021-11-08 15:18:26.864  ABC       xyz        20        40        PQR
2020-10-08 19:09:21.884  2021-11-08 15:18:26.864  BCD       xyz        30        50        ABC
2020-10-08 19:09:21.884  2021-11-08 15:18:26.864  ABC       yza        40        30        MNO
2020-10-08 19:09:21.884  2021-11-08 15:18:26.864  CDE       xyz        10        80        CDE
2020-10-08 19:09:21.884  2021-11-08 15:18:26.864  BCD       xyz        20        70        MNO
I now want a list of only the categorical column names whose number of unique values is greater than 2. In this case it should be
mylist = ['cat_col1', 'cat_col3']
Can someone please help me with this?
Comments (1)
If you want to select the columns just by the name:
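One way this could look is sketched below. It assumes the categorical columns are exactly those whose name contains "cat_col", as in the question's sample data; the DataFrame is rebuilt by hand from that sample (date columns left out for brevity), and `counts` is just an illustrative variable name.

```python
import pandas as pd

# Sample data rebuilt from the question (date columns omitted for brevity)
df = pd.DataFrame({
    "cat_col1": ["ABC", "BCD", "ABC", "CDE", "BCD"],
    "cat_col_2": ["xyz", "xyz", "yza", "xyz", "xyz"],
    "num_col1": [20, 30, 40, 10, 20],
    "num_col2": [40, 50, 30, 80, 70],
    "cat_col3": ["PQR", "ABC", "MNO", "CDE", "MNO"],
})

# Assumption: a column is categorical if its name contains "cat_col"
counts = df.filter(like="cat_col").nunique()

# Keep only the column names with more than 2 distinct values
mylist = counts[counts > 2].index.tolist()
print(mylist)
```

`DataFrame.nunique()` returns one count per column, so the boolean mask `counts > 2` does the filtering without an explicit loop.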
Result for the sample data above: ['cat_col1', 'cat_col3']
If you want to select by type:
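A minimal sketch of the type-based variant, under the assumption that every column which is neither numeric nor datetime counts as categorical. It uses `select_dtypes` with `exclude` so that it does not depend on whether the text columns are stored as `object`, a string dtype, or `category`; the sample DataFrame is again rebuilt by hand from the question.

```python
import pandas as pd

# Sample data rebuilt from the question, with real datetime columns
df = pd.DataFrame({
    "date_time1": pd.to_datetime(["2020-10-08 19:09:21.884"] * 5),
    "date_time2": pd.to_datetime(["2021-11-08 15:18:26.864"] * 5),
    "cat_col1": ["ABC", "BCD", "ABC", "CDE", "BCD"],
    "cat_col_2": ["xyz", "xyz", "yza", "xyz", "xyz"],
    "num_col1": [20, 30, 40, 10, 20],
    "num_col2": [40, 50, 30, 80, 70],
    "cat_col3": ["PQR", "ABC", "MNO", "CDE", "MNO"],
})

# Assumption: anything that is not numeric or datetime is categorical
cat_df = df.select_dtypes(exclude=["number", "datetime"])

# Count distinct values per remaining column and keep those above 2
counts = cat_df.nunique()
mylist = counts[counts > 2].index.tolist()
print(mylist)
```

If the categorical columns are known to be stored with a specific dtype, `select_dtypes(include=...)` with that dtype would be the more direct choice.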