How do I set categorical data in a pandas DataFrame column by mapping it from the numeric data in another column?

Posted on 2025-02-02 16:40:50

I want to map the AQI column to the AQI_Bucket column in a pandas DataFrame. I tried using a for loop, but couldn't get it to work:

for aqi in df['AQI']:
    col1,col2 = df['AQI'],df['AQI_Bucket']
    _col1,_col2 = col1[0],col2[0]
    if df[aqi] == df['AQI_Bucket']:
        
        if pd.isnull(_col2):
            if _col1 in range(51):
                _col2 = "Good"
            elif _col1 in range(51, 101):
                _col2 = "Satisfactory"
            elif _col1 in range(101,201):
                _col2 = "Moderate"
            elif _col1 in range(201, 301):
                _col2 = "Poor"
            elif _col1 in range(301, 401):
                _col2 == "Very Poor"
            elif _col1 in range(401, 500):
                _col2 == "Severe"

握住我的手 2025-02-09 16:40:50

I'm pretty sure I know what you're trying to do here, and you should definitely use the pd.cut() function:

import numpy as np
import pandas as pd

# Build a DataFrame with two columns: one with random AQI values, one with fixed values
# (the fixed values, one per bin, make it easy to check that the bins are correct)
df = pd.DataFrame({'AQI_rand': np.random.randint(500, size=6),
                   'AQI_fixed': [25, 75, 150, 250, 350, 450]})

# Create the "AQI_Bucket" column from "AQI_rand"
df['AQI_Bucket'] = pd.cut(df['AQI_rand'], bins=[0, 50, 100, 200, 300, 400, 500], labels=['Good', 'Satisfactory', 'Moderate', 'Poor', 'Very Poor', 'Severe'])

# Create the "AQI_Bucket2" column from "AQI_fixed"
df['AQI_Bucket2'] = pd.cut(df['AQI_fixed'], bins=[0, 50, 100, 200, 300, 400, 500], labels=['Good', 'Satisfactory', 'Moderate', 'Poor', 'Very Poor', 'Severe'])

# Reorder the columns so each bucket sits next to its source column
df = df.iloc[:, [0, 2, 1, 3]]
df

index	AQI_rand	AQI_Bucket	AQI_fixed	AQI_Bucket2
0	326	Very Poor	25	Good
1	238	Poor	75	Satisfactory
2	182	Moderate	150	Moderate
3	341	Very Poor	250	Poor
4	88	Satisfactory	350	Very Poor
5	459	Severe	450	Severe
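The loop in the question only assigns a bucket where AQI_Bucket is null, so a variant that keeps existing labels may be closer to what was asked. The following is a minimal sketch under that assumption, not a definitive implementation. The sample data and the intermediate name `computed` are made up for illustration. Note that pd.cut uses right-closed intervals by default, so an AQI of exactly 0 falls outside the first bin unless include_lowest=True is passed, and values above 500 stay unlabelled.

```python
import pandas as pd

# Hypothetical sample data: some buckets are already filled in, some are missing.
df = pd.DataFrame({
    "AQI": [0, 42, 75, 150, 250, 350, 450],
    "AQI_Bucket": [None, "Good", None, None, "Poor", None, None],
})

bins = [0, 50, 100, 200, 300, 400, 500]
labels = ["Good", "Satisfactory", "Moderate", "Poor", "Very Poor", "Severe"]

# include_lowest=True closes the first interval on the left, so AQI == 0 is "Good".
computed = pd.cut(df["AQI"], bins=bins, labels=labels, include_lowest=True)

# Keep the labels that already exist and fill only the missing ones.
df["AQI_Bucket"] = df["AQI_Bucket"].fillna(computed.astype(object))
print(df)
```

Casting the categorical result to object before fillna is a deliberate choice here: it lets the computed labels be written into an ordinary string column without first converting that column to a categorical.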

If this isn't what you wanted, let me know and I'll delete my answer; otherwise, please consider marking it as accepted ✅
