How do I change categorical data in one pandas DataFrame column based on numeric data in another column (mapping)?

Posted on 2025-02-02 16:40:50 · 713 characters · 2 views · 0 comments

I want to map the AQI column to the AQI_Bucket column in a pandas DataFrame.
I tried doing it with a for loop but couldn't get it to work:

for aqi in df['AQI']:
    col1,col2 = df['AQI'],df['AQI_Bucket']
    _col1,_col2 = col1[0],col2[0]
    if df[aqi] == df['AQI_Bucket']:
        
        if pd.isnull(_col2):
            if _col1 in range(51):
                _col2 = "Good"
            elif _col1 in range(51, 101):
                _col2 = "Satisfactory"
            elif _col1 in range(101,201):
                _col2 = "Moderate"
            elif _col1 in range(201, 301):
                _col2 = "Poor"
            elif _col1 in range(301, 401):
                _col2 == "Very Poor"
            elif _col1 in range(401, 500):
                _col2 == "Severe"

Comments (1)

握住我的手 2025-02-09 16:40:50

I'm pretty sure I know what you're trying to do here, and you should definitely use the pd.cut() function:

import numpy as np
import pandas as pd

# Build a DataFrame with two columns: one with random AQI values, one with fixed values.
# The fixed column makes it easy to check that the bin edges are correct.
df = pd.DataFrame({
    'AQI_rand': np.random.randint(500, size=6),
    'AQI_fixed': [25, 75, 150, 250, 350, 450],
})

# Bin edges and labels shared by both columns
bins = [0, 50, 100, 200, 300, 400, 500]
labels = ['Good', 'Satisfactory', 'Moderate', 'Poor', 'Very Poor', 'Severe']

# Create "AQI_Bucket" from "AQI_rand".
# include_lowest=True puts an AQI of exactly 0 in "Good" instead of leaving it as NaN.
df['AQI_Bucket'] = pd.cut(df['AQI_rand'], bins=bins, labels=labels, include_lowest=True)

# Create "AQI_Bucket2" from "AQI_fixed"
df['AQI_Bucket2'] = pd.cut(df['AQI_fixed'], bins=bins, labels=labels, include_lowest=True)

# Reorder the columns so each bucket sits next to its source column
df = df.iloc[:, [0, 2, 1, 3]]
df
index  AQI_rand  AQI_Bucket    AQI_fixed  AQI_Bucket2
0      326       Very Poor     25         Good
1      238       Poor          75         Satisfactory
2      182       Moderate      150        Moderate
3      341       Very Poor     250        Poor
4      88        Satisfactory  350        Very Poor
5      459       Severe        450        Severe
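
The loop in the question only assigns a bucket when AQI_Bucket is null, so it may be that existing labels should be kept and only the gaps filled in. Below is a minimal sketch of that variant. The sample data is made up for illustration, the name `derived` is hypothetical, and the bin edges are the same ones used above. `include_lowest=True` is passed so that an AQI of exactly 0 lands in "Good", which matches `range(51)` in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: some buckets are already filled in, some are missing
df = pd.DataFrame({
    'AQI': [0, 42, 75, 150, 250, 350, 450, np.nan],
    'AQI_Bucket': ['Good', None, 'Satisfactory', None, None, 'Very Poor', None, None],
})

bins = [0, 50, 100, 200, 300, 400, 500]
labels = ['Good', 'Satisfactory', 'Moderate', 'Poor', 'Very Poor', 'Severe']

# Derive a bucket for every row from the numeric AQI value.
# Rows with a missing AQI get NaN, because there is nothing to bin.
derived = pd.cut(df['AQI'], bins=bins, labels=labels, include_lowest=True)

# Keep the existing labels and fill only the missing ones.
# astype(object) converts the categorical result to plain strings first.
df['AQI_Bucket'] = df['AQI_Bucket'].fillna(derived.astype(object))

print(df)
```

Rows where AQI itself is missing stay empty in AQI_Bucket. Whether those rows should be dropped or handled some other way depends on the data, so the sketch leaves them alone.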

If this isn't what you were looking for, please let me know and I'll delete my answer. Otherwise, please mark it as accepted.
