Create a function that accepts a DataFrame as input and returns pie charts for all appropriate categorical features

Posted on 2025-01-10 21:22:13 · 519 words · 0 views · 0 comments

I can create one pie chart by grouping the data on the 'Churn' column. However, I'm not sure how to write a function that accepts a DataFrame as input and returns pie charts for all the appropriate categorical features, with the percentage distribution shown on each chart.

The DataFrame comes from "Telco-Customer-Churn.csv".

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df_churn = pd.read_csv('Telco-Customer-Churn.csv')  # assumes the CSV is in the working directory
f, axes = plt.subplots(1, 2, figsize=(17, 7))
df_churn['Churn'].value_counts().plot.pie(autopct='%1.1f%%', ax=axes[0])
sns.countplot(x='Churn', data=df_churn, ax=axes[1])  # recent seaborn versions require the x= keyword
axes[0].set_title('Categorical Variable Pie Chart')
plt.show()
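
The percentages that `autopct` prints on the pie are the normalized value counts of the column, so it can help to inspect them as numbers first. A minimal sketch, assuming a hypothetical helper name `percentage_distribution` (not part of pandas or of the original post):

```python
import pandas as pd


def percentage_distribution(series, dropna=True):
    """Return the share of each category in percent (hypothetical helper)."""
    # normalize=True turns counts into fractions that sum to 1
    shares = series.value_counts(normalize=True, dropna=dropna)
    return (shares * 100).round(1)


# Toy data standing in for df_churn['Churn']
churn = pd.Series(["No", "No", "No", "Yes"])
print(percentage_distribution(churn))
```

With the real data this would be called as `percentage_distribution(df_churn['Churn'])`; the values should match the labels drawn by `autopct='%1.1f%%'`.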

Comments (2)

2025-01-17 21:22:13

I did something like this; I'm not sure whether I did it right:

#%% PlotMultiplePie
# Input: df_churn = pandas DataFrame, categorical_features = list of column names,
#        dropna = if True, NaN values are excluded from the counts
# Output: shows one px.pie() figure per qualifying feature
import plotly.express as px

def PlotMultiplePie(df_churn, categorical_features=None, dropna=False):
    # columns with more than `threshold` unique values are skipped,
    # because too many slices make a pie chart unreadable
    threshold = 40

    # if the user did not set categorical_features, use every object/category column
    if categorical_features is None:
        categorical_features = df_churn.select_dtypes(['object', 'category']).columns.to_list()
        print(categorical_features)

    # loop through the list of categorical_features
    for cat_feature in categorical_features:
        num_unique = df_churn[cat_feature].nunique(dropna=dropna)
        num_missing = df_churn[cat_feature].isna().sum()
        # show the pie chart and info if the number of unique values is within the threshold
        if num_unique <= threshold:
            print('Pie Chart for: ', cat_feature)
            print('Number of Unique Values: ', num_unique)
            print('Number of Missing Values: ', num_missing)
            # compute the counts once and pass them as plain arrays,
            # so the call does not depend on how value_counts() names its result
            counts = df_churn[cat_feature].value_counts(dropna=dropna)
            fig = px.pie(values=counts.values, names=counts.index.astype(str),
                         title=cat_feature, template='ggplot2')
            fig.show()
        else:
            print('Pie Chart for ', cat_feature, ' is unavailable due to the high number of unique values')
            print('Number of Unique Values: ', num_unique)
            print('Number of Missing Values: ', num_missing)
            print('\n')
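
The column selection and the unique-value threshold can also be separated from the plotting, which makes it easy to check which columns would get a pie chart before drawing anything. A rough sketch, assuming a hypothetical helper `select_pie_columns` and the same threshold of 40 (both are illustrative choices, not part of the answer above):

```python
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype


def select_pie_columns(df, max_unique=40, dropna=False):
    """Return names of categorical-looking columns with few unique values."""
    selected = []
    for col in df.columns:
        dtype = df[col].dtype
        # Treat category, object and string dtypes as categorical candidates
        is_categorical = (
            isinstance(dtype, pd.CategoricalDtype)
            or is_object_dtype(dtype)
            or is_string_dtype(dtype)
        )
        if is_categorical and df[col].nunique(dropna=dropna) <= max_unique:
            selected.append(col)
    return selected


# Toy frame: an ID-like column has too many unique values to be a useful pie
toy = pd.DataFrame({
    "customerID": ["a", "b", "c"],
    "Churn": ["Yes", "No", "No"],
    "tenure": [1, 2, 3],
})
print(select_pie_columns(toy, max_unique=2))
```

The returned list could then be passed as `categorical_features` to `PlotMultiplePie`.
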
离笑几人歌 2025-01-17 21:22:13

This worked for me. I defined a function that plots pie charts for all categorical variables in a DataFrame.

import matplotlib.pyplot as plt

# Function to plot pie charts for all categorical variables in the dataframe
def pie_charts_for_CategoricalVar(df_pie, m):
    '''Takes in a dataframe (df_pie) and plots pie charts for all categorical columns. m = number of columns required in the grid'''

    # collect the names of the columns whose dtype is 'category'
    # (object columns must be converted with .astype('category') first to be picked up)
    b = [i for i in df_pie if df_pie[i].dtype.name == 'category']

    plt.figure(figsize=(15, 12))
    plt.subplots_adjust(hspace=0.2)
    plt.suptitle("Pie-Charts for Categorical Variables in the dataframe", fontsize=18, y=0.95)

    # number of columns, as passed when calling the function
    ncols = m
    # number of rows: integer division, plus one extra row if there is a remainder
    nrows = len(b) // ncols + (len(b) % ncols > 0)

    # loop through 'b' and keep track of the index
    for n, i in enumerate(b):
        # add a new subplot iteratively using nrows and ncols
        ax = plt.subplot(nrows, ncols, n + 1)

        # count the rows per category of column 'i' and plot them on the new subplot axis
        # (df_pie, the function argument, replaces the global df used originally)
        df_pie.groupby(i, observed=False).size().plot(kind='pie', autopct='%.2f%%', ax=ax)

        ax.set_title(i.upper())
        ax.set_xlabel("")
        ax.set_ylabel("")
    plt.show()

# calling the function to plot pie charts for the categorical variables
pie_charts_for_CategoricalVar(df, 5)   # dataframe, no. of cols in the grid
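
Since the question asks for a function that returns the charts, one possible variation is to build the grid with `plt.subplots` and hand back the `Figure` instead of calling `plt.show()` inside the function. This is only a sketch: the name `pie_grid`, the default of 3 grid columns, and the figure size are assumptions for illustration.

```python
import math

import matplotlib.pyplot as plt
import pandas as pd


def pie_grid(df, columns, ncols=3):
    """Draw one pie per column on a grid and return the Figure (hypothetical helper)."""
    # Ceiling division gives enough rows for every pie; keep at least one row
    nrows = max(1, math.ceil(len(columns) / ncols))
    fig, axes = plt.subplots(nrows, ncols, figsize=(5 * ncols, 4 * nrows), squeeze=False)
    flat_axes = axes.ravel()
    for ax, col in zip(flat_axes, columns):
        df[col].value_counts().plot.pie(autopct="%1.1f%%", ax=ax)
        ax.set_title(col)
        ax.set_ylabel("")
    # Hide the grid cells that did not receive a pie
    for ax in flat_axes[len(columns):]:
        ax.set_visible(False)
    return fig


# Toy data; with the real data the columns would come from the DataFrame itself
toy = pd.DataFrame({"a": ["x", "y", "x"], "b": ["u", "u", "v"],
                    "c": ["p", "q", "r"], "d": ["m", "m", "m"]})
fig = pie_grid(toy, ["a", "b", "c", "d"], ncols=3)
```

Returning the figure leaves it to the caller to decide between `plt.show()` and `fig.savefig(...)`.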

