How to create a 100% stacked bar plot from a categorical dataframe
I have a dataframe structured like this:
User | Food 1 | Food 2 | Food 3 | Food 4 |
---|---|---|---|---|
Steph | Onions | Tomatoes | Cabbages | Potatoes |
Tom | Potatoes | Tomatoes | Potatoes | Potatoes |
Fred | Carrots | Cabbages | | Eggplant |
Phil | Onions | Eggplant | Eggplant | |
I want to use the distinct values from across the food columns as categories. I then want to create a Seaborn plot so the % of each category for each column is plotted as a 100% horizontal stacked bar.
My attempt to do this:
import pandas as pd
import matplotlib.pyplot as plt

data = {
'User' : ['Steph', 'Tom', 'Fred', 'Phil'],
'Food 1' : ["Onions", "Potatoes", "Carrots", "Onions"],
'Food 2' : ['Tomatoes', 'Tomatoes', 'Cabbages', 'Eggplant'],
'Food 3' : ["Cabbages", "Potatoes", "", "Eggplant"],
'Food 4' : ['Potatoes', 'Potatoes', 'Eggplant', ''],
}
df = pd.DataFrame(data)
x_ax = ["Onions", "Potatoes", "Carrots", "Onions", "", 'Eggplant', "Cabbages"]
df.plot(kind="barh", x=x_ax, y=["Food 1", "Food 2", "Food 3", "Food 4"], stacked=True, ax=axes[1])
plt.show()
- Replace `''` with `np.nan`, because empty strings will be counted as values.
- Use `pandas.DataFrame.melt` to convert the dataframe to a long form.
- Use `pandas.crosstab` with the `normalize` parameter to calculate the percent for each `'Food'`.
- Plot the dataframe with `pandas.DataFrame.plot` and `kind='barh'`.
  - Putting the food names on the x-axis is not the correct way to create a 100% stacked bar plot. One axis must be numeric. The bars will be colored by food type.
- `seaborn` is a high-level API for `matplotlib`, and `pandas` uses `matplotlib` as the default backend, so it's easier to produce a stacked bar plot with `pandas`.
  - `seaborn` doesn't support stacked barplots, unless `histplot` is used in a hacked way, as shown in this answer, and would require an extra step of melting `percent`.
- Tested in `python 3.10`, `pandas 1.4.2`, `matplotlib 3.5.1`.
  - Assignment expressions (`:=`) require `python >= 3.8`. Otherwise, use `[f'{v.get_width():.2f}%' if v.get_width() > 0 else '' for v in c ]`.
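The steps in the list above can be put together roughly as follows. This is a minimal sketch rather than the answer's original code: the variable names `dfm` and `percent` follow the names used in this answer, while the figure size, legend placement, `Agg` backend, and output file name `food_percent.png` are assumptions made for illustration. `Axes.bar_label` needs `matplotlib >= 3.4`.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

data = {
    'User': ['Steph', 'Tom', 'Fred', 'Phil'],
    'Food 1': ["Onions", "Potatoes", "Carrots", "Onions"],
    'Food 2': ['Tomatoes', 'Tomatoes', 'Cabbages', 'Eggplant'],
    'Food 3': ["Cabbages", "Potatoes", "", "Eggplant"],
    'Food 4': ['Potatoes', 'Potatoes', 'Eggplant', ''],
}
df = pd.DataFrame(data)

# Empty strings would be counted as a category, so turn them into missing values
df = df.replace('', np.nan)

# Long form: one row per (User, Food column, food type)
dfm = df.melt(id_vars='User', var_name='Food', value_name='Type')

# Share of each food type within each Food column, scaled to percent;
# crosstab ignores the missing values, so each row is normalized over real entries only
percent = pd.crosstab(dfm['Food'], dfm['Type'], normalize='index').mul(100).round(2)

# One horizontal bar per Food column, stacked and colored by food type
ax = percent.plot(kind='barh', stacked=True, figsize=(8, 5))

# Label each non-empty segment with its percentage (walrus operator needs python >= 3.8)
for c in ax.containers:
    labels = [f'{w:.2f}%' if (w := v.get_width()) > 0 else '' for v in c]
    ax.bar_label(c, labels=labels, label_type='center')

ax.set_xlabel('Percent')
ax.legend(title='Type', bbox_to_anchor=(1, 1), loc='upper left')
plt.tight_layout()
plt.savefig('food_percent.png')  # hypothetical file name; use plt.show() in an interactive session
```

Because `normalize='index'` divides every row by its own total, each bar spans the full 0 to 100 range even though `Food 3` and `Food 4` have only three non-missing entries.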
DataFrame Views
dfm
percent
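The two frames named above can be inspected with a short sketch like the one below. It rebuilds `dfm` and `percent` from the question's sample data under the same assumptions as the steps in this answer (empty strings treated as missing, percentages rounded to two decimals) and prints them; the rounding is an illustrative choice.

```python
import numpy as np
import pandas as pd

data = {
    'User': ['Steph', 'Tom', 'Fred', 'Phil'],
    'Food 1': ["Onions", "Potatoes", "Carrots", "Onions"],
    'Food 2': ['Tomatoes', 'Tomatoes', 'Cabbages', 'Eggplant'],
    'Food 3': ["Cabbages", "Potatoes", "", "Eggplant"],
    'Food 4': ['Potatoes', 'Potatoes', 'Eggplant', ''],
}
df = pd.DataFrame(data).replace('', np.nan)

# dfm: long form with one row per user and food column (4 users x 4 columns = 16 rows)
dfm = df.melt(id_vars='User', var_name='Food', value_name='Type')
print(dfm)

# percent: rows are the Food columns, columns are the distinct food types
percent = pd.crosstab(dfm['Food'], dfm['Type'], normalize='index').mul(100).round(2)
print(percent)
```

`dfm` keeps the two missing entries as `NaN` rows, while `percent` has one column per distinct food type and no column for the missing values.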