Count column values as a bar plot in a Python DataFrame

Posted on 2025-02-06 05:40:18


I have time series data and want to see the total number of septic (1) and non-septic (0) patients based on the SepsisLabel column. Non-septic patients have no '1' entries. Septic patients start with zeros (0), and the label then changes to '1', meaning the patient has become septic. The data looks like this:

HR	SBP	DBP	SepsisLabel	Gender	P_ID
92	120	80	0	0	0
98	115	85	0	0	0
93	125	75	0	1	1
95	130	90	0	1	1
93	125	75	1	1	1
95	130	90	1	1	1
93	125	75	1	1	1
95	130	90	1	1	1
102	120	80	0	0	2
109	115	75	0	0	2
94	135	100	0	0	2
97	100	70	0	0	3
85	120	80	0	0	3
88	115	75	1	0	3
93	125	85	1	0	3
78	130	90	1	1	4
115	140	110	1	1	4

Here, there are 3 septic patients (P_ID = 1, 3, 4) and 2 non-septic patients (P_ID = 0, 2). I want to plot these counts as a bar plot, so I did it manually with the following code:

import matplotlib.pyplot as plt
fig = plt.figure(figsize=(7, 6))
ax = fig.add_axes([0,0,1,1])
sepsis = ['Non-Septic patients', 'Septic patients']
count = [2, 3]
ax.bar(sepsis, count)

ax.set_title("Septic and Non-septic patient count in the dataset", y = 1, fontsize = 15)
ax.set_xlabel('Patients', fontsize = 12)
ax.set_ylabel('Count', fontsize = 12)

for bars in ax.containers:
    ax.bar_label(bars)
ax.margins(y=0.1)  
plt.show()

However, I don't want to calculate the septic and non-septic patient counts manually, because my real data is very large; this is just dummy data. I know I must use the P_ID column, but I'm not sure how.
The second thing I want to plot is how many of these septic and non-septic patients are male (1) and female (0), based on the Gender column. I want something like this graph:

****Update****

Using drop_duplicates keeps only the first row per patient by default. This causes a problem for septic patients, whose SepsisLabel starts at 0 and later changes to 1: the code takes only their first row, even though the patient is septic. As a result, the total number of septic patients drops and the number of non-septic patients increases, which it shouldn't. Is it possible to keep only the rows of septic patients where 0 changes to 1? Then every septic patient would have 1 in SepsisLabel in their first row instead of 0, which would give the correct number of septic patients.
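The behaviour described in the update can be reproduced on the dummy data. The sketch below is only an illustration: it rebuilds the SepsisLabel and P_ID columns by hand, and the names `first`, `last`, `first_counts` and `last_counts` are made up for this example. It compares `keep="first"` (the default) with `keep="last"`. Using `keep="last"` gives the expected counts only under the assumption that rows are in time order within each patient and that a label never returns to 0 after becoming 1.

```python
import pandas as pd

# Dummy data from the question, reduced to the two columns that matter here.
df = pd.DataFrame({
    "SepsisLabel": [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1],
    "P_ID":        [0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4],
})

# Default behaviour: the first row of each patient is kept, so patients
# whose label starts at 0 are counted as non-septic.
first = df.drop_duplicates(subset="P_ID", keep="first")
first_counts = first["SepsisLabel"].value_counts().to_dict()

# Keeping the last row instead picks up the final label of each patient
# (assumes time-ordered rows and labels that never go back from 1 to 0).
last = df.drop_duplicates(subset="P_ID", keep="last")
last_counts = last["SepsisLabel"].value_counts().to_dict()

print("keep='first':", first_counts)
print("keep='last': ", last_counts)
```

With `keep="first"` only patient 4 is counted as septic, while `keep="last"` recovers the 3 septic and 2 non-septic patients listed in the question.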


揽月 2025-02-13 05:40:18


For 1), use np.where. For 2), you can use seaborn:

import numpy as np
import seaborn as sns

# One row per patient: max() is 1 if the patient was ever septic
dedup = df.groupby('P_ID')[['SepsisLabel', 'Gender']].max().reset_index()

dedup['SepticType'] = np.where(dedup.SepsisLabel, 'Septic', 'NonSeptic')
sns.countplot(data=dedup, x='SepticType', hue='Gender')

Output: [figure not preserved: count plot of SepticType with bars split by Gender]
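For readers who want to run the whole thing end to end, here is a minimal sketch that uses only pandas and matplotlib, not the answer's seaborn call. It rebuilds the dummy data from the question, collapses it to one row per patient, and draws both requested bar plots. The column names `Status` and `Sex` and the variable names `patients`, `status_counts` and `by_gender` are assumptions introduced for this example. Taking the maximum of Gender assumes each patient has a single gender value across all rows.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Dummy data from the question (HR, SBP and DBP are not needed for the counts).
df = pd.DataFrame({
    "SepsisLabel": [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1],
    "Gender":      [0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1],
    "P_ID":        [0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4],
})

# One row per patient: max() of SepsisLabel is 1 if the patient was ever septic.
patients = df.groupby("P_ID")[["SepsisLabel", "Gender"]].max()
patients["Status"] = patients["SepsisLabel"].map({0: "Non-septic", 1: "Septic"})
patients["Sex"] = patients["Gender"].map({0: "Female", 1: "Male"})

# Question 1: number of patients per status, in a fixed order.
status_counts = (
    patients["Status"].value_counts().reindex(["Non-septic", "Septic"], fill_value=0)
)

# Question 2: the same counts split by gender.
by_gender = pd.crosstab(patients["Status"], patients["Sex"])

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(11, 5))
status_counts.plot.bar(ax=ax1, rot=0)
by_gender.plot.bar(ax=ax2, rot=0)

for ax in (ax1, ax2):
    for bars in ax.containers:
        ax.bar_label(bars)  # write the count above each bar
    ax.set_xlabel("Patients")
    ax.set_ylabel("Count")
    ax.margins(y=0.1)

ax1.set_title("Septic and non-septic patient count")
ax2.set_title("Patient count by gender")
fig.tight_layout()
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the figure. On the real data, only the `pd.DataFrame(...)` construction would need to be swapped for however the dataset is loaded.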


About the author: 红墙和绿瓦
