How to index by value
I am trying to compare the rate of high blood pressure across the different diabetes diagnoses with that of healthy individuals.
The output I am getting is this:
0 0.371132
8 0.752674
64 0.629022
The output I need is this:
Diabetes_012 average HBP occurrence
0 0.371132
2 0.752674
1 0.629022
Here the index is the diabetes type and the value is the average occurrence of high blood pressure.
Here is the full code:
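import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

pd.set_option('display.max_columns', None)  # show every column when displaying df

df = pd.read_csv('diabetes_012_health_indicators_BRFSS2015.csv')
df2 = df.copy()  # copy() must be called; a bare df.copy only references the method
df  # displays the frame when run in a notebook

# Mean HighBP per diabetes class, broadcast back to every row, then de-duplicated
grouped = df.groupby(['Diabetes_012'])['HighBP'].transform('mean').drop_duplicates()
print(grouped)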
Here is the link to the dataset: https://www.kaggle.com/datasets/alexteboul/diabetes-health-indicators-dataset
Comments (1)
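Don't use .transform. It returns one value per original row, so after drop_duplicates() the index still holds the row labels (0, 8, 64) rather than the diabetes types. Just select the column (or columns) you want to average and call .mean() on the grouped selection; the result is indexed by the group key. The same approach works with multiple columns.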
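The advice above can be sketched roughly as follows. This is only an illustration: the small DataFrame is made-up toy data (not taken from the Kaggle dataset), and the HighChol column and the variable names are assumptions chosen for the example.

```python
import pandas as pd

# Toy data invented for illustration; only the column names mirror the question.
df = pd.DataFrame({
    "Diabetes_012": [0.0, 0.0, 2.0, 2.0, 1.0, 1.0, 0.0, 2.0],
    "HighBP":       [1, 0, 1, 1, 1, 0, 0, 0],
    "HighChol":     [0, 1, 1, 1, 0, 1, 0, 1],
})

# transform() returns one value per original row, so the original row
# labels survive drop_duplicates() and end up as the index.
via_transform = (
    df.groupby("Diabetes_012")["HighBP"].transform("mean").drop_duplicates()
)

# mean() aggregates to one row per group, indexed by the group key.
hbp_by_type = df.groupby("Diabetes_012")["HighBP"].mean()

# Selecting a list of columns averages several columns at once.
multi = df.groupby("Diabetes_012")[["HighBP", "HighChol"]].mean()

# sort=False keeps the groups in order of first appearance instead of
# sorting the keys.
in_order = df.groupby("Diabetes_012", sort=False)["HighBP"].mean()

print(via_transform)
print(hbp_by_type)
print(multi)
print(in_order)
```

With the real file, the same groupby(...)['HighBP'].mean() call on the frame loaded by read_csv should give a result indexed by Diabetes_012. Passing sort=False is one way to get the 0, 2, 1 ordering shown in the question, assuming that is the order in which those values first appear in the data.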