Filling NaNs in a DataFrame using groupby and a rolling mean
I have a DataFrame that looks like this:
import numpy as np
import pandas as pd

d = {'date': ['1999-01-01', '1999-01-02', '1999-01-03', '1999-01-04', '1999-01-05', '1999-01-06'], 'ID': [1,1,1,1,1,1], 'Value':[1,2,3,np.nan,5,6]}
df = pd.DataFrame(data = d)
         date  ID  Value
0  1999-01-01   1      1
1  1999-01-02   1      2
2  1999-01-03   1      3
3  1999-01-04   1    NaN
4  1999-01-05   1      5
5  1999-01-06   1      6
I would like to fill in the NaNs using a rolling mean (e.g. with a window of 2) and extend that to a DataFrame with multiple IDs and dates. I tried something like the following, but it takes a very long time and fails with the error "cannot join with no overlapping index names":
df.groupby(['date','ID']).fillna(df.rolling(2, min_periods=1).mean().shift())
or
df.groupby(['date','ID']).fillna(df.groupby(['date','ID']).rolling(2, min_periods=1).mean().shift())
Comments (1)
IIUC, here is one way to do it.
If you add the expected output, that will help validate this solution.
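
A minimal sketch of one possible approach, not necessarily the code this answer originally contained. It assumes each NaN should be replaced by the mean of the previous two observations for the same ID, so it groups by ID only. Grouping by both date and ID would put every row in its own group when the (date, ID) pairs are unique, which leaves nothing to roll over. The second ID, its values, and the helper name fill_with_prior_rolling_mean are made up for illustration.

```python
import numpy as np
import pandas as pd

dates = ['1999-01-01', '1999-01-02', '1999-01-03',
         '1999-01-04', '1999-01-05', '1999-01-06']

# Sample data: ID 1 is taken from the question, ID 2 is invented for illustration.
df = pd.DataFrame({
    'date': pd.to_datetime(dates * 2),
    'ID': [1] * 6 + [2] * 6,
    'Value': [1, 2, 3, np.nan, 5, 6, 10, np.nan, 30, 40, np.nan, 60],
})


def fill_with_prior_rolling_mean(frame, window=2):
    """Fill NaNs in 'Value' with the rolling mean of the preceding rows, per ID."""
    # Rolling windows are positional, so rows must be in date order within each ID.
    ordered = frame.sort_values(['ID', 'date'])
    # transform returns a Series aligned with the original index;
    # shift() moves each window mean down one row so a NaN row sees only earlier values.
    prior_mean = ordered.groupby('ID')['Value'].transform(
        lambda s: s.rolling(window, min_periods=1).mean().shift()
    )
    result = ordered.copy()
    result['Value'] = result['Value'].fillna(prior_mean)
    return result


filled = fill_with_prior_rolling_mean(df)
print(filled)
```

With the question's data, row 3 is filled with 2.5, the mean of 2 and 3. The fill uses the original values only, not previously filled ones, so a run of NaNs longer than the window can still leave NaNs behind; whether that matches the intended result depends on the expected output.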