Fill in missing times with averages using Python
I have this dataframe with some times missing (I want a row for every minute). Please see the sample below:
import pandas as pd

# Sample timestamps; seconds 2, 5, 6 and 8 are missing
time = pd.to_datetime([
    "2022-01-01 00:00:00", "2022-01-01 00:00:01", "2022-01-01 00:00:03",
    "2022-01-01 00:00:04", "2022-01-01 00:00:07", "2022-01-01 00:00:09",
    "2022-01-01 00:00:10",
])
lat = [58.1, 58.4, 58.5, 58.9, 59, 59.2, 59.5]
lng = [1.34, 1.44, 1.46, 1.48, 1.55, 1.57, 1.59]
df = pd.DataFrame({"time": time, "lat": lat, "lng": lng})
time lat lng
2022-01-01 00:00:00 58.1 1.34
2022-01-01 00:00:01 58.4 1.44
2022-01-01 00:00:03 58.5 1.46
2022-01-01 00:00:04 58.9 1.48
2022-01-01 00:00:07 59.0 1.55
2022-01-01 00:00:09 59.2 1.57
2022-01-01 00:00:10 59.5 1.59
I want to fill in the gaps in time so there is data for every minute and the lat/lng is filled in with an average of the values in between. My plan was to create an array of times for each minute and try using ffill or something similar to fill in the missing points, but I cannot figure out how. The expected output would be this:
time lat lng
2022-01-01 00:00:00 58.1 1.34
2022-01-01 00:00:01 58.4 1.44
2022-01-01 00:00:02 58.45 1.45
2022-01-01 00:00:03 58.5 1.46
2022-01-01 00:00:04 58.9 1.48
2022-01-01 00:00:05 58.933 1.5033
2022-01-01 00:00:06 58.967 1.5267
2022-01-01 00:00:07 59.0 1.55
2022-01-01 00:00:08 59.1 1.56
2022-01-01 00:00:09 59.2 1.57
2022-01-01 00:00:10 59.5 1.59
Please give me some advice on how to do this.
Comments (1)
Create a DatetimeIndex, then add the missing times with DataFrame.asfreq and interpolate with DataFrame.interpolate.
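The answer's own code did not survive on this page, so what follows is only a minimal sketch of the three steps it names, not the answerer's exact code. It rebuilds the sample data from the question and assumes a frequency of "s" (one second), because the sample rows are spaced by seconds; for data that should have one row per minute, "min" would be the corresponding alias. The variable name `filled` is chosen here for illustration.

```python
import pandas as pd

# Sample data from the question (seconds 2, 5, 6 and 8 are missing)
time = pd.to_datetime([
    "2022-01-01 00:00:00", "2022-01-01 00:00:01", "2022-01-01 00:00:03",
    "2022-01-01 00:00:04", "2022-01-01 00:00:07", "2022-01-01 00:00:09",
    "2022-01-01 00:00:10",
])
df = pd.DataFrame({
    "time": time,
    "lat": [58.1, 58.4, 58.5, 58.9, 59, 59.2, 59.5],
    "lng": [1.34, 1.44, 1.46, 1.48, 1.55, 1.57, 1.59],
})

filled = (
    df.set_index("time")  # asfreq works on a DatetimeIndex
      .asfreq("s")        # assumed frequency: one row per second, new rows are NaN
      .interpolate()      # default linear interpolation between known neighbours
      .reset_index()      # turn "time" back into a regular column
)
print(filled)
```

Because the index is evenly spaced after `asfreq`, the default linear interpolation places each missing value proportionally between its neighbours, which matches the "average of the values in between" the question asks for. A plain forward fill (`ffill`) would instead repeat the last known value.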