Fill in missing times with average values in Python

Posted on 2025-01-17 18:49:43

I have this DataFrame with some timestamps missing (I want one row for every second). Please see the sample below:

import pandas as pd

# Sample timestamps at one-second resolution, with some seconds missing
time = pd.to_datetime([
    "2022-01-01 00:00:00", "2022-01-01 00:00:01", "2022-01-01 00:00:03",
    "2022-01-01 00:00:04", "2022-01-01 00:00:07", "2022-01-01 00:00:09",
    "2022-01-01 00:00:10",
])
lat = [58.1, 58.4, 58.5, 58.9, 59, 59.2, 59.5]
lng = [1.34, 1.44, 1.46, 1.48, 1.55, 1.57, 1.59]

df = pd.DataFrame({"time": time, "lat": lat, "lng": lng})


time                lat     lng
2022-01-01 00:00:00 58.1    1.34
2022-01-01 00:00:01 58.4    1.44
2022-01-01 00:00:03 58.5    1.46
2022-01-01 00:00:04 58.9    1.48
2022-01-01 00:00:07 59.0    1.55
2022-01-01 00:00:09 59.2    1.57
2022-01-01 00:00:10 59.5    1.59

I want to fill the gaps in time so that there is a row for every second, with lat/lng filled in with an average of the values on either side of the gap. My plan was to create an array of timestamps, one per second, and use ffill or something similar to fill in the missing points, but I cannot figure out how. The expected output would be this:

time                lat     lng
2022-01-01 00:00:00 58.1    1.34
2022-01-01 00:00:01 58.4    1.44
2022-01-01 00:00:02 58.45   1.45
2022-01-01 00:00:03 58.5    1.46
2022-01-01 00:00:04 58.9    1.48
2022-01-01 00:00:05 58.933  1.5033
2022-01-01 00:00:06 58.967  1.5267
2022-01-01 00:00:07 59.0    1.55
2022-01-01 00:00:08 59.1    1.56
2022-01-01 00:00:09 59.2    1.57
2022-01-01 00:00:10 59.5    1.59

Please give me some advice on how to do this.

Comments (1)

千と千尋 2025-01-24 18:49:43

Create a DatetimeIndex, then add the missing times with DataFrame.asfreq and interpolate with DataFrame.interpolate:

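# Index by time, expand to a one-second frequency (missing seconds become NaN rows),
# then fill the NaNs by linear interpolation.
# 's' means one second; the uppercase 'S' alias is deprecated in pandas 2.2+.
df = df.set_index('time').asfreq(freq='s').interpolate()
print(df)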
                           lat       lng
time                                    
2022-01-01 00:00:00  58.100000  1.340000
2022-01-01 00:00:01  58.400000  1.440000
2022-01-01 00:00:02  58.450000  1.450000
2022-01-01 00:00:03  58.500000  1.460000
2022-01-01 00:00:04  58.900000  1.480000
2022-01-01 00:00:05  58.933333  1.503333
2022-01-01 00:00:06  58.966667  1.526667
2022-01-01 00:00:07  59.000000  1.550000
2022-01-01 00:00:08  59.100000  1.560000
2022-01-01 00:00:09  59.200000  1.570000
2022-01-01 00:00:10  59.500000  1.590000
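
The one-liner above combines three steps. As a rough sketch, the same result can be built step by step, which also matches the plan in the question of creating the full array of timestamps first. The variable names (`indexed`, `full_index`, `with_gaps`, `filled`, `result`) are made up for illustration, and `method="time"` is my own choice rather than part of the answer: on an evenly spaced one-second grid it should give the same values as the default linear interpolation.

```python
import pandas as pd

# Same sample data as in the question
time = pd.to_datetime([
    "2022-01-01 00:00:00", "2022-01-01 00:00:01", "2022-01-01 00:00:03",
    "2022-01-01 00:00:04", "2022-01-01 00:00:07", "2022-01-01 00:00:09",
    "2022-01-01 00:00:10",
])
df = pd.DataFrame({
    "time": time,
    "lat": [58.1, 58.4, 58.5, 58.9, 59, 59.2, 59.5],
    "lng": [1.34, 1.44, 1.46, 1.48, 1.55, 1.57, 1.59],
})

indexed = df.set_index("time")

# Step 1: build the complete one-second grid from the first to the last timestamp
full_index = pd.date_range(
    indexed.index.min(), indexed.index.max(), freq="s", name="time"
)

# Step 2: reindex onto that grid; seconds absent from the data become NaN rows
with_gaps = indexed.reindex(full_index)

# Step 3: fill the NaN rows, weighting by the time distance to the neighbours
filled = with_gaps.interpolate(method="time")

# Optional: turn the index back into a regular "time" column
result = filled.reset_index()
print(result)
```

Note that `ffill` on its own would only copy the previous row forward into each gap rather than average between the neighbouring rows, which is why interpolation is used here.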