How can I select the time period with the fewest NaN values in pandas?
I have a dataset that stores hourly data for several years, with quite a lot of data missing. I would now like to implement a seasonal filling method, for which I need the best data I have for two consecutive years (2*8760 entries). That is, the two consecutive years with the least missing data (the fewest NaN values). I then need the start time and end time of this period in datetime format. My data is stored in a DataFrame whose index is the hourly datetime. How can I achieve this?
EDIT:
To make it a bit clearer: I need to select all entries (values and NaN values) from a time period of two years (or 2*8760 rows) in which the fewest NaN values occur.
Comments (1)
You can remove all the NaN values from your data by using
df = df.dropna()
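Note that dropna() drops the rows that contain NaN, so on its own it does not tell you which two-year window has the fewest gaps. One possible way to locate that window is a rolling count of NaN values. The following is only a minimal sketch, not a tested solution for your data: the function name best_window and the column name "v" are made up for illustration, and it assumes the index is already a complete hourly grid (one row per hour, no missing rows) so that 2*8760 rows really span two non-leap years.

```python
import numpy as np
import pandas as pd


def best_window(df: pd.DataFrame, window: int = 2 * 8760):
    """Return (start, end, nan_count) for the `window`-row span with the fewest NaNs.

    Hypothetical helper: assumes one row per hour with no gaps in the index.
    """
    if len(df) < window:
        raise ValueError("DataFrame has fewer rows than the requested window")

    # Number of NaN values in each row, summed over all columns.
    nan_per_row = df.isna().sum(axis=1)

    # Rolling sum: the value at position i counts NaNs in rows i-window+1 .. i.
    # The first window-1 entries are NaN because their windows are incomplete.
    rolling_nans = nan_per_row.rolling(window).sum().to_numpy()

    # Position of the last row of the best window (first one on ties).
    end_pos = int(np.nanargmin(rolling_nans))
    start_pos = end_pos - window + 1

    return df.index[start_pos], df.index[end_pos], int(round(rolling_nans[end_pos]))


# Toy data with a window of 4 rows instead of 2*8760.
idx = pd.date_range("2020-01-01", periods=10, freq=pd.Timedelta(hours=1))
demo = pd.DataFrame(
    {"v": [np.nan, 1, np.nan, 3, 4, 5, 6, np.nan, 8, np.nan]}, index=idx
)
start, end, n_missing = best_window(demo, window=4)

# All entries (values and NaNs) of the selected period; .loc slicing on a
# sorted DatetimeIndex includes both endpoints.
best = demo.loc[start:end]
print(start, end, n_missing)
```

With the real data you would call best_window(df) with the default window and then use df.loc[start:end] as the reference period for the seasonal filling. If the hourly index itself has missing timestamps, it would need to be reindexed to a full hourly range first, otherwise the row count no longer corresponds to two years.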