Dataset time windowing
I have a dataset in this form:
I want to split the dataset into windows, each containing the rows that fall within the same 2-minute interval, and then put the result into another dataset in this form:
Can anyone give me a hand to speed up my work?
Comments (2)
Here is a random dataframe, df:
Now I will use pd.Grouper with a '2Min' frequency and apply(list):
df_out:
If you want the 2nd column as a list, use .tolist(). To get each element, use df_out[i] (i = 0, 1, 2, etc.). If you want to convert it into a data frame, use pd.DataFrame(df_out).
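As a rough illustration of the steps above, here is a minimal sketch. The sample data and the column names "time" and "value" are assumptions (the question's real columns are not shown), and the frequency is written as "2min", the lowercase spelling of the same alias.

```python
import pandas as pd

# Hypothetical sample data: the column names "time" and "value" are assumptions.
df = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2021-01-01 04:07:10",
                "2021-01-01 04:07:50",
                "2021-01-01 04:08:30",
                "2021-01-01 04:09:05",
                "2021-01-01 04:11:40",
            ]
        ),
        "value": [1, 2, 3, 4, 5],
    }
).set_index("time")

# Group the rows into 2-minute bins and collect each bin's values into a list.
df_out = df.groupby(pd.Grouper(freq="2min"))["value"].apply(list)

windows = df_out.tolist()        # plain list of lists, one entry per bin
first_window = df_out.iloc[0]    # values that fell into the first bin
as_frame = pd.DataFrame(df_out)  # the same result as a one-column DataFrame
```

By default the bins are aligned to midnight, so with these timestamps they start at 04:06, 04:08 and 04:10 rather than at the first row's minute.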
Remember that if you are reading the data from a CSV or any other text file, you will have to convert your df index to a datetime index using:
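One possible way to do that conversion is sketched below. The CSV content is inlined through io.StringIO as a stand-in for a real file, and the column names "time" and "value" are assumptions, not the asker's actual columns.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
csv_text = """time,value
2021-01-01 04:07:10,1
2021-01-01 04:08:30,2
2021-01-01 04:11:40,3
"""

# Read the file with the (assumed) "time" column as the index.
df = pd.read_csv(io.StringIO(csv_text), index_col="time")

# The index is not a DatetimeIndex yet, so convert it explicitly.
df.index = pd.to_datetime(df.index)
```

pd.Grouper with a freq only works on a datetime-like index (or a datetime column passed as key), which is why this step comes before the grouping.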
Entire code for a test csv file:
If you don't know how to create a sample df, here is another example:
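A sample frame could be built along these lines. This is only a sketch: the start time, the 30-second spacing, the seed and the "value" column are all arbitrary assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Seeded generator so the sample is reproducible.
rng = np.random.default_rng(0)

# 20 timestamps spaced 30 seconds apart, starting at an arbitrary (assumed) time.
idx = pd.date_range(
    "2021-01-01 04:07:00", periods=20, freq=pd.Timedelta(seconds=30)
)

# One hypothetical numeric column indexed by those timestamps.
sample_df = pd.DataFrame({"value": rng.integers(0, 100, size=20)}, index=idx)
```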
N.B.: Please explain clearly what you want to do with your results so that I can answer accordingly.
It took me some time: I posted an answer but deleted it when I realised it gave a different result from the expected one. I also borrowed a piece from @Shuvashish's answer, the apply(list). Anyway, this should give you the expected result:
Shuvashish already showed pd.Grouper. I only exploded the results and set the origin to the first time value floored to the minute. By the way, your expected table shouldn't contain a 04:50:00 time, because we start binning every 2 minutes from an odd minute, 04:07:00.
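A minimal sketch of that idea follows, under assumptions: the sample timestamps and the "value" column are made up, and the origin argument of pd.Grouper needs pandas 1.1 or later.

```python
import pandas as pd

# Hypothetical sample data; the first timestamp falls in an odd minute (04:07).
df = pd.DataFrame(
    {"value": [1, 2, 3, 4, 5]},
    index=pd.to_datetime(
        [
            "2021-01-01 04:07:10",
            "2021-01-01 04:07:50",
            "2021-01-01 04:08:30",
            "2021-01-01 04:09:05",
            "2021-01-01 04:11:40",
        ]
    ),
)

# Start the 2-minute bins at the first timestamp floored to the minute.
origin = df.index.min().floor("min")

binned = (
    df.groupby(pd.Grouper(freq="2min", origin=origin))["value"]
    .apply(list)  # one list of values per 2-minute bin
    .explode()    # back to one row per value, labelled with its bin start
)
```

With this origin the bin labels are 04:07, 04:09 and 04:11, so every label lands on an odd minute and an even-minute label never appears.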