Is it possible to load only part of a pickle file, or to use a stream?
I have a large pickle file containing a pandas DataFrame. The data is minute-by-minute measurements of certain weather metrics.
Two operations need to run daily, and they take longer than they should. I believe that is because of how I am loading the data:
import pandas as pd

# load the pickle file
with open('my_file.pickle', 'rb') as handle:
    df = pd.read_pickle(handle)
So currently I load the whole file just to get the last record in the DataFrame, which gives me the timestamp of the most recent observation. Is there a way to load only part of the file?
Also, once I have the last timestamp, I want to append the new data. Do I need to load the whole file and append the data to the DataFrame, or is there a better alternative?
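To make the "stream" idea concrete, here is a minimal sketch of one possible layout, not a description of how my file is currently written. As far as I know, pickle cannot load part of a single pickled object, but a file can hold several pickled objects written one after another and read back one at a time. The helper names (`append_chunk`, `iter_chunks`, `last_chunk`), the file name, and the demo rows are all hypothetical, and plain lists of tuples stand in for DataFrame chunks so the example needs only the standard library.

```python
import os
import pickle
import tempfile


def append_chunk(path, chunk):
    """Append one pickled object to the end of the file without reading it."""
    with open(path, "ab") as handle:
        pickle.dump(chunk, handle, protocol=pickle.HIGHEST_PROTOCOL)


def iter_chunks(path):
    """Yield the pickled objects one at a time, in the order they were written."""
    with open(path, "rb") as handle:
        while True:
            try:
                yield pickle.load(handle)
            except EOFError:
                # No more pickled objects in the file.
                return


def last_chunk(path):
    """Return the most recently appended chunk (this still scans the whole file)."""
    last = None
    for chunk in iter_chunks(path):
        last = chunk
    return last


# Hypothetical demo data: each chunk is a list of (timestamp, temperature) rows.
demo_path = os.path.join(tempfile.mkdtemp(), "weather_chunks.pickle")
append_chunk(demo_path, [("2020-01-01 00:00", 4.1), ("2020-01-01 00:01", 4.0)])
append_chunk(demo_path, [("2020-01-01 00:02", 3.9)])

# The last row of the last chunk carries the most recent timestamp.
last_timestamp = last_chunk(demo_path)[-1][0]
```

With this layout, appending never loads the existing data, and reading holds only one chunk in memory at a time. Finding the last chunk still means reading through the file, so it would only address the append half of the question.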