Is it possible to load only part of a pickle file, or to use a stream?

Posted on 2025-01-18 09:31:31

I have a large pickle file that contains a pandas DataFrame. The data is minute-by-minute measurements of certain weather metrics.

Two operations need to run on this file every day, and both take longer than they should. I believe that is because of how I am loading the data.

import pandas as pd

# load the whole pickle file into memory
with open('my_file.pickle', 'rb') as handle:
    df = pd.read_pickle(handle)

So currently I load the whole file just to get the last record in the DataFrame, which gives me the timestamp of the last observation. Is there a way to load only part of the file?
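
To make the current approach concrete, here is a minimal sketch of reading the last timestamp. It assumes the DataFrame is indexed by a DatetimeIndex sorted in ascending order; the sample data, the `temperature` column, and the temporary path are hypothetical stand-ins, not the real file.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the real file: three rows of minute data.
index = pd.date_range("2025-01-17 23:57", periods=3, freq="min")
sample = pd.DataFrame({"temperature": [1.5, 1.4, 1.2]}, index=index)

# Write the sample to a temporary pickle so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "my_file.pickle")
sample.to_pickle(path)

# The whole frame is deserialized even though only one value is needed.
df = pd.read_pickle(path)
last_timestamp = df.index[-1]
print(last_timestamp)
```

The cost of `pd.read_pickle` here grows with the size of the whole file, even though only `df.index[-1]` is used afterwards.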

Also, once I have the last timestamp, I want to append the new data. Do I need to open the whole file and append the data to the DataFrame, or is there a better alternative?
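
The append step described above might look roughly like the sketch below. It is only an illustration of the load-everything-then-rewrite workflow: `append_new_rows` is a hypothetical helper name, the sample frames are made up, and a sorted DatetimeIndex is assumed.

```python
import os
import tempfile

import pandas as pd


def append_new_rows(path, new_rows):
    """Hypothetical helper: load the full pickle, append newer rows, rewrite it."""
    df = pd.read_pickle(path)              # reads the entire file
    last_timestamp = df.index[-1]          # assumes an ascending DatetimeIndex
    fresh = new_rows[new_rows.index > last_timestamp]  # keep only unseen rows
    combined = pd.concat([df, fresh])
    combined.to_pickle(path)               # rewrites the entire file
    return combined


# Made-up existing data: 23:57, 23:58, 23:59.
path = os.path.join(tempfile.mkdtemp(), "my_file.pickle")
existing = pd.DataFrame(
    {"temperature": [1.5, 1.4, 1.2]},
    index=pd.date_range("2025-01-17 23:57", periods=3, freq="min"),
)
existing.to_pickle(path)

# Made-up incoming data: 23:59 (already stored), 00:00, 00:01.
incoming = pd.DataFrame(
    {"temperature": [1.2, 1.1, 1.0]},
    index=pd.date_range("2025-01-17 23:59", periods=3, freq="min"),
)

result = append_new_rows(path, incoming)
print(len(result), result.index[-1])
```

Both the read and the write in this sketch touch the full file on every run, which is the overhead the question is about.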
