Pickle takes too much space

Posted on 2025-01-30 16:22:42

I have a list that contains a very large numpy array, a very small numpy array, and a few fields that are very small in size. I want to save this list to a file and load it later. If I use pickle as described in "How to save/read class wholly in Python", the file loads fast, but the saved file is far too large.

Question:
What is the best way of saving such a list from a space point of view if we do not compress the file?

An example of such a list, saved and loaded using pickle:

import pickle
import numpy as np

class MyClass:
    def __init__(self):
        # np.random.rand returns float64, i.e. 8 bytes per element
        # (10000 * 10000 * 3 * 8 bytes is about 2.4 GB for the large array)
        self.largearray = np.random.rand(10000, 10000, 3) * 255
        self.smallarray = np.random.rand(100, 100, 3) * 255
        self.attribute1 = True
        self.attribute2 = 'Some String'
        self.attribute3 = 888
        self.list = [self.largearray, self.smallarray, self.attribute1, self.attribute2, self.attribute3]

a = MyClass()

# Save the list to disk
with open('test.pickle', 'wb') as file:
    pickle.dump(a.list, file)

# Load the list back
with open('test.pickle', 'rb') as file2:
    a_loaded = pickle.load(file2)

Edit:
As a comment points out, the size comes from the numpy array itself rather than from pickle. I should convert the numpy array to some other data structure that takes less space and can be quickly converted back to a numpy array when loaded. What is the best structure for this?
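
To illustrate the point from the comment, here is a minimal sketch comparing the in-memory size of an array with its pickled size, and showing how a smaller dtype changes both. The 200x200x3 shape is a scaled-down stand-in chosen so the snippet runs quickly. Whether `uint8` or `float32` is acceptable is an assumption about the data (pixel-like values in 0..255 where dropping the fractional part or some precision is fine); it is not something the question states.

```python
import pickle
import numpy as np

# Scaled-down stand-in for the large array (hypothetical shape, for speed).
rng = np.random.default_rng(0)
large = rng.random((200, 200, 3)) * 255      # float64, values in [0, 255)

raw_bytes = large.nbytes                      # 8 bytes per element
pickled_bytes = len(pickle.dumps(large, protocol=pickle.HIGHEST_PROTOCOL))

# Assumption: whole numbers in 0..255 are precise enough (fractions are truncated).
as_uint8 = large.astype(np.uint8)             # 1 byte per element
# Assumption: single precision is enough.
as_float32 = large.astype(np.float32)         # 4 bytes per element

print("float64 in memory:", raw_bytes)
print("float64 pickled:  ", pickled_bytes)
print("float32 in memory:", as_float32.nbytes)
print("uint8 in memory:  ", as_uint8.nbytes)

# Round trip: pickle the list with the smaller array, then convert back on load.
payload = pickle.dumps([as_uint8, True, 'Some String', 888],
                       protocol=pickle.HIGHEST_PROTOCOL)
loaded = pickle.loads(payload)
restored = loaded[0].astype(np.float64)       # back to float64, fractions are lost
```

The pickled size tracks `nbytes` closely, so without compression the file can only shrink if the array holds fewer bytes per element. The conversion to `uint8` is lossy, so it only fits data where the fractional part does not matter.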
