How do I first read binary pickle data, then unpickle it?
I'm unpickling a NetworkX object that's about 1GB in size on disk. Although I saved it in the binary format (using protocol 2), unpickling this file takes a very long time, at least half an hour. The system I'm running on has plenty of memory (128 GB), so that's not the bottleneck.
I've read here that unpickling can be sped up by first reading the entire file into memory and then unpickling it (that particular thread refers to Python 3.0, which I'm not using, but the point should still hold in Python 2.6).
How do I first read the binary file, and then unpickle it? I have tried:
import cPickle as pickle
f = open("big_networkx_graph.pickle","rb")
bin_data = f.read()
graph_data = pickle.load(bin_data)
But this returns:
TypeError: argument must have 'read' and 'readline' attributes
Any ideas?
Comments (2)
pickle.load(file) expects a file-like object. Instead, use: pickle.loads(string)
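As a rough illustration of that suggestion, the sketch below reads the whole file into memory and passes the resulting bytes to pickle.loads. The helper name load_pickle_from_memory and the small temporary sample file are assumptions made for the example, standing in for the real big_networkx_graph.pickle. The import fallback is only there so the same code runs on Python 2 and Python 3.

```python
import os
import tempfile

try:
    import cPickle as pickle  # Python 2, as used in the question
except ImportError:
    import pickle  # Python 3 picks the C implementation automatically


def load_pickle_from_memory(path):
    """Read the entire file into memory, then unpickle from the bytes."""
    with open(path, "rb") as f:
        bin_data = f.read()
    # loads() takes the pickled data itself, not a file object
    return pickle.loads(bin_data)


# Hypothetical stand-in for the large graph file from the question
sample = {"nodes": [1, 2, 3], "edges": [(1, 2), (2, 3)]}
fd, sample_path = tempfile.mkstemp(suffix=".pickle")
os.close(fd)
with open(sample_path, "wb") as f:
    pickle.dump(sample, f, protocol=2)

restored = load_pickle_from_memory(sample_path)
os.remove(sample_path)
```

Whether this is actually faster than pickle.load on an open file depends on the Python version and the data, so it is worth timing both on the real file.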
The documentation mentions StringIO, which I think is one possible solution.
Try:
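A minimal sketch of the in-memory file idea, assuming the data has already been read into a byte string. It uses io.BytesIO rather than the StringIO module named above: io.BytesIO is available from Python 2.6 onward and gives the bytes the read and readline methods that pickle.load requires. The helper name load_via_buffer and the sample payload are hypothetical.

```python
import io

try:
    import cPickle as pickle  # Python 2, as used in the question
except ImportError:
    import pickle  # Python 3


def load_via_buffer(bin_data):
    """Wrap raw pickled bytes in a file-like object and unpickle it."""
    buffer = io.BytesIO(bin_data)  # provides read() and readline()
    return pickle.load(buffer)


# Hypothetical payload standing in for the contents of f.read()
payload = pickle.dumps({"a": [1, 2, 3]}, 2)
result = load_via_buffer(payload)
```

On Python 2 only, cStringIO.StringIO(bin_data) could be used in the same way.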