How do I save a dictionary with UTF-8 keys to a file using cPickle in Python?

Posted on 2024-10-21 06:03:04

I want to know how to save a dictionary that has UTF-8 strings as its keys to a file in Python with cPickle. The dictionary is very large, and I've heard that cPickle is much faster than pickle. I also suspect that having UTF-8 encoded keys may be a problem.
Any other fast solutions are also welcome.
Here is what I do, followed by the error message:

unique_ngrams_dict = defaultdict(lambda: 0)# just to show how I defined my dict


dict_file = codecs.open('ngram_dict', 'w', 'utf-8')
cPickle.dump(unique_ngrams_dict,dict_file)
dict_file.close()

Error message:

Traceback (most recent call last):
  File "Generate_NGram.py", line 81, in <module>
    save_ngram_dict(unique_ngrams_dict)
  File "Generate_NGram.py", line 70, in save_ngram_dict
    cPickle.dump(unique_ngrams_dict,dict_file)
  File "/usr/lib/python2.6/copy_reg.py", line 70, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle function objects

Thanks.

何以畏孤独 2024-10-28 06:03:04

Pickle is a binary format, so you shouldn't open the file through codecs. Just open it in binary mode:
```
open('ngram_dict', 'wb')
```
That is not why the dump fails, though; wrapping the file in a codec is just inefficient.
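As a rough sketch of that binary-mode round trip, here is a plain dict with non-ASCII keys written and read back. It assumes Python 3, where the C implementation behind cPickle is used by `pickle` automatically; the sample keys and the temporary file path are made up for illustration.

```python
import os
import pickle
import tempfile

# Hypothetical sample data: a plain dict with non-ASCII keys
# (written with escapes so the source file stays ASCII).
ngrams = {"caf\u00e9": 3, "\u65e5\u672c\u8a9e": 5}

# Hypothetical output path inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "ngram_dict")

# Write in binary mode; no codecs wrapper is needed.
with open(path, "wb") as f:
    pickle.dump(ngrams, f, protocol=pickle.HIGHEST_PROTOCOL)

# Read back in binary mode as well.
with open(path, "rb") as f:
    restored = pickle.load(f)
```

The keys come back unchanged, so non-ASCII keys by themselves are not an obstacle to pickling.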
The actual problem is that the object you are trying to save contains a function reference
(the default factory lambda: 0), and pickle cannot serialize a lambda: functions are pickled by name, not by value, and a lambda has no importable name.
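A small sketch that reproduces the failure, assuming Python 3 (the traceback in the question is from Python 2.6, and the exact exception type differs between versions, so several are caught here). The variable names are illustrative only.

```python
import pickle
from collections import defaultdict

# Same shape as the dict in the question: a lambda as default factory.
counts = defaultdict(lambda: 0)
counts["caf\u00e9"] += 1

try:
    pickle.dumps(counts)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError, TypeError) as exc:
    # The lambda cannot be looked up by name, so the dump is refused.
    lambda_pickled = False
    print("pickling failed:", type(exc).__name__)

# The data itself is fine: a plain dict copy carries no factory.
plain_bytes = pickle.dumps(dict(counts))
```

The keys and values pickle without trouble once the lambda is out of the picture, which is why the error message mentions function objects rather than anything about encoding.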
You have three options:
1. Use a regular dict and use its .get method with a default argument.
2. Set
```
unique_ngrams_dict.default_factory = None
```
  before pickling and set it back to
```
unique_ngrams_dict.default_factory = lambda: 0
```
  after unpickling.
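3. Define a class like:
```
class NgramDefault(object):
    def __call__(self):
        return 0
```
  and use NgramDefault() as the default factory instead of lambda: 0. The class has to be defined at module level so that pickle can find it by name when loading.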
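The first two options can be sketched roughly as follows, assuming Python 3 and made-up sample n-grams. The last part uses the builtin `int` as the default factory, which is not one of the three options above but an extra alternative: `int()` returns 0 and, being a builtin, pickles by name.

```python
import pickle
from collections import defaultdict

# Hypothetical sample n-grams with non-ASCII characters.
grams = ["caf\u00e9", "\u65e5\u672c", "caf\u00e9"]

# Option 1: a plain dict, counting with .get and a default of 0.
plain = {}
for gram in grams:
    plain[gram] = plain.get(gram, 0) + 1
plain_restored = pickle.loads(pickle.dumps(plain))

# Option 2: clear the factory before pickling, restore it afterwards.
counts = defaultdict(lambda: 0)
for gram in grams:
    counts[gram] += 1
counts.default_factory = None
data = pickle.dumps(counts, protocol=pickle.HIGHEST_PROTOCOL)
counts.default_factory = lambda: 0      # original object works again
restored = pickle.loads(data)
restored.default_factory = lambda: 0    # the loaded copy needs it too

# Extra alternative (an assumption, not from the answer): int as factory.
int_counts = defaultdict(int, plain)
int_restored = pickle.loads(pickle.dumps(int_counts))
```

With option 2 the factory has to be reassigned on the loaded object as well, since it comes back with `default_factory` set to `None`.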