DjangoUnicodeDecodeError when storing pickle'd data
I've got a simple dict object that I'm trying to store in the database after it has been run through pickle. It seems that Django doesn't like trying to encode the result. I've checked with MySQL, and the query isn't even getting there before the error is thrown, so I don't believe that is the problem. The dict I'm storing looks like this:
{
'ordered': [
{ 'value': u'First\xd1ame Last\xd1ame',
'label': u'Full Name' },
{ 'value': u'123-456-7890',
'label': u'Phone Number' },
{ 'value': u'[email protected]',
'label': u'Email Address' } ],
'cleaned_data': {
u'Phone Number': u'123-456-7890',
u'Full Name': u'First\xd1ame Last\xd1ame',
u'Email Address': u'[email protected]' },
'post_data': <QueryDict: {
u'Phone Number': [u'1234567890'],
u'Full Name_1': [u'Last\xd1ame'],
u'Full Name_0': [u'First\xd1ame'],
u'Email Address': [u'[email protected]'] }>,
'user': <User: itis>
}
The error that gets thrown is:
'utf8' codec can't decode bytes in position 52-53: invalid data.
Position 52-53 is the first instance of \xd1 (Ñ) in the pickled data.
So far, I've dug around StackOverflow and found a few questions where the database encoding for the objects was wrong. This doesn't help me because there is no MySQL query yet. This is happening before the database. Google also didn't help much when searching for unicode errors on pickled data.
It is probably worth mentioning that if I don't use the Ñ, this code works fine.
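The symptom can be reproduced without Django or MySQL. The following is a minimal sketch, assuming Python 3 and an explicit protocol=0 (the text protocol, which was the default in the Python 2 used in the question). The helper name decodes_as_utf8 and the sample dict are illustrative only. With protocol 0, the Ñ ends up in the pickle as the single raw byte 0xd1, which is not valid UTF-8 on its own, so any layer that tries to decode the pickle as UTF-8 text fails.

```python
import pickle

# Illustrative data modelled on one entry of the dict in the question.
data = {'value': 'First\xd1ame Last\xd1ame', 'label': 'Full Name'}

# Protocol 0 is the old text protocol; it writes U+00D1 as the raw byte 0xd1.
pickled = pickle.dumps(data, protocol=0)


def decodes_as_utf8(raw):
    """Return True if the byte string is valid UTF-8, False otherwise."""
    try:
        raw.decode('utf-8')
        return True
    except UnicodeDecodeError:
        return False


contains_raw_byte = b'\xd1' in pickled   # the lone 0xd1 byte is present
utf8_ok = decodes_as_utf8(pickled)       # decoding the pickle as UTF-8 fails

# The pickle itself is fine as long as it is treated as bytes, not text.
restored = pickle.loads(pickled)
```

This suggests the pickle output is being handled as text somewhere before the query is built, which would explain why removing the Ñ makes the error disappear.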
Comments (3)
With much thanks to @prometheus, I found a solution for this. Basically you can use base64 to encode the output of pickle.dumps() before plugging it into the database. You would then turn around and use base64 to decode the output of the database before passing it to pickle.loads(). My code now looks like this:
Again, thank you @prometheus.
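A minimal sketch of the base64 round trip described in this comment, assuming Python 3. This is not the poster's original code, and the helper names dumps_for_text_column and loads_from_text_column are hypothetical. The idea is that base64 output is pure ASCII, so it survives being stored in a text column regardless of which bytes the pickle contains.

```python
import base64
import pickle


def dumps_for_text_column(obj):
    """Pickle obj and return an ASCII-only string safe for a text column."""
    raw = pickle.dumps(obj)
    return base64.b64encode(raw).decode('ascii')


def loads_from_text_column(text):
    """Reverse dumps_for_text_column: base64-decode, then unpickle.

    Only call this on data you wrote yourself; unpickling untrusted
    input can execute arbitrary code.
    """
    raw = base64.b64decode(text.encode('ascii'))
    return pickle.loads(raw)


# Illustrative data modelled on the dict in the question.
original = {'value': 'First\xd1ame Last\xd1ame', 'label': 'Full Name'}
stored = dumps_for_text_column(original)
restored = loads_from_text_column(stored)
```

The trade-off is size: base64 makes the stored value roughly a third larger than the raw pickle.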
That's a known problem, and there was a discussion about this on the Python bug-tracker:
I see no need to do so. Normally, it should be possible to store any binary data in a database.
A worse problem is that pickling is not safe: if the database could get its data from anywhere, it could end up holding maliciously crafted pickle data.