DjangoUnicodeDecodeError and force_unicode
I have a simple Django model for a news entry:
from django.db import models

class NewsEntry(models.Model):
    pub_date = models.DateTimeField('date published')
    title = models.CharField(max_length=200)
    summary = models.TextField()
    content = models.TextField()

    def __unicode__(self):
        return self.title
Adding a news entry with English text (in the admin page) works fine, but when I try to add one with Russian text I get an error:
TemplateSyntaxError at /admin/news/newsentry/
Caught DjangoUnicodeDecodeError while rendering: 'ascii' codec can't decode byte 0xd0 in position 0: ordinal not in range(128). You passed in NewsEntry: [Bad Unicode data] (class 'antek.news.models.NewsEntry')
Django Version: 1.2.2
Exception Type: TemplateSyntaxError
Exception Value: Caught DjangoUnicodeDecodeError while rendering: 'ascii' codec can't decode byte 0xd0 in position 0: ordinal not in range(128). You passed in NewsEntry: [Bad Unicode data] (class 'antek.news.models.NewsEntry')
Exception Location: /usr/local/lib/python2.6/dist-packages/django/utils/encoding.py in force_unicode, line 88
Python Version: 2.6.5
The last item in traceback list is:
/usr/local/lib/python2.6/dist-packages/django/utils/encoding.py in force_unicode
Local vars:
e: UnicodeDecodeError('ascii', '\xd0\xa2\xd0\xb5\xd1\x81\xd1\x82 \xd1\x80\xd1\x83\xd1\x81\xd1\x81\xd0\xba\xd0\xbe\xd0\xb3\xd0\xbe', 0, 1, 'ordinal not in range(128)')
The code looks correct: self.title is a unicode object. Also, djangoproject.com uses similar code in their blog application.
I spent a lot of time on this problem and found a strange solution:
from django.utils.encoding import force_unicode
# ...
    def __unicode__(self):
        return force_unicode(self.title)
But since self.title is a unicode object, force_unicode should return it unchanged.
Why doesn't return self.title work?
Comments (3)
The problem was the utf8_bin collation on the MySQL server. Full information here.
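This fits the traceback in the question: the value shown in the local variable e is a byte string, not a unicode object, and Python 2 falls back to the ascii codec when it has to coerce a byte string to unicode. force_unicode decodes byte strings as UTF-8 by default, which would explain why wrapping self.title in it made the error go away. The sketch below reproduces both outcomes with the bytes from the traceback. It is written for Python 3, where bytes plays the role of Python 2's str; the names raw_title and try_decode are made up for illustration.

```python
# Bytes copied from the traceback's local variable `e`.
raw_title = (b'\xd0\xa2\xd0\xb5\xd1\x81\xd1\x82 '
             b'\xd1\x80\xd1\x83\xd1\x81\xd1\x81\xd0\xba\xd0\xbe\xd0\xb3\xd0\xbe')


def try_decode(data, encoding):
    """Return (text, None) on success or (None, error message) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, str(exc)


# Python 2's implicit coercion uses the ascii codec, which rejects byte 0xd0.
ascii_text, ascii_error = try_decode(raw_title, 'ascii')

# Decoding explicitly as UTF-8 recovers the Russian text.
utf8_text, utf8_error = try_decode(raw_title, 'utf-8')

print(ascii_error)
print(utf8_text)
```

If a title column really does come back as a byte string, the ascii failure above is what the admin page hits when it renders the object.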
force_unicode comes with the potential for lost data. If you know the encoding of the data you are getting, it is more practical to use Python's decode method to convert it properly. For a 'latin1' string, for example, this is as easy as:
my_unicode_string = my_latin1_string.decode('latin1')
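To make the decode suggestion concrete, here is a small sketch written for Python 3, where byte strings are bytes objects. The sample values are made up. Note that latin1 maps every possible byte to a character, so decoding with it never raises; if the data is actually UTF-8 you get garbled text rather than an error, which is why knowing the real encoding matters.

```python
# A made-up Latin-1 byte string: "café" with é stored as the single byte 0xe9.
my_latin1_string = b'caf\xe9'
my_unicode_string = my_latin1_string.decode('latin1')

# The same text encoded as UTF-8 uses two bytes for é.
my_utf8_string = 'café'.encode('utf-8')

# Decoding UTF-8 bytes as latin1 raises no error but yields the wrong text.
wrong = my_utf8_string.decode('latin1')
right = my_utf8_string.decode('utf-8')

print(my_unicode_string == 'café')
print(wrong == 'café', right == 'café')
```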
My situation was even more peculiar. I was importing data from a JSON file, and the instance created in memory would throw a Unicode error as follows:
But retrieving it from the database and running the code again worked without an issue. So if you get a Django error that contains
[Bad Unicode data]
try re-retrieving the object after saving it as a workaround. If anyone wishes to properly explain why, feel free. My guess is that the input data is not encoded in utf-8:
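If that guess about byte strings in the input is right, one alternative to re-fetching is to normalise values to text before assigning them to the model. The helper below is a hypothetical sketch (ensure_text is not a Django function), written for Python 3, and it assumes the imported data is meant to be UTF-8.

```python
def ensure_text(value, encoding='utf-8'):
    """Decode byte strings with the given encoding; return anything else unchanged.

    Hypothetical helper, not part of Django. Invalid bytes raise
    UnicodeDecodeError, so bad input fails at import time rather than
    later when the object is rendered.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


title_from_bytes = ensure_text(b'\xd0\xa2\xd0\xb5\xd1\x81\xd1\x82')
title_from_text = ensure_text('Тест')
print(title_from_bytes == title_from_text)
```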