How to store non-ASCII characters in the Google App Engine datastore
I've tried no fewer than five different "solutions" and I can't get it to work. Please help.
This is the error:
'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)
Traceback (most recent call last):
  File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/webapp/__init__.py", line 636, in __call__
    handler.post(*groups)
  File "/base/data/home/apps/elmovieplace/1.350096827241428223/script/pftv.py", line 114, in post
    movie.put()
  File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 984, in put
    return datastore.Put(self._entity, config=config)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 455, in Put
    return _GetConnection().async_put(config, entities, extra_hook).get_result()
  File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_rpc.py", line 1219, in async_put
    for pbs in pbsgen:
  File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_rpc.py", line 1070, in __generate_pb_lists
    pb = value_to_pb(value)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 239, in entity_to_pb
    return entity._ToPb()
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 841, in _ToPb
    properties = datastore_types.ToPropertyPb(name, values)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore_types.py", line 1672, in ToPropertyPb
    pbvalue = pack_prop(name, v, pb.mutable_value())
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore_types.py", line 1485, in PackString
    pbvalue.set_stringvalue(unicode(value).encode('utf-8'))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)
This is the part of the code that's giving me problems:
    if imdbValues[5] == 'N/A':
        movie.director = ''
    else:
        movie.director = imdbValues[5]
    ...
    movie.put()
In this case, imdbValues[5] is equal to Claudio Fäh.
Comments (2)
The exception is raised by this line of code:
    pbvalue.set_stringvalue(unicode(value).encode('utf-8'))
When you pass a value to movie.director, it is first converted to unicode with unicode(value) and then encoded with encode('utf-8').
The unicode() function typically uses ASCII as its default encoding when decoding, so it is only safe to pass it unicode strings or byte strings that contain nothing but ASCII characters. Your code is probably passing a byte string in some other encoding, which unicode(value) fails to decode as ASCII.
Recommendation:
If you are dealing with byte strings, you MUST know their encoding, or your program will suffer from this kind of encoding/decoding problem.
How to fix it:
Discover the encoding used in the byte strings you are dealing with (utf-8?) and convert them to unicode strings.
If, for example, imdbValues is a list returned by some fancy IMDb Python library and contains utf-8 encoded byte strings, you should convert them by calling decode('utf-8') on each one before assigning it to the model.
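A minimal sketch of that conversion, assuming the value really is a UTF-8 encoded byte string (the thread does not confirm the encoding). The sample bytes and the helper name to_text are hypothetical. The explicit decode calls behave the same on Python 2, which the traceback comes from, and on Python 3. The ASCII decode stands in for what the implicit unicode(value) call attempts.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample: the director's name as UTF-8 encoded bytes,
# the form a scraping library might return it in (an assumption).
raw_director = b"Claudio F\xc3\xa4h"


def to_text(value, encoding="utf-8"):
    """Return text, decoding byte strings with the given encoding.

    Values that are already text are returned unchanged.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Decoding as ASCII fails on the first non-ASCII byte (0xc3),
# which is the same failure the traceback reports.
try:
    raw_director.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Decoding with the real encoding yields text that is safe to store.
director = to_text(raw_director)
```

In the question's code this would amount to assigning to_text(imdbValues[5]) to movie.director instead of the raw value, provided the UTF-8 assumption holds.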
You should start using unicode for your textual data.
Wherever you get your data from, it is Unicode characters encoded as bytes. The encoding could be UTF-8, UTF-16, Windows-1252, ISO-8859-1, or one of many others. If the data exists on your system, you know the encoding. If it came from a web page, the encoding is given in the response headers, and often at the beginning of the page as well. Using that encoding, .decode the bytes to the very useful unicode Python object and use that in your code.
Decode on input, encode (if necessary) on output. It's not necessary to encode before using the data with App Engine.
PS: that answer to a Unicode-related question might be of help.
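As a rough illustration of "decode on input, encode on output", the sketch below reads the charset from a Content-Type header value and uses it to decode a response body. The function names, the sample header, and the sample body are all hypothetical, and falling back to UTF-8 when no charset is declared is an assumption, not something the answer prescribes. Header parsing is done with the standard library's email.message.Message.

```python
# -*- coding: utf-8 -*-
from email.message import Message


def charset_from_content_type(header_value, default="utf-8"):
    """Extract the charset parameter from a Content-Type header value.

    Falls back to `default` when the header declares no charset
    (the UTF-8 default is an assumption made for this sketch).
    """
    message = Message()
    message["Content-Type"] = header_value
    return message.get_content_charset(default)


def decode_body(body, content_type):
    """Decode raw response bytes using the charset the server declared."""
    return body.decode(charset_from_content_type(content_type))


# Hypothetical input: a page served as ISO-8859-1, where "ä" is byte 0xe4.
content_type = "text/html; charset=ISO-8859-1"
body = b"Claudio F\xe4h"

# Decode on input: from here on the program works with text only.
text = decode_body(body, content_type)

# Encode on output, and only if the consumer needs bytes.
output = text.encode("utf-8")
```

The decoded text can be assigned to a datastore string property directly; the encode step is only for consumers that need bytes.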