Python issues in Google App Engine - UTF-8 and ASCII
For the past few days I've been trying to learn Python in App Engine. However, I've been running into a number of problems with ASCII and UTF-8 encoding. The latest issue is as follows:
I have the following code for a simple chat room from the book 'Code in the Cloud':
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
import datetime

# START: MainPage
class ChatMessage(object):
    def __init__(self, user, msg):
        self.user = user
        self.message = msg
        self.time = datetime.datetime.now()

    def __str__(self):
        return "%s (%s): %s" % (self.user, self.time, self.message)

Messages = []

class ChatRoomPage(webapp.RequestHandler):
    def get(self):
        self.response.headers["Content-Type"] = "text/html"
        self.response.out.write("""
            <html>
            <head>
            <title>MarkCC's AppEngine Chat Room</title>
            </head>
            <body>
            <h1>Welcome to MarkCC's AppEngine Chat Room</h1>
            <p>(Current time is %s)</p>
            """ % (datetime.datetime.now()))
        # Output the set of chat messages
        global Messages
        for msg in Messages:
            self.response.out.write("<p>%s</p>" % msg)
        self.response.out.write("""
            <form action="" method="post">
            <div><b>Name:</b>
            <textarea name="name" rows="1" cols="20"></textarea></div>
            <p><b>Message</b></p>
            <div><textarea name="message" rows="5" cols="60"></textarea></div>
            <div><input type="submit" value="Send ChatMessage"></input></div>
            </form>
            </body>
            </html>
            """)
# END: MainPage
# START: PostHandler
    def post(self):
        chatter = self.request.get("name")
        msg = self.request.get("message")
        global Messages
        Messages.append(ChatMessage(chatter, msg))
        # Now that we've added the message to the chat, we'll redirect
        # to the root page, which will make the user's browser refresh to
        # show the chat including their new message.
        self.redirect('/')
# END: PostHandler

# START: Frame
chatapp = webapp.WSGIApplication([('/', ChatRoomPage)])

def main():
    run_wsgi_app(chatapp)

if __name__ == "__main__":
    main()
# END: Frame
It works fine in English. However, the moment I add some non-ASCII characters, all sorts of problems start.
First of all, in order for the page to actually display the characters in HTML, I added a meta tag (<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">).
Curiously, if you enter non-ASCII letters in the chat, the program processes them nicely and displays them with no issues. However, it fails to load if I put any non-ASCII letters into the web layout itself within the script. I figured that adding a UTF-8 encoding line would help, so I added # -*- coding: utf-8 -*-. This was not enough: of course, I had forgotten to save the file in UTF-8 format. Once I did that, the program started running.
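To illustrate why the coding declaration and the file's on-disk encoding have to agree, here is a minimal sketch. It is written to run on Python 3 rather than the Python 2.5 used in the question, and cp1250 is only an assumed example of a non-UTF-8 editor default; the question does not say which encoding the file was originally saved in.

```python
# -*- coding: utf-8 -*-
# The same label produces different bytes depending on how the editor saves it.
label = u"Twój Nick"

utf8_bytes = label.encode("utf-8")     # what "# -*- coding: utf-8 -*-" promises
cp1250_bytes = label.encode("cp1250")  # assumed legacy editor default, for illustration

print(repr(utf8_bytes))
print(repr(cp1250_bytes))

# Reading cp1250 bytes as UTF-8 fails, so the declaration alone is not enough:
# the file has to be saved in the encoding the declaration names.
try:
    cp1250_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False
print(decoded_ok)
```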
That would have been a good end to the story, alas...
It doesn't work.
Long story short, this code:
# -*- coding: utf-8 -*-
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
import datetime

# START: MainPage
class ChatMessage(object):
    def __init__(self, user, msg):
        self.user = user
        self.message = msg
        self.time = datetime.datetime.now()

    def __str__(self):
        return "%s (%s): %s" % (self.user, self.time, self.message)

Messages = []

class ChatRoomPage(webapp.RequestHandler):
    def get(self):
        self.response.headers["Content-Type"] = "text/html"
        self.response.out.write("""
            <html>
            <head>
            <title>Witaj w pokoju czatu MarkCC w App Engine</title>
            <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
            </head>
            <body>
            <h1>Witaj w pokoju czatu MarkCC w App Engine</h1>
            <p>(Dokladny czas Twojego logowania to: %s)</p>
            """ % (datetime.datetime.now()))
        # Output the set of chat messages
        global Messages
        for msg in Messages:
            self.response.out.write("<p>%s</p>" % msg)
        self.response.out.write("""
            <form action="" method="post">
            <div><b>Twój Nick:</b>
            <textarea name="name" rows="1" cols="20"></textarea></div>
            <p><b>Twoja Wiadomość</b></p>
            <div><textarea name="message" rows="5" cols="60"></textarea></div>
            <div><input type="submit" value="Send ChatMessage"></input></div>
            </form>
            </body>
            </html>
            """)
# END: MainPage
# START: PostHandler
    def post(self):
        chatter = self.request.get(u"name")
        msg = self.request.get(u"message")
        global Messages
        Messages.append(ChatMessage(chatter, msg))
        # Now that we've added the message to the chat, we'll redirect
        # to the root page, which will make the user's browser refresh to
        # show the chat including their new message.
        self.redirect('/')
# END: PostHandler

# START: Frame
chatapp = webapp.WSGIApplication([('/', ChatRoomPage)])

def main():
    run_wsgi_app(chatapp)

if __name__ == "__main__":
    main()
# END: Frame
fails to process anything I write in the chat application while it's running. It loads, but the moment I enter a message (even one using only ASCII characters) I receive this error message:
  File "D:\Python25\lib\StringIO.py", line 270, in getvalue
    self.buf += ''.join(self.buflist)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 64: ordinal not in range(128)
In other words, if I want to be able to use any characters within the application, I cannot put non-English ones in my interface. Or the other way round: I can use non-English characters within the app only if I don't save the file as UTF-8. How do I make it all work together?
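The byte 0xc3 in the traceback is the first byte of a UTF-8 encoded "ó", which suggests that a byte-string piece of the page was being implicitly decoded as ASCII when it was joined with the unicode values returned by self.request.get. The sketch below makes that decode explicit so that it also runs on Python 3; the variable names are illustrative and not taken from the book's code.

```python
# -*- coding: utf-8 -*-
# A byte string as it would sit in a UTF-8 saved source file (literal without a u prefix).
template_bytes = u"<div><b>Twój Nick:</b>".encode("utf-8")

# Python 2 decodes byte strings as ASCII when it joins them with unicode text.
# Doing the same decode by hand reproduces the kind of error in the traceback.
try:
    template_bytes.decode("ascii")
    error_text = None
except UnicodeDecodeError as exc:
    error_text = str(exc)
print(error_text)

# Decoding with the encoding the file was actually saved in succeeds.
decoded = template_bytes.decode("utf-8")
print(decoded == u"<div><b>Twój Nick:</b>")
```

This would also be consistent with the failure appearing even for plain ASCII messages: the trigger is the non-ASCII byte in the page markup, not the text typed into the form.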
Comments (2)
Your strings contain Unicode characters, but they're not unicode strings, they're byte strings. You need to prefix each one with u (as in u"foo") in order to make them into unicode strings. If you ensure all your strings are unicode strings, you should eliminate that error.
You should also specify the encoding in the Content-Type header rather than in a meta tag.
Note that your life would be a lot easier if you used a templating system instead of writing HTML inline with your Python code.
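A minimal sketch of that advice, kept free of App Engine imports so it can run anywhere: every literal is a unicode string, the text is encoded to UTF-8 once at the output boundary, and the charset travels in the Content-Type header. The helper name render_chat_page and the sample data are hypothetical, and the header value shown is the usual form for UTF-8 HTML rather than the answer's own sample.

```python
# -*- coding: utf-8 -*-
def render_chat_page(messages):
    """Hypothetical helper: build the page as unicode text, encode once at the end."""
    parts = [u"<html><body>", u"<h1>Witaj w pokoju czatu</h1>"]
    for user, text in messages:
        parts.append(u"<p>%s: %s</p>" % (user, text))
    parts.append(u"<div><b>Twój Nick:</b></div>")
    parts.append(u"</body></html>")
    # All pieces are unicode, so joining never triggers an implicit ASCII decode.
    body = u"".join(parts).encode("utf-8")
    headers = {"Content-Type": "text/html; charset=utf-8"}
    return headers, body

headers, body = render_chat_page([(u"Zoë", u"Cześć")])
print(headers["Content-Type"])
print(len(body))
```

In the handler from the question, the equivalent would presumably be to assign the same value to self.response.headers["Content-Type"] and to put the u prefix on every string passed to self.response.out.write.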
@Thomas K.
Thank you for your guidance here. Thanks to you I was able to come up with a (maybe, as you said, slightly roundabout) solution, so the credit for the answer should go to you. The following line of code:
Should look like this:
Basically, I have to encode all the UTF-8 strings to ASCII.
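One way "encoding to ASCII" can be done without losing characters is the xmlcharrefreplace error handler, which turns every non-ASCII character into an HTML numeric character reference. This is an assumption about the kind of change meant here, not the commenter's actual line, and the function name format_message is hypothetical.

```python
# -*- coding: utf-8 -*-
def format_message(user, text):
    """Hypothetical stand-in for ChatMessage.__str__: returns ASCII-only bytes."""
    line = u"%s: %s" % (user, text)
    # Non-ASCII characters become numeric references (for example &#243; for
    # the letter o with an acute accent), which browsers render as the character.
    return line.encode("ascii", "xmlcharrefreplace")

print(format_message(u"Ania", u"Wiadomość"))
```

The result is plain ASCII, so it can be joined with other byte strings without triggering an implicit decode, at the cost of only being meaningful inside HTML.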