Python problems in Google App Engine - UTF-8 and ASCII

Posted on 2024-12-01 01:41:06

For the past few days I've been trying to learn Python on App Engine, and I keep running into problems with ASCII and UTF-8 encoding. The latest one is as follows.

I have the following code for a simple chat room, taken from the book 'Code in the Cloud':

from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
import datetime


# START: MainPage
class ChatMessage(object):
    def __init__(self, user, msg):
        self.user = user
        self.message = msg
        self.time = datetime.datetime.now()

    def __str__(self):
        return "%s (%s): %s" % (self.user, self.time, self.message)

Messages = []

class ChatRoomPage(webapp.RequestHandler):
    def get(self):
        self.response.headers["Content-Type"] = "text/html"
        self.response.out.write("""
           <html>
             <head>
               <title>MarkCC's AppEngine Chat Room</title>
             </head>
             <body>
               <h1>Welcome to MarkCC's AppEngine Chat Room</h1>
               <p>(Current time is %s)</p>
           """ % (datetime.datetime.now()))
        # Output the set of chat messages
        global Messages
        for msg in Messages:
            self.response.out.write("<p>%s</p>" % msg)
        self.response.out.write("""
           <form action="" method="post">
           <div><b>Name:</b>
           <textarea name="name" rows="1" cols="20"></textarea></div>
           <p><b>Message</b></p>
           <div><textarea name="message" rows="5" cols="60"></textarea></div>
           <div><input type="submit" value="Send ChatMessage"></input></div>
           </form>
         </body>
       </html>
       """)
    # END: MainPage
    # START: PostHandler
    def post(self):
        chatter = self.request.get("name")
        msg = self.request.get("message")
        global Messages
        Messages.append(ChatMessage(chatter, msg))
        # Now that we've added the message to the chat, we'll redirect
        # to the root page, which will make the user's browser refresh to
        # show the chat including their new message.
        self.redirect('/')
# END: PostHandler




# START: Frame
chatapp = webapp.WSGIApplication([('/', ChatRoomPage)])


def main():
    run_wsgi_app(chatapp)

if __name__ == "__main__":
    main()
# END: Frame

It works fine in English. However, the moment I add non-ASCII characters, all sorts of problems start.

First of all, to get the page to actually display such characters in HTML, I added a meta tag (charset=UTF-8, etc.).

Curiously, if you enter non-ASCII letters into the chat, the program processes them nicely and displays them with no issues. However, the page fails to load if I put any non-ASCII letters into the page layout itself within the script. I figured that adding a UTF-8 encoding line would help, so I added (# -*- coding: utf-8 -*-). This was not enough: of course, I had forgotten to save the file in UTF-8 format. Once I did that, the program started running.

That would be a good end to the story, alas...

It doesn't work.

Long story short, this code:

# -*- coding: utf-8 -*-
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
import datetime


# START: MainPage
class ChatMessage(object):
    def __init__(self, user, msg):
        self.user = user
        self.message = msg
        self.time = datetime.datetime.now()

    def __str__(self):
        return "%s (%s): %s" % (self.user, self.time, self.message)

Messages = []

class ChatRoomPage(webapp.RequestHandler):
    def get(self):
        self.response.headers["Content-Type"] = "text/html"
        self.response.out.write("""
           <html>
             <head>
               <title>Witaj w pokoju czatu MarkCC w App Engine</title>
               <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
             </head>
             <body>
               <h1>Witaj w pokoju czatu MarkCC w App Engine</h1>
               <p>(Dokladny czas Twojego logowania to: %s)</p>
           """ % (datetime.datetime.now()))
        # Output the set of chat messages
        global Messages
        for msg in Messages:
            self.response.out.write("<p>%s</p>" % msg)
        self.response.out.write("""
           <form action="" method="post">
           <div><b>Twój Nick:</b>
           <textarea name="name" rows="1" cols="20"></textarea></div>
           <p><b>Twoja Wiadomość</b></p>
           <div><textarea name="message" rows="5" cols="60"></textarea></div>
           <div><input type="submit" value="Send ChatMessage"></input></div>
           </form>
         </body>
       </html>
       """)
    # END: MainPage
    # START: PostHandler
    def post(self):
        chatter = self.request.get(u"name")
        msg = self.request.get(u"message")
        global Messages
        Messages.append(ChatMessage(chatter, msg))
        # Now that we've added the message to the chat, we'll redirect
        # to the root page, which will make the user's browser refresh to
        # show the chat including their new message.
        self.redirect('/')
# END: PostHandler




# START: Frame
chatapp = webapp.WSGIApplication([('/', ChatRoomPage)])


def main():
    run_wsgi_app(chatapp)

if __name__ == "__main__":
    main()
# END: Frame

fails to process anything I write in the chat application while it's running. The page loads, but the moment I submit a message (even one using only ASCII characters) I receive this error message:

  File "D:\Python25\lib\StringIO.py", line 270, in getvalue
    self.buf += ''.join(self.buflist)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 64: ordinal not in range(128)

In other words, if I want to be able to use any characters within the application, I cannot put non-English ones in my interface. Or, the other way round, I can use non-English characters within the app only if I don't save the file as UTF-8. How do I make it all work together?

Comments (2)

追风人 2024-12-08 01:41:06

Your strings contain Unicode characters, but they're not unicode strings; they're byte strings. You need to prefix each one with u (as in u"foo") to make it a unicode string. If you ensure all your strings are unicode strings, that error should go away.
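
To make the byte string versus unicode string distinction concrete, here is a small sketch of the decode step that presumably fails inside StringIO. Under Python 2 that step happens implicitly when byte strings and unicode strings are joined; the sketch spells it out explicitly so it also runs on a current interpreter. The names label_text, label_bytes and decode_as_ascii are made up for illustration and are not part of the original application.

```python
# -*- coding: utf-8 -*-
# Hypothetical reproduction; not taken from the original application.

# The Polish form label as text (a unicode string).
label_text = u"Tw\u00f3j Nick:"
# The bytes a UTF-8 source file holds for the same label in a plain "..." literal.
label_bytes = label_text.encode("utf-8")


def decode_as_ascii(raw):
    """Try to decode raw bytes as ASCII; return (text, None) or (None, message)."""
    try:
        return raw.decode("ascii"), None
    except UnicodeDecodeError as exc:
        return None, str(exc)


text, error = decode_as_ascii(label_bytes)
print(error)
```

The letter ó is stored as the two bytes 0xc3 0xb3 in UTF-8, which would account for the 0xc3 in the reported error.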

You should also specify the encoding in the Content-Type header rather than a meta tag, like this:

self.response.headers['Content-Type'] = 'text/html; charset=UTF-8'

Note that your life would be a lot easier if you used a templating system instead of writing HTML inline in your Python code.
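
A minimal sketch of the approach described above, assuming every piece of the page is kept as a unicode string and encoded to UTF-8 exactly once, just before output. The names render_page and CONTENT_TYPE are hypothetical, and the function only stands in for the handler's self.response.out.write calls; it does not use the webapp API.

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-in for the handler's output code; it does not use the webapp API.

# Header value suggested in the answer above.
CONTENT_TYPE = "text/html; charset=UTF-8"


def render_page(messages):
    """Build the page from unicode pieces and encode it once at the end."""
    parts = [u"<h1>Witaj w pokoju czatu MarkCC w App Engine</h1>"]
    for user, text in messages:
        parts.append(u"<p>%s: %s</p>" % (user, text))
    parts.append(u"<div><b>Tw\u00f3j Nick:</b></div>")
    # Every piece is text, so joining never needs an implicit ASCII decode.
    return u"".join(parts).encode("utf-8")


body = render_page([(u"Ania", u"Wiadomo\u015b\u0107")])
print(CONTENT_TYPE)
print(len(body))
```

Keeping text as unicode inside the program and converting to bytes only at the boundary means there is a single place where the encoding is chosen, and it matches the charset announced in the header.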

带刺的爱情 2024-12-08 01:41:06

@Thomas K.
Thank you for your guidance here. Thanks to you I was able to come up with a solution, though maybe, as you said, a slightly roundabout one, so the credit for the answer should go to you. The following line of code:

Messages.append(ChatMessage(chatter, msg))

Should look like this:

Messages.append(ChatMessage(chatter.encode( "utf-8" ), msg.encode( "utf-8" )))

Basically, I have to encode all the unicode strings as UTF-8 byte strings.
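
A small sketch of what .encode("utf-8") does in the line above: it turns a unicode string into a UTF-8 byte string, so ASCII-only text comes out byte for byte identical while other characters become multi-byte sequences. The sample values are hypothetical and stand in for what self.request.get() returns.

```python
# -*- coding: utf-8 -*-
# Hypothetical values standing in for what self.request.get() returns.
chatter = u"Ania"
msg = u"Wiadomo\u015b\u0107"  # the Polish word from the form label

chatter_bytes = chatter.encode("utf-8")
msg_bytes = msg.encode("utf-8")

# All pieces are now byte strings, so they can be concatenated with
# other UTF-8 byte strings without any decoding step.
page_fragment = b"<p>" + chatter_bytes + b": " + msg_bytes + b"</p>"
print(repr(page_fragment))
```

With both values already encoded, every piece written to the response is a byte string, which would explain why the error disappears. The other answer takes the opposite route and keeps everything as unicode strings until output.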
