Python does not correctly handle text input from an HTML textarea
I have a standard form on an HTML page with the usual input types: text, select, submit. Using Python (the Pyramid framework) to process these forms has been straightforward and without issue.
In this particular form, though, I have needed to use a textarea to accept longer, multi-line input. When processing the user input in Python, I've used the following code:
    try:
        some_input = request.params['form_element'].decode('utf-8')
    except:
        some_input = None
This works for text input, but not for textarea input. When the textarea input includes a non-ASCII character, it is not processed and the following error is thrown:
(<type 'exceptions.UnicodeEncodeError'>, UnicodeEncodeError('ascii', u'some text then a unicode character \u2013 and some more text', 14, 15, 'ordinal not in range(128)'), <traceback object at 0x10265ca70>)
Is there any reason for this? It looks like the textarea input is being treated as ASCII instead of UTF-8, but I'm not sure how to change this.
More information: the page from which the form is being submitted is an HTML5 page with the charset set to UTF-8.
EDIT: Wladimir Palant suggested that the value has already been decoded, so I checked:
    print isinstance(request.params['form_element'], str)
returns False, and
    print isinstance(request.params['form_element'], unicode)
returns True.
Comments (1)
There is no difference between an input[type=text] and a textarea when the data is submitted. The problem you describe should happen in both.
Correct me if I'm wrong, but WebOb, which is used in Pyramid, does the decoding for you. You get Unicode already, so there is no need to decode or encode anything. Also, you can use unicode for the response, and it will be encoded automatically. You rarely have to use encode or decode in Pyramid applications.
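To illustrate the point, here is a minimal sketch of what is probably going on. In Python 2, calling .decode('utf-8') on a value that is already unicode first encodes it implicitly with the default ASCII codec, which is where the UnicodeEncodeError comes from. ASCII-only input survives that step, which may be why the text field appeared to work. The helper to_text and the sample strings are hypothetical names made up for this example; they are not part of Pyramid or WebOb.

```python
def to_text(value, encoding="utf-8"):
    """Return text for a form value that may be bytes, text, or None.

    Hypothetical helper for illustration only.
    """
    if value is None:
        return None
    if isinstance(value, bytes):  # Python 2 str or Python 3 bytes
        return value.decode(encoding)
    return value  # already decoded text, so leave it alone


ascii_only = u"plain text"
with_dash = u"text \u2013 more text"  # contains an en dash (U+2013)

# This is the implicit step Python 2 performs when .decode() is called
# on a unicode object. It succeeds for ASCII-only text...
ascii_step_ok = ascii_only.encode("ascii") == b"plain text"

# ...and fails as soon as a non-ASCII character is present.
try:
    with_dash.encode("ascii")
    implicit_step_failed = False
except UnicodeEncodeError:
    implicit_step_failed = True

print(ascii_step_ok)
print(implicit_step_failed)
print(to_text(with_dash) == with_dash)
print(to_text(with_dash.encode("utf-8")) == with_dash)
```

In the view itself, reading the value with request.params.get('form_element') and using it directly should be enough, since it is already unicode. The bare except in the question also hides the real error, so it is worth catching only the exceptions you expect.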