Python does not correctly handle text input from an HTML textarea

Posted on 2024-11-18 09:53:25

I have a standard form on an HTML page with the usual input types: text, select, submit. Using Python (the Pyramid framework) to process these forms has been straightforward and without issue.

In this particular form, though, I need to use a textarea to accept longer, multi-line input. When processing the user input in Python, I use the following code:

try:
    some_input = request.params['form_element'].decode('utf-8')
except:
    some_input = None

This works for text input but not for textarea input. When the textarea input contains a non-ASCII character, it is not processed, and the following error is thrown:

(<type 'exceptions.UnicodeEncodeError'>, UnicodeEncodeError('ascii', u'some text then a unicode character \u2013 and some more text', 14, 15, 'ordinal not in range(128)'), <traceback object at 0x10265ca70>)

Is there any reason for this? It looks like the textarea input is being treated as ASCII instead of UTF-8, but I'm not sure how to change this.

More information: the page from which the form is being submitted is an HTML5 page with the charset set to UTF-8.

EDIT: Wladimir Palant suggested that the value has already been decoded, so I checked:

print isinstance(request.params['form_element'], str) returns False

print isinstance(request.params['form_element'], unicode) returns True
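
Given that check, here is a rough sketch of what is probably happening. In Python 2, calling .decode('utf-8') on a value that is already unicode first encodes it implicitly with the default codec (ASCII), and that step fails on a character such as \u2013. The text inputs most likely only worked because they contained plain ASCII. The names explicit_py2_decode and to_text below are hypothetical, written only for illustration; they are not part of Pyramid or WebOb.

```python
# -*- coding: utf-8 -*-
# Illustrative sketch only; both helper names are hypothetical.


def explicit_py2_decode(text, encoding="utf-8"):
    """Spell out what Python 2 does for unicode.decode(encoding):
    the text is first encoded with the default codec (ASCII)."""
    return text.encode("ascii").decode(encoding)


def to_text(value, encoding="utf-8"):
    """Return text, decoding only when the value is a byte string."""
    if value is None:
        return None
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value  # already decoded text, leave it untouched


sample = u"some text \u2013 more text"

try:
    explicit_py2_decode(sample)
    error_name = None
except UnicodeEncodeError as exc:
    # The en dash cannot be encoded as ASCII, so the encode step fails.
    error_name = type(exc).__name__

already_text = to_text(sample)                 # unicode in, same unicode out
from_bytes = to_text(sample.encode("utf-8"))   # UTF-8 bytes in, unicode out
```

Under that assumption, the view code could probably become some_input = to_text(request.params.get('form_element')), or simply use the value directly, since the check above shows it is already unicode.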
