Django: uploaded image file gets the wrong filename

Posted 2024-08-19 01:17:09

I don't know whether this is expected behavior, but if I create a project with a single model containing an ImageField and upload a photo with the filename "árvórés", the uploaded file is saved with an incomprehensible filename (ASCII, I presume). As a direct result, the photo can no longer be retrieved from the site.

Is this normal? If so, how can I allow filenames like this?

寻找我们的幸福 2024-08-26 01:17:09

The issue is that you haven't specified how the browser should encode the POST data, so you get whatever encoding the browser guesses it should use, usually ISO-8859-1 rather than UTF-8.
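To make the mismatch concrete, here is a minimal sketch in plain Python 3 (no Django involved) of what happens to "árvórés" when the sender and the receiver disagree on the encoding. The helper name `roundtrip` is hypothetical and exists only for this illustration.

```python
NAME = "árvórés"


def roundtrip(text, sent_as, read_as):
    """Encode text the way the sender does, then decode it the way the receiver does."""
    # errors="replace" substitutes U+FFFD for bytes the receiver cannot decode.
    return text.encode(sent_as).decode(read_as, errors="replace")


# Both sides agree on UTF-8: the name survives intact.
matching = roundtrip(NAME, "utf-8", "utf-8")

# Browser sends ISO-8859-1, server reads UTF-8: accented letters are lost.
latin1_read_as_utf8 = roundtrip(NAME, "iso-8859-1", "utf-8")

# Browser sends UTF-8, server reads ISO-8859-1: each accented letter
# turns into two unrelated characters.
utf8_read_as_latin1 = roundtrip(NAME, "utf-8", "iso-8859-1")

if __name__ == "__main__":
    print(repr(matching))
    print(repr(latin1_read_as_utf8))
    print(repr(utf8_read_as_latin1))
```

Either kind of disagreement produces a stored filename that no longer matches what the user uploaded, which is consistent with the symptom described in the question.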

The HTML 4.01 spec for the FORM element includes the "accept-charset" attribute, which lets you state which encoding the POST data should be sent in:

accept-charset = charset list [CI]

This attribute specifies the list of character encodings for input data that is accepted by the server processing this form. The value is a space- and/or comma-delimited list of charset values. The client must interpret this list as an exclusive-or list, i.e., the server is able to accept any single character encoding per entity received.

The default value for this attribute is the reserved string "UNKNOWN". User agents may interpret this value as the character encoding that was used to transmit the document containing this FORM element.

In other words, if you serve a page encoded in UTF-8, the browser would default to posting requests in UTF-8.

The best fix is to specify the character encoding for all your pages by either including the appropriate encoding in your response headers, or including something like the following in your HTML within the HEAD section:

<META http-equiv="Content-Type" content="text/html; charset=UTF-8">

The HTML 4.01 spec has a section on how to specify which character encoding you are serving.
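As an illustration of declaring the encoding in both places, here is a minimal sketch using only the standard library's WSGI support rather than Django. The names `PAGE`, `upload_page`, and `call_app` are hypothetical; in a Django project the `DEFAULT_CHARSET` setting (UTF-8 by default) already governs the charset in the response header.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical upload page: the META tag and accept-charset both name UTF-8.
PAGE = """<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Upload</title>
</head>
<body>
<form method="post" enctype="multipart/form-data" accept-charset="UTF-8">
<input type="file" name="image">
<input type="submit" value="Upload">
</form>
</body>
</html>
"""


def upload_page(environ, start_response):
    """WSGI app that serves the form and declares UTF-8 in the response header."""
    body = PAGE.encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=UTF-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app):
    """Invoke a WSGI app once with a default test environ; return status, headers, body."""
    environ = {}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


if __name__ == "__main__":
    status, headers, _ = call_app(upload_page)
    print(status, headers["Content-Type"])
```

The response header is the stronger of the two declarations, since the browser sees it before it parses any HTML; the META tag mainly helps when the page is saved or served without headers.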

An alternate but lesser fix is to not specify the character encoding anywhere, and instead decode your filename manually assuming the browser is sending in the default encoding of ISO-8859-1:

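def upload_file(request):
    if request.method == 'POST':
        form = UploadFileForm(request.POST, request.FILES)
        if form.is_valid():
            # cleaned_data is a dict, so look the field up by key.
            # .decode() assumes the name is a Python 2 byte string.
            filename = form.cleaned_data['image'].name.decode('iso-8859-1')
            ...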
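
The snippet above relies on Python 2, where a filename could be a byte string with a `.decode()` method. As a rough Python 3 sketch of the same idea, assuming you have the raw filename bytes available, the hypothetical helper `decode_filename` below tries UTF-8 first and falls back to ISO-8859-1 only when the bytes are not valid UTF-8.

```python
def decode_filename(raw: bytes) -> str:
    """Decode raw filename bytes, preferring UTF-8 and falling back to ISO-8859-1."""
    try:
        # Strict decoding raises UnicodeDecodeError on bytes that are not valid UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a character, so this fallback cannot fail.
        return raw.decode("iso-8859-1")


if __name__ == "__main__":
    print(decode_filename("árvórés".encode("utf-8")))
    print(decode_filename("árvórés".encode("iso-8859-1")))
```

This is a heuristic: ISO-8859-1 bytes that happen to form valid UTF-8 would be decoded as UTF-8, so declaring the page encoding remains the more reliable fix.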