Special characters in Content-Disposition filename
My question is a duplicate of "How to encode the filename parameter of Content-Disposition header in HTTP?"
But since that question was asked a long time ago and there is still no satisfying answer (in my opinion), I would like to ask again.
I develop a C++ CGI application that delivers files that can contain special characters in their names like
"weird # € = { } ; filename.txt"
There seems to be no way to set the HTTP Content-Disposition header so that it works in every browser, such as
- Internet Explorer
- Firefox
- Chrome
- Opera
- Safari
I would be happy with a different solution for every browser.
This is how far I have got:
Internet Explorer (added double quotes and replaced # and ; )
Content-Disposition: attachment; filename="weird %23 € = { } %3B filename.txt"
Firefox (double quotes seem to work. nothing more to do):
Content-Disposition: attachment; filename="weird # € = { } ; filename.txt"
Another working alternative:
Content-Disposition: attachment; filename*=UTF-8''weird%20%23%20%e2%82%ac%20%3D%20%7B%20%7D%20%3B%20filename.txt
Chrome
When using only double quotes, these problems arise:
- = disappears in filenames
- € is replaced by -
But this works:
Content-Disposition: attachment; filename*=UTF-8''weird%20%23%20%e2%82%ac%20%3D%20%7B%20%7D%20%3B%20filename.txt
Opera
Using double quotes or the syntax filename*=UTF-8''... produces the following problems:
- Multiple consecutive spaces in filenames are reduced to one
- { and } disappear: "ab{}cd.txt" -> "abcd.txt"
- Filenames are cut off after a ; : "abc ; def.txt" -> "abc"
EDIT 2: This was because of filename length limitations. This syntax works with Opera:
Content-Disposition: attachment; filename*=UTF-8''weird%20%23%20%e2%82%ac%20%3D%20%7B%20%7D%20%3B%20filename.txt
Safari
€ is replaced by an invisible character (using double quotes)
I found no solution that prevents this small problem.
The suggestion from the other thread (mentioned above) using
Content-Disposition: attachment; filename*=UTF-8''weird%20%23%20%80%20%3D%20%7B%20%7D%20%3B%20filename.txt
didn't work for me. The escape sequences are not translated back, or the browser wants to save the file under the name of my CGI application. That was because my encoding was wrong: I did not encode according to RFC 5987. But Safari isn't using this encoding anyway, so there is no solution for the € character so far.
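A minimal sketch of the percent-encoding step for the filename*=UTF-8'' form, assuming the filename is already available as a UTF-8 byte string. The function name encodeExtValue is hypothetical. As an assumption for simplicity, it leaves only ASCII letters, digits and "-._~" unescaped; RFC 5987 allows a few more characters to appear literally, but escaping them is also valid.

```cpp
#include <cctype>
#include <string>

// Percent-encode a UTF-8 byte string for use after "filename*=UTF-8''".
// Assumption: only ASCII letters, digits and "-._~" stay literal; every
// other byte (including each byte of a multi-byte character) becomes %XX.
std::string encodeExtValue(const std::string& utf8Name) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char byte : utf8Name) {
        const bool literal = (byte < 0x80 && std::isalnum(byte)) ||
                             byte == '-' || byte == '.' ||
                             byte == '_' || byte == '~';
        if (literal) {
            out += static_cast<char>(byte);
        } else {
            out += '%';
            out += hex[byte >> 4];    // high nibble
            out += hex[byte & 0x0F];  // low nibble
        }
    }
    return out;
}
```

With the € given as its UTF-8 bytes E2 82 AC, the test name "weird # € = { } ; filename.txt" comes out as weird%20%23%20%E2%82%AC%20%3D%20%7B%20%7D%20%3B%20filename.txt, which matches the working header above apart from the case of the hex digits.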
BTW: A UTF-8 converter: http://www.rishida.net/tools/conversion/
I used the latest version of every browser for these tests:
- Firefox 7
- Internet Explorer 9
- Chrome 15
- Opera 11.5
- Safari 5.1
PS: I tried all special characters on my keyboard. In this thread I only used the ones that caused trouble.
EDIT:
I also tried a filename with all special characters on my keyboard (that are possible in a filename) and that did not work as it did with the test string above:
Complete Test string:
0 ! § $ % & ( ) = ` ´ { } [ ] ² ³ @ € µ ^ ° ~ + ' # - _ . , ; ü ä ö ß 9.jpg
Encoded Test String:
0%20%21%20%C2%A7%20%24%20%25%20%26%20%28%20%29%20%3D%20%60%20%C2%B4%20%7B%20%7D%20%20%20%20%5B%20%5D%20%C2%B2%20%C2%B3%20%40%20%E2%82%AC%20%C2%B5%20%5E%20%C2%B0%20~%20%2B%20%27%20%23%20-%20_%20.%20%2C%20%3B%20%C3%BC%20%C3%A4%20%C3%B6%20%C3%9F%209.jpg
Using this method:
Content-Disposition: attachment; filename*=UTF-8''0%20%21%20%C2%A7%20%24%20%25%20%26%20%28%20%29%20%3D%20%60%20%C2%B4%20%7B%20%7D%20%20%20%20%5B%20%5D%20%C2%B2%20%C2%B3%20%40%20%E2%82%AC%20%C2%B5%20%5E%20%C2%B0%20~%20%2B%20%27%20%23%20-%20_%20.%20%2C%20%3B%20%C3%BC%20%C3%A4%20%C3%B6%20%C3%9F%209.jpg
I had the following results:
- Firefox works
- Chrome works
- IE: $ % & ( ) = ` ´ { } [ ] ² ³ @ € µ ^ ° ~ + ' # - _ . , ; ü ä ö ß 9.jpg (removed the first 6 characters). EDIT 2: This was because of the browser's filename length limitations. It started to cut off the filename from the start of the string. I didn't go deep into this, but it looks like normal filenames can be about 200 characters long, and filenames with many escape sequences even longer, but less than 250. But that's OK.
- Opera: 0 ! § $ % & ( ) = ` ´ [ ] ² ³ @ € µ ^ ° ~ + ' # - _ . , ; ü ä ö ß 9.jpg (missing some characters as before). EDIT 2: I shortened my test string because I suspected filename length "problems" with Opera as there are with IE and it worked there too.
- Safari doesn't work with that syntax. That was expected.
EDIT 2:
Status so far: the syntax filename*=UTF-8'' followed by the percent-encoded filename works with every browser except Safari. The only character that Safari replaces is the €. I guess I can live with that. Thank you!
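One way to put this status into a single header is to send a plain filename="..." parameter next to filename*, which is the combination RFC 6266 describes. The sketch below is an assumption, not something tested in this thread: it relies on a client that ignores filename* falling back to the plain parameter. The helper names asciiFallback and contentDisposition are hypothetical, and replacing every troublesome character with '_' is just one possible fallback policy.

```cpp
#include <string>

// Hypothetical helper: ASCII-only fallback for the plain filename="..." form.
// Bytes outside [A-Za-z0-9._-] and space become '_'; the bytes of one
// multi-byte UTF-8 character collapse into a single '_'.
std::string asciiFallback(const std::string& utf8Name) {
    std::string out;
    for (unsigned char byte : utf8Name) {
        const bool safe = (byte >= 'A' && byte <= 'Z') ||
                          (byte >= 'a' && byte <= 'z') ||
                          (byte >= '0' && byte <= '9') ||
                          byte == '.' || byte == '_' ||
                          byte == '-' || byte == ' ';
        if (safe) {
            out += static_cast<char>(byte);
        } else if ((byte & 0xC0) != 0x80) {  // skip UTF-8 continuation bytes
            out += '_';
        }
    }
    return out;
}

// Combine the fallback with an already percent-encoded filename* value.
std::string contentDisposition(const std::string& utf8Name,
                               const std::string& encodedName) {
    return "Content-Disposition: attachment; filename=\"" +
           asciiFallback(utf8Name) + "\"; filename*=UTF-8''" + encodedName;
}
```

For the test name from the question, the fallback part becomes "weird _ _ _ _ _ _ filename.txt", so a browser without filename* support would still offer a readable, if lossy, name.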
EDIT 3: Filename length
I noticed some filename length issues:
- Internet Explorer: File names can be up to 147 characters long. If the string contains no escape sequences, that is the length of the saved filename. If it does, the resulting filename is shorter than 147 characters, but by a varying amount: with 2 escape sequences the filename was 5 characters shorter, and with many escape sequences it was only 2 characters shorter. I couldn't find a rule here.
- The other browsers don't seem to have that problem. They save the file if the file system can handle it. For instance, I tried 250 characters, and the browsers either told me to shorten the filename (Chrome) or shortened it themselves to 220 (Opera) or 210 (Firefox) characters. Opera cut off the file extension, though. Safari tried to save the long filename, failed, and wrote "-1" as the filename in the download list.
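Since the browsers shorten long names differently (and Opera dropped the extension), one option is to shorten the name on the server before sending it. The following is only a sketch under assumptions: shortenFilename is a hypothetical helper, the limit is counted in UTF-8 bytes (the browsers above may count differently), and everything from the last '.' is treated as the extension.

```cpp
#include <cstddef>
#include <string>

// Shorten a UTF-8 filename to at most maxBytes bytes, keeping the extension
// and never cutting through a multi-byte character.
std::string shortenFilename(const std::string& name, std::size_t maxBytes) {
    if (name.size() <= maxBytes) return name;

    // Split off the extension, unless it alone would use up the whole budget.
    std::string stem = name;
    std::string ext;
    const std::size_t dot = name.rfind('.');
    if (dot != std::string::npos && dot > 0 && name.size() - dot < maxBytes) {
        ext = name.substr(dot);
        stem = name.substr(0, dot);
    }

    // Bytes of the stem that fit; always smaller than stem.size() here.
    std::size_t keep = maxBytes - ext.size();

    // If the first dropped byte is a UTF-8 continuation byte, the cut would
    // split a character, so step back to the start of that character.
    while (keep > 0 &&
           (static_cast<unsigned char>(stem[keep]) & 0xC0) == 0x80) {
        --keep;
    }
    return stem.substr(0, keep) + ext;
}
```

A call such as shortenFilename(name, 147) would use the limit observed for Internet Explorer 9 above; that number comes from this one test and is not a documented browser constant.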
Comments (1)
Firefox, MSIE (starting with version 9), Opera, Konqueror and Chrome support the encoding defined in RFC 5987; MSIE 8 and Safari do not; support in other browsers is unknown.
Note that in the header from your Safari section (the one containing %80), you got the encoding for the Euro character wrong: %80 is not its UTF-8 encoding. Fixing this should make it work everywhere except Safari (the correct encoding being %e2%82%ac).
Test case at:
http://greenbytes.de/tech/tc2231/#attwithfn2231utf8
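To see where %e2%82%ac comes from, here is a small sketch that turns a Unicode code point into its UTF-8 bytes. The function name toUtf8 is hypothetical, and the sketch assumes a valid code point up to U+10FFFF without checking for surrogates.

```cpp
#include <cstdint>
#include <string>

// Encode one Unicode code point (assumed valid, up to U+10FFFF) as UTF-8.
std::string toUtf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {                 // 1 byte: plain ASCII
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {         // 2 bytes
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {       // 3 bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                         // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

The Euro sign is U+20AC, so toUtf8(0x20AC) yields the three bytes E2 82 AC, and each of them is percent-encoded separately in the header value.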