What are the consequences of null bytes in multipart/form-data?
A third party is sending us a flat file that is supposed to contain exclusively printable ASCII characters. However, we've discovered that there's a string of about 50 0x00 bytes in the middle of the file.
We want to be able to upload the file to our web application, but I've discovered that Django doesn't seem to like the null characters in the multipart/form-data. If I remove the null characters, the upload succeeds. (Sorry, I don't have the stack trace available at the moment, but I will produce one if necessary.)
We can pre-process the file to remove the null characters and/or work with our third party to fix their file generator, but I don't like to leave mysterious problems like this unexplained.
Does this sound like a bug in Django or is there some aspect of multipart/form-data that I don't fully understand? Do I need to set a transfer encoding of some sort so Django doesn't get hung up on the null characters?
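For reference, the pre-processing workaround mentioned above could look roughly like the following. This is only a minimal sketch: the helper names `find_null_runs` and `strip_nulls` and the sample data are made up for illustration, and it assumes the file is small enough to read into memory as bytes.

```python
import re

def find_null_runs(data: bytes) -> list[tuple[int, int]]:
    """Return (offset, length) for every run of consecutive 0x00 bytes."""
    return [(m.start(), m.end() - m.start()) for m in re.finditer(rb"\x00+", data)]

def strip_nulls(data: bytes) -> bytes:
    """Return a copy of the data with every 0x00 byte removed."""
    return data.replace(b"\x00", b"")

# Made-up sample standing in for the third party's flat file.
sample = b"HEADER|2024\n" + b"\x00" * 50 + b"ROW|1\n"

runs = find_null_runs(sample)   # where the nulls are, for the bug report
cleaned = strip_nulls(sample)   # what we would upload instead
```

Reporting the offsets from `find_null_runs` to the third party may also help them locate the fault in their file generator.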
Comments (1)
Nope, no transfer encoding is needed on form-data (and browsers never use one). It's perfectly valid to include a run of 50 null bytes in a multipart/form-data value. Indeed, given that most binary files contain a lot of nulls, that situation should arise as often as not with file uploads!
Which makes me question whether it's really a Django bug, or whether something else is going on. Let's have that stack trace!
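To illustrate the point about null bytes being valid inside a multipart/form-data part, here is a small hand-rolled sketch. It is not Django's parser and not a complete multipart implementation: `build_multipart`, `extract_single_part`, the boundary string, and the field and file names are all hypothetical, and it assumes a single part whose content does not contain the boundary.

```python
def build_multipart(field: bytes, filename: bytes, content: bytes, boundary: bytes) -> bytes:
    """Assemble a one-part multipart/form-data body; content is sent as raw bytes."""
    lines = [
        b"--" + boundary,
        b'Content-Disposition: form-data; name="' + field + b'"; filename="' + filename + b'"',
        b"Content-Type: application/octet-stream",
        b"",
        content,                      # no transfer encoding, nulls included as-is
        b"--" + boundary + b"--",
        b"",
    ]
    return b"\r\n".join(lines)

def extract_single_part(body: bytes, boundary: bytes) -> bytes:
    """Recover the content of the only part by splitting on the delimiter."""
    delimiter = b"--" + boundary
    _, part, _ = body.split(delimiter, 2)
    _headers, _, content = part.partition(b"\r\n\r\n")
    return content[:-2]               # drop the CRLF that precedes the closing delimiter

boundary = b"sketch-boundary-123"     # arbitrary value chosen for this example
payload = b"line one\n" + b"\x00" * 50 + b"line two\n"

body = build_multipart(b"file", b"flat.txt", payload, boundary)
recovered = extract_single_part(body, boundary)
```

In this sketch the 50 null bytes round-trip untouched, because only the boundary delimiter has special meaning in the body.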