Can I reliably figure out the right MIME type to serve untrusted content?
Say I let users upload files to my server and later download them. I'd like to set the MIME type to something other than just application/octet-stream, so that if the browser can open a file directly, it does (say, for images, PDF files, plain text files, etc.). Of course, since the files are uploaded by users, I can't trust the file extension, etc.
Is there a good library for figuring out what mime type goes with an arbitrary blob? Preferably usable from Python :-)
Thanks!
Comments (2)
Try python-magic.
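As a rough sketch of how that might look in practice: python-magic exposes `magic.from_buffer(data, mime=True)`, which guesses a MIME type from the leading bytes of a blob. Everything else below is an assumption for illustration, not something from this thread: the `SAFE_INLINE_TYPES` allowlist, the `download_headers` helper, and the choice to fall back to a plain download whenever the detected type is not on the list.

```python
# Illustrative allowlist (an assumption): types a browser can usually
# display inline without running script from the uploaded file.
SAFE_INLINE_TYPES = {
    "image/png", "image/jpeg", "image/gif", "application/pdf", "text/plain",
}


def detect_mime(data: bytes) -> str:
    """Guess a MIME type from file content using python-magic (libmagic)."""
    import magic  # third-party: `pip install python-magic`, needs libmagic
    return magic.from_buffer(data[:2048], mime=True)


def download_headers(data: bytes, detect=detect_mime) -> dict:
    """Build response headers for an untrusted upload (hypothetical helper)."""
    try:
        mime = detect(data)
    except Exception:
        # Library missing or detection failed: fall back to a generic download.
        mime = "application/octet-stream"
    inline = mime in SAFE_INLINE_TYPES
    return {
        "Content-Type": mime if inline else "application/octet-stream",
        "Content-Disposition": "inline" if inline else "attachment",
        # Ask the browser not to second-guess the declared type.
        "X-Content-Type-Options": "nosniff",
    }
```

The `detect` parameter is there only so the detector can be swapped out (for example in tests). Note that this sketch serves text/plain without a charset, which is exactly the gap the next answer warns about.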
Beware of text files: there is no way to know what encoding they are in, and no reliable way to guess, especially since most text files created on Windows use 8-bit MBCS encodings, which are indistinguishable from one another without language heuristics. You need to know the encoding, not just the MIME type, to set a complete Content-Type that lets a browser display the file correctly. If you want to allow uploading and displaying text, it's much safer to use an HTML text form than a raw file upload.
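To make the ambiguity concrete, here is a small sketch showing the same bytes decoding without error under three different Windows code pages. The `text_content_type` helper is hypothetical, and its rule (only declare a charset when the bytes are valid UTF-8) is an assumption added for illustration, not a general solution to encoding detection.

```python
# The same four bytes are valid in several 8-bit Windows code pages,
# and each one yields a different final character.
raw = b"caf\xe9"
for codec in ("cp1252", "cp1251", "cp1253"):
    print(codec, raw.decode(codec))


def text_content_type(data: bytes):
    """Return a full Content-Type only if the bytes are valid UTF-8.

    Hypothetical helper: returns None when the encoding cannot be
    established this way, so the caller can serve a plain download.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return None
    return "text/plain; charset=utf-8"
```

Nothing in the bytes themselves says which of the three decodings was intended, which is why a charset has to come from somewhere other than the file content.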
Also, note that a file can be multiple file types; for example, self-extracting ZIPs are both valid Windows executables and ZIP files, and can be treated as either.
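A minimal sketch of that situation using only the standard library: it prepends a stand-in stub beginning with the `MZ` signature (the first two bytes of a Windows executable) to an in-memory ZIP archive. The stub here is just padding, not a real executable, and `make_polyglot` is a name made up for this example.

```python
import io
import zipfile


def make_polyglot() -> bytes:
    """Build bytes that start like an EXE and still parse as a ZIP."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("hello.txt", "hi")
    # Stand-in for a self-extractor stub: only the MZ signature plus padding.
    stub = b"MZ" + b"\x00" * 62
    return stub + buf.getvalue()


data = make_polyglot()
looks_like_exe = data.startswith(b"MZ")
# ZIP readers locate the archive from the end of the file, so the
# prepended stub does not stop it from being recognized.
looks_like_zip = zipfile.is_zipfile(io.BytesIO(data))
names = zipfile.ZipFile(io.BytesIO(data)).namelist()
print(looks_like_exe, looks_like_zip, names)
```

A detector that only inspects the first bytes would likely report an executable here, while a ZIP tool would open the same file as an archive, so a single detected type should not be treated as the whole story.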