In Python, how do I get a file's content_type or mime_type?
Possible Duplicate:
How to find the mime type of a file in python?
I'm using an email processing API (sendgrid.com) that posts all incoming emails to a web request handler in my app. The attachments are posted as attachment0=xyz&attachment1=abc along with other email fields like 'to', 'cc', 'subject', etc.
I then store these attachments as files in the BlobStore (with App Engine). To serve these files back to the user, the mime_type/content_type must be specified. As I understand it, it is usually dependent on the file type. But it's not clear to me how to get the file type from the passed strings.
Is there a library that figures out the file type from the byte content of a file?
Just to clarify, there is no filename or file extension. Just the file's byte content.
Comments (1)
If you had saved the filename when the file was uploaded, you could give the mimetypes.guess_type function a shot here. The linked SO question by Alexander is good to read. Unfortunately, that is not your case. If all you have is a binary blob, I'm afraid you have to put in some custom heuristics here. Follow these simple steps:
For example: a ZIP file starts with the two characters PK, a RAR file starts with Rar!, a PDF starts with %PDF, a PNG starts with \x89PNG, and so on.
This would fail to identify some files (such as JPG), but it gives you a good start to build on.
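A minimal sketch of that prefix check might look like the following. The names SIGNATURES and sniff_mime_type are made up for illustration, the MIME strings are my own mapping rather than anything from the question, and the table only covers the four formats listed above.

```python
# Hypothetical signature table: (leading bytes, MIME type to report).
# Only the four formats mentioned above are covered; extend as needed.
SIGNATURES = [
    (b"PK", "application/zip"),
    (b"Rar!", "application/vnd.rar"),
    (b"%PDF", "application/pdf"),
    (b"\x89PNG", "image/png"),
]


def sniff_mime_type(data, default="application/octet-stream"):
    """Guess a MIME type from the first bytes of a blob.

    Returns `default` when no known signature matches.
    """
    for prefix, mime_type in SIGNATURES:
        if data.startswith(prefix):
            return mime_type
    return default
```

Anything that does not match falls back to application/octet-stream, which is a safe generic type to serve. Note that other ZIP-based containers also begin with PK, so a match there only tells you the file is some kind of ZIP archive.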
Alternatively, you could use https://github.com/ahupp/python-magic.
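As a rough sketch of how that could look: python-magic exposes magic.from_buffer, which accepts raw bytes and returns a MIME string when called with mime=True. The wrapper name detect_mime_type and the fallback behaviour are my own assumptions. The package also depends on the libmagic C library, which may not be installable in every hosting environment, so the import is guarded.

```python
try:
    # Third-party package (python-magic); it needs the libmagic C library.
    import magic
except ImportError:
    magic = None


def detect_mime_type(data, default="application/octet-stream"):
    """Return the MIME type libmagic reports for a byte string.

    Falls back to `default` when python-magic (or libmagic) is unavailable.
    """
    if magic is None:
        return default
    return magic.from_buffer(data, mime=True)
```

Passing only the first couple of kilobytes of each attachment is usually enough for detection, since the identifying signatures sit at the start of the file.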