Is there an easy way to determine the type of a file without knowing its file extension?
I have a table with a binary column which stores files of a number of different possible filetypes (PDF, BMP, JPEG, WAV, MP3, DOC, MPEG, AVI etc.), but no columns that store either the name or the type of the original file. Is there any easy way for me to process these rows and determine the type of each file stored in the binary column? Preferably it would be a utility that only reads the file headers, so that I don't have to fully extract each file to determine its type.
Clarification: I know that the approach here involves reading just the beginning of each file. I'm looking for a good resource (aka links) that can do this for me without too much fuss. Thanks.
Also, just C#/.NET on Windows, please. I'm not using Linux and can't use Cygwin (doesn't work on Windows CE, among other reasons).
Comments (7)
The easiest way to do this would be through access to a *nix (or Cygwin) system that has the 'file' command.
You could write a C# application that pipes the first X bytes of each binary column to the file command (using - as the file name, so that it reads from standard input).
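A minimal sketch of that piping idea, assuming a `file` executable is reachable on the PATH (which the asker's environment may well rule out). The class name `FileCommand`, the method `Identify`, and the `fileExe` parameter are made up for illustration; `-b` asks `file` for a brief description and `-` makes it read standard input.

```csharp
using System;
using System.Diagnostics;

static class FileCommand
{
    // Pipes the given header bytes to "file -b -" and returns its one-line description.
    // fileExe is an assumption: the name or full path of a `file` binary on this machine.
    public static string Identify(byte[] header, string fileExe = "file")
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = fileExe,
            Arguments = "-b -",
            RedirectStandardInput = true,
            RedirectStandardOutput = true,
            UseShellExecute = false,   // required for stream redirection
            CreateNoWindow = true
        };

        using (Process process = Process.Start(startInfo))
        {
            // Write the raw bytes, then close stdin so `file` sees end of input.
            process.StandardInput.BaseStream.Write(header, 0, header.Length);
            process.StandardInput.BaseStream.Flush();
            process.StandardInput.Close();

            string description = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            return description.Trim();
        }
    }
}
```

Starting one process per row is slow, so this is probably only reasonable for a one-off pass over the table. A header of a few kilobytes is normally enough for `file` to recognise the common formats.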
Here are a few tools to find the format of a file: a website, Online File Identifier (http://mark0.net/onlinetrid.aspx) by Marco Pontello, and a piece of software called File Analyzer by Vadim Tarasov.
The website has the advantage of not requiring any installation, and is thus less likely to expose you to malware. However, you have to upload your file, which might not be what you want for privacy.
Here is an example with the save file of the game Pampas & Selene: The Maze of Demons Demo: the .sav file is identified as TIM (PlayStation graphics).
This is not a complete answer, but a place to start would be a "magic numbers" library. Such a library examines the first few bytes of a file to determine a "magic number", which is compared against a list of known ones. This is (at least in part) how the file command on Linux systems works.
Someone else asked a similar question and posted the code used to do exactly this. You should be able to take what is posted there and modify it slightly so that it pulls from your database.
https://stackoverflow.com/questions/58510
In addition, it looks like someone has written a library based on magic numbers to do this. However, the site appears to require registration, and some form of alternate access, in order to download the library. The documentation is available for free without registration, which may be helpful.
http://software.topcoder.com/catalog/c_component.jsp?comp=13249160&ver=2
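For the "pulls from your database" part, here is a rough sketch of reading only the first bytes of a binary column instead of materialising the whole value. It uses only the provider-neutral `System.Data` interfaces; `HeaderReader`, `ReadHeader`, and `ForEachHeader` are hypothetical names, and it assumes the binary column is the first column returned by the query.

```csharp
using System;
using System.Data;

static class HeaderReader
{
    // Copies at most maxBytes from the start of the binary column at `ordinal`
    // in the current row. Returns an empty array when the column is NULL.
    public static byte[] ReadHeader(IDataRecord row, int ordinal, int maxBytes)
    {
        if (row.IsDBNull(ordinal)) return new byte[0];

        var buffer = new byte[maxBytes];
        long read = row.GetBytes(ordinal, 0, buffer, 0, maxBytes);
        if (read < maxBytes) Array.Resize(ref buffer, (int)read);
        return buffer;
    }

    // Runs `sql` and hands the header of column 0 of every row to `handle`.
    public static void ForEachHeader(IDbConnection connection, string sql,
                                     int maxBytes, Action<byte[]> handle)
    {
        using (IDbCommand command = connection.CreateCommand())
        {
            command.CommandText = sql;
            // SequentialAccess asks the provider to stream large values
            // rather than buffer each row completely.
            using (IDataReader reader = command.ExecuteReader(CommandBehavior.SequentialAccess))
            {
                while (reader.Read())
                    handle(ReadHeader(reader, 0, maxBytes));
            }
        }
    }
}
```

Usage would look something like `HeaderReader.ForEachHeader(connection, "SELECT FileData FROM Files", 16, header => { /* inspect header */ });`, where the table and column names are placeholders for your own schema. How much data is actually transferred still depends on the database provider.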
A lot of file types have well-defined headers at the start of the file. You could check the first few bytes to see how the file begins.
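As a rough illustration of that check, here is a small sketch that compares the first bytes against a handful of well-known signatures for some of the formats named in the question. `MagicNumbers`, `Guess`, and `HasBytes` are names invented for this example, and the list is deliberately incomplete: an MP3 without an ID3 tag, for instance, starts with a frame sync pattern that is not covered here.

```csharp
using System;

static class MagicNumbers
{
    // True when `data` contains `signature` starting at `offset`.
    static bool HasBytes(byte[] data, int offset, params byte[] signature)
    {
        if (data == null || data.Length < offset + signature.Length) return false;
        for (int i = 0; i < signature.Length; i++)
            if (data[offset + i] != signature[i]) return false;
        return true;
    }

    // Returns a best guess based on the first 12 or so bytes of a file.
    public static string Guess(byte[] h)
    {
        if (HasBytes(h, 0, 0x25, 0x50, 0x44, 0x46)) return "PDF";             // "%PDF"
        if (HasBytes(h, 0, 0xFF, 0xD8, 0xFF)) return "JPEG";
        if (HasBytes(h, 0, 0x52, 0x49, 0x46, 0x46))                           // "RIFF"
        {
            if (HasBytes(h, 8, 0x57, 0x41, 0x56, 0x45)) return "WAV";         // "WAVE"
            if (HasBytes(h, 8, 0x41, 0x56, 0x49, 0x20)) return "AVI";         // "AVI "
            return "RIFF (unknown subtype)";
        }
        if (HasBytes(h, 0, 0x49, 0x44, 0x33)) return "MP3 (ID3 tag)";         // "ID3"
        if (HasBytes(h, 0, 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1))
            return "OLE2 compound file (e.g. legacy DOC)";
        if (HasBytes(h, 0, 0x00, 0x00, 0x01, 0xBA)) return "MPEG program stream";
        // Checked last: a two-byte signature is the most likely to match by accident.
        if (HasBytes(h, 0, 0x42, 0x4D)) return "BMP";                         // "BM"
        return "unknown";
    }
}
```

Note that the OLE2 signature only identifies the container, so a legacy DOC, XLS, or PPT file all look the same at this level; telling them apart needs a look further into the file.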
The easiest way I know is to use the file command, which is also available on Windows with Cygwin.
You need to use some P/Invoke interop code to call the SHGetFileInfo function from the Win32 API. This article may also help.