Is there an easy way to determine the type of a file without knowing its file extension?

Posted on 2024-07-10 02:01:13

I have a table with a binary column that stores files of a number of different possible file types (PDF, BMP, JPEG, WAV, MP3, DOC, MPEG, AVI, etc.), but no columns that store either the name or the type of the original file. Is there an easy way for me to process these rows and determine the type of each file stored in the binary column? Preferably it would be a utility that only reads the file headers, so that I don't have to fully extract each file to determine its type.

Clarification: I know that the approach here involves reading just the beginning of each file. I'm looking for a good resource (i.e. links) that can do this for me without too much fuss. Thanks.

Also, just C#/.NET on Windows, please. I'm not using Linux and can't use Cygwin (it doesn't work on Windows CE, among other reasons).


Comments (7)

茶色山野 2024-07-17 02:01:13

The easiest way to do this would be through access to a *nix (or Cygwin) system that has the 'file' command:

$ file visitors.*
visitors.html: HTML document text
visitors.png:  PNG image data, 5360 x 2819, 8-bit colormap, non-interlaced

You could write a C# application that pipes the first X bytes of each binary column to the file command (using - as the file name, so that it reads from standard input).
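The piping idea above can be sketched roughly as follows. This is only an illustration, assuming a `file` executable is installed and reachable at the path you pass in; the class and method names (`FileCommandRunner`, `Describe`) are made up for this example. It relies on two documented behaviours of `file`: `-` means "read from standard input" and `-b` suppresses the file name in the output.

```csharp
using System;
using System.Diagnostics;

public static class FileCommandRunner
{
    // Sends the given bytes to the external `file` tool on stdin and
    // returns its one-line description. `fileExePath` is an assumption:
    // pass wherever `file` lives on your machine.
    public static string Describe(byte[] header, string fileExePath)
    {
        ProcessStartInfo psi = new ProcessStartInfo();
        psi.FileName = fileExePath;
        psi.Arguments = "-b -";              // brief output, read from stdin
        psi.UseShellExecute = false;         // required for stream redirection
        psi.RedirectStandardInput = true;
        psi.RedirectStandardOutput = true;
        psi.CreateNoWindow = true;

        using (Process process = Process.Start(psi))
        {
            // Write the raw bytes, then close stdin so `file` sees end of input.
            process.StandardInput.BaseStream.Write(header, 0, header.Length);
            process.StandardInput.Close();

            string output = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            return output.Trim();
        }
    }
}
```

A few hundred bytes per row is usually enough for `file` to make a decision, so the header passed in can stay small. Note that this relies on desktop process redirection, so it does not address the Windows CE constraint in the question.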

何其悲哀 2024-07-17 02:01:13

Here are a few tools that identify the format of a file:

  1. Online File Identifier, a website by Marco Pontello: http://mark0.net/onlinetrid.aspx

  2. File Analyzer, a program by Vadim Tarasov.

The website has the advantage of requiring no installation, so you avoid the risk of installing malware. However, you have to upload your file, which may not be acceptable if the data is private.

Here is an example with the save file of the game Pampas & Selene: The Maze of Demons Demo.

Results obtained with the website: the .sav file is identified as TIM (PlayStation graphics).

樱桃奶球 2024-07-17 02:01:13

This is not a complete answer, but a place to start would be a "magic numbers" library. Such a library examines the first few bytes of a file to find its "magic number" and compares it against a list of known signatures. This is (at least in part) how the file command on Linux systems works.
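As a rough illustration of the magic-number idea, here is a minimal sketch in C#. It is not a library and not how `file` is implemented; the class name `MagicNumberSniffer` and the labels are invented for this example, and the signature table covers only a handful of the formats mentioned in the question. It deliberately avoids LINQ and newer language features in case it has to run on an older runtime.

```csharp
using System;

public static class MagicNumberSniffer
{
    // One known signature: the bytes expected at a given offset.
    private sealed class Signature
    {
        public readonly int Offset;
        public readonly byte[] Bytes;
        public readonly string Label;

        public Signature(int offset, byte[] bytes, string label)
        {
            Offset = offset;
            Bytes = bytes;
            Label = label;
        }
    }

    private static readonly byte[] Riff = { 0x52, 0x49, 0x46, 0x46 }; // "RIFF"
    private static readonly byte[] Wave = { 0x57, 0x41, 0x56, 0x45 }; // "WAVE"
    private static readonly byte[] Avi  = { 0x41, 0x56, 0x49, 0x20 }; // "AVI "

    // Illustrative subset only. Longer signatures come before short ones
    // such as "BM", which is weak evidence on its own.
    private static readonly Signature[] Signatures =
    {
        new Signature(0, new byte[] { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A }, "PNG"),
        new Signature(0, new byte[] { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 }, "OLE2 compound file (e.g. legacy DOC)"),
        new Signature(0, new byte[] { 0x25, 0x50, 0x44, 0x46 }, "PDF"),           // "%PDF"
        new Signature(0, new byte[] { 0x00, 0x00, 0x01, 0xBA }, "MPEG program stream"),
        new Signature(0, new byte[] { 0xFF, 0xD8, 0xFF }, "JPEG"),
        new Signature(0, new byte[] { 0x49, 0x44, 0x33 }, "MP3 (ID3v2 tag)"),     // "ID3"
        new Signature(0, new byte[] { 0x42, 0x4D }, "BMP")                        // "BM"
    };

    // Returns a label for the first matching signature, or "unknown".
    public static string Identify(byte[] header)
    {
        if (header == null) return "unknown";

        // RIFF is a container: the subtype sits at offset 8.
        if (Matches(header, 0, Riff))
        {
            if (Matches(header, 8, Wave)) return "WAV";
            if (Matches(header, 8, Avi)) return "AVI";
            return "RIFF container (unrecognised subtype)";
        }

        foreach (Signature s in Signatures)
        {
            if (Matches(header, s.Offset, s.Bytes)) return s.Label;
        }
        return "unknown";
    }

    private static bool Matches(byte[] data, int offset, byte[] expected)
    {
        if (data.Length < offset + expected.Length) return false;
        for (int i = 0; i < expected.Length; i++)
        {
            if (data[offset + i] != expected[i]) return false;
        }
        return true;
    }
}
```

Twelve bytes are enough for every entry in this table. Some formats need more care than a fixed prefix allows: an MP3 without an ID3 tag, for instance, starts directly with an audio frame, which this sketch reports as "unknown".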

庆幸我还是我 2024-07-17 02:01:13

Someone else asked a similar question and posted the code used to do exactly this. You should be able to take what is posted there and modify it slightly so that it pulls from your database.

https://stackoverflow.com/questions/58510

In addition, it looks like someone has written a library based on magic numbers to do this. However, the site appears to require registration, plus some form of additional access, in order to download the library. The documentation is available for free without registration, which may be helpful.

http://software.topcoder.com/catalog/c_component.jsp?comp=13249160&ver=2

尐籹人 2024-07-17 02:01:13

A lot of file types begin with a well-defined header. You could read the first few bytes to see how the file begins.
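Reading only the first few bytes of a binary column, rather than the whole value, might look like the sketch below. It uses the standard `IDataRecord.GetBytes` method from `System.Data`; the helper name `BlobHeaderReader.ReadHeader` is invented for this example, and which column ordinal to pass depends on your own query.

```csharp
using System;
using System.Data;

public static class BlobHeaderReader
{
    // Copies at most maxBytes from the start of a binary column
    // on the current row. Returns fewer bytes if the value is shorter,
    // and an empty array if the column is NULL.
    public static byte[] ReadHeader(IDataRecord record, int ordinal, int maxBytes)
    {
        if (record.IsDBNull(ordinal)) return new byte[0];

        byte[] buffer = new byte[maxBytes];
        // fieldOffset 0 = start of the value; the return value is the
        // number of bytes actually copied into the buffer.
        int read = (int)record.GetBytes(ordinal, 0, buffer, 0, maxBytes);

        if (read == maxBytes) return buffer;

        byte[] trimmed = new byte[read];
        Array.Copy(buffer, trimmed, read);
        return trimmed;
    }
}
```

Whether this avoids transferring the full value depends on the data provider. With SqlClient, for example, opening the reader with `CommandBehavior.SequentialAccess` lets the column be streamed instead of buffered. On SQL Server you can also select only a prefix in the query itself, since `SUBSTRING` works on binary columns.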

悸初 2024-07-17 02:01:13

The easiest way I know is to use the file command, which is also available on Windows through Cygwin.

魔法少女 2024-07-17 02:01:13

You need to use some P/Invoke interop code to call the SHGetFileInfo function from the Win32 API. This article may also help.
