How do I read files directly from the disk?
One can, of course, use fopen or any of the many other APIs available on the Mac to read a file, but what I need to do is open and read every file on the disk, and do so as efficiently as possible.
So, my thought was to use /dev/rdisk* (?) or /dev/(?) and start with the files at the beginning of the device. I would do my best to read the files in the order they appear on the disk, minimize the amount of seeking across the device (since files may be fragmented), and read large blocks of data into RAM, where they can be processed very quickly.
So, my primary question is: when reading the data from the device directly, how can I determine exactly which data belongs to which file?
I assume I could start by reading a catalog of the files, and that there would be a way to determine the start and stop locations of files or file fragments on the disk, but I am not sure where to find documentation on how to obtain that information.
I am running Mac OS X 10.6.x, and one can assume a standard setup for the drive. I assume the same information would also apply to a standard, read-only, uncompressed .dmg created by Disk Utility.
Any information on this topic or articles to read would be of interest.
I don't imagine what I want to do is particularly difficult once the format and layout of the files on disk are understood.
Thank you.
As mentioned in the comments, you need to look at the file system format. However, reading the raw disk sequentially has two drawbacks: (1) subsequent blocks are not guaranteed to belong to the same file, so you may have to seek anyway, which erodes the advantage of reading directly from /dev/device; and (2) if your disk is only 50% full, you may still end up reading 100% of it, because you will read the unallocated space as well as the space allocated to files. Reading directly from /dev/device may therefore be inefficient as well.
That said, fsck and similar tools do read the disk this way, but they do so selectively, based on the errors they are looking for when repairing a file system.
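To make the two drawbacks concrete, here is a minimal Python sketch of a toy extent map. Everything in it is an assumption for illustration: the `Extent` type, the file names, the block numbers, and the 100-block volume are all made up. On a real volume (likely HFS+ on a default Mac OS X 10.6 install) this information would have to be parsed from the file system's own catalog and extent structures, which this sketch does not attempt.

```python
from typing import NamedTuple


class Extent(NamedTuple):
    file: str    # hypothetical file name
    start: int   # first allocation block of this extent
    length: int  # number of contiguous allocation blocks


# Hypothetical extent map for a 100-block volume (illustrative data only).
TOTAL_BLOCKS = 100
EXTENTS = [
    Extent("a.txt", 10, 5),
    Extent("b.bin", 15, 20),
    Extent("a.txt", 60, 5),   # second fragment of a.txt
    Extent("c.log", 35, 10),
]


def raw_scan_waste(extents, total_blocks):
    """Fraction of a full sequential device read spent on unallocated blocks."""
    allocated = sum(e.length for e in extents)
    return (total_blocks - allocated) / total_blocks


def disk_order(extents):
    """Extents sorted by on-disk position, the order a raw scan meets them."""
    return sorted(extents, key=lambda e: e.start)


def fragmented_files(extents):
    """Names of files whose data is split across more than one extent."""
    counts = {}
    for e in extents:
        counts[e.file] = counts.get(e.file, 0) + 1
    return sorted(name for name, n in counts.items() if n > 1)


if __name__ == "__main__":
    print(raw_scan_waste(EXTENTS, TOTAL_BLOCKS))
    print([e.file for e in disk_order(EXTENTS)])
    print(fragmented_files(EXTENTS))
```

In this toy map, a raw scan meets the first fragment of `a.txt` long before the second, so the reader must either buffer partial files or seek back later (drawback 1), and most of the scan covers blocks that belong to no file at all (drawback 2).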