Reading files in device/inode order?
I'm interested in an efficient way to read a large number of files from disk. If I sort the files by device and then by inode before reading them, will I get a speed improvement over reading them in their natural order?
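For concreteness, the ordering being asked about could be sketched as below. This is only an illustration of the idea in the question, not a claim that it helps. The function names `sort_by_device_and_inode` and `read_all` are made up for this example.

```python
import os


def sort_by_device_and_inode(paths):
    """Return the paths ordered by (st_dev, st_ino)."""
    def key(path):
        st = os.stat(path)
        return (st.st_dev, st.st_ino)
    return sorted(paths, key=key)


def read_all(paths):
    """Read every file in device/inode order; return the total bytes read."""
    total = 0
    for path in sort_by_device_and_inode(paths):
        with open(path, "rb") as f:
            total += len(f.read())
    return total
```

Whether this order is any faster than the natural one depends on how closely inode numbers track on-disk placement for the filesystem in use, so it has to be measured.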
Comments (3)
There are vast speed improvements to be had from reading files in physical order from rotating storage. Operating system I/O scheduling mechanisms only do any real work if there are several processes or threads contending for I/O, because they have no information about what files you plan to read in the future. Hence, other than simple read-ahead, they usually don't help you at all.
Furthermore, Linux worsens your access patterns during directory scans by returning directory entries to user space in hash table order rather than physical order. Luckily, Linux also provides system calls to determine the physical location of a file, and whether or not a file is stored on a rotational device, so you can recover some of the losses. See for example this patch I submitted to dpkg a few years ago:
http://lists.debian.org/debian-dpkg/2009/11/msg00002.html
This patch does not incorporate a test for rotational devices, because this feature was not added to Linux until 2012:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ef00f59c95fe6e002e7c6e3663cdea65e253f4cc
I also used to run a patched version of mutt that would scan Maildirs in physical order, usually giving a 5x-10x speed improvement.
Note that inodes are small, heavily prefetched and cached, so opening files to get their physical location before reading is well worth the cost. Common tools like tar, rsync, cp and PostgreSQL do not use these techniques, and that makes them unnecessarily slow.
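As a rough sketch of the approach described in this answer (not the code from the dpkg patch), the following asks Linux for the device offset of each file's first extent through the FIEMAP ioctl and sorts by it. It assumes Linux; the ioctl number and struct layouts are written out by hand from `<linux/fs.h>` and `<linux/fiemap.h>`, and the function names are made up for this example. Filesystems that do not support FIEMAP fall back to inode order.

```python
import fcntl
import os
import struct

# _IOWR('f', 11, struct fiemap), from <linux/fs.h> (assumed Linux-only).
FS_IOC_FIEMAP = 0xC020660B
# struct fiemap header: fm_start, fm_length, fm_flags,
# fm_mapped_extents, fm_extent_count, fm_reserved.
FIEMAP_HEADER = struct.Struct("=QQLLLL")
# struct fiemap_extent: fe_logical, fe_physical, fe_length,
# 2 x reserved64, fe_flags, 3 x reserved32.
FIEMAP_EXTENT = struct.Struct("=QQQQQLLLL")


def first_physical_offset(path):
    """Return the device byte offset of the file's first extent, or None."""
    buf = bytearray(FIEMAP_HEADER.size + FIEMAP_EXTENT.size)
    # Map the whole file, but leave room for only one extent in the reply.
    FIEMAP_HEADER.pack_into(buf, 0, 0, 0xFFFFFFFFFFFFFFFF, 0, 0, 1, 0)
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return None
    try:
        fcntl.ioctl(fd, FS_IOC_FIEMAP, buf)  # the kernel fills buf in place
    except OSError:
        return None  # e.g. the filesystem does not implement FIEMAP
    finally:
        os.close(fd)
    mapped_extents = FIEMAP_HEADER.unpack_from(buf, 0)[3]
    if mapped_extents == 0:
        return None  # empty or fully sparse file
    return FIEMAP_EXTENT.unpack_from(buf, FIEMAP_HEADER.size)[1]


def sort_by_physical_location(paths):
    """Order paths by device, then first-extent offset, then inode."""
    def key(path):
        st = os.stat(path)
        offset = first_physical_offset(path)
        # Files with no known offset sort last within their device.
        return (st.st_dev, offset is None, offset or 0, st.st_ino)
    return sorted(paths, key=key)
```

Only the first extent is considered, so a heavily fragmented file is placed by where it starts. Recently written files whose blocks have not been allocated yet may also report an offset that is not meaningful.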
Back in the 1970s I proposed to our computer center that disk reads and writes would be faster overall if they ordered the queue of disk requests so as to minimize seek time. The computer center told me that, according to their own experiments and information from IBM, many studies had been made of several techniques, and that the overall throughput of jobs (not just a single job) was best when disk reads and writes were done in first-come, first-served order. This was an IBM batch system.
In general, optimisation techniques for file access are too tied to the architecture of your storage subsystem for them to be something as simple as a sorting algorithm.
1) You can effectively multiply the read data rate if your files are spread across multiple physical drives (not just partitions) and you read two or more files in parallel, each from a different drive. This is probably the only method that is easy to implement.
2) Sorting the files by name or inode number does not really change anything in the general case. What you'd want is to sort the files by the physical location of their blocks on the disk, so that they can be read with minimal seeking. There are quite a few obstacles however:
Most filesystems do not provide such information to userspace applications, unless it's for debugging reasons.
The blocks of a single file can themselves be spread all over the disk, especially on a mostly full filesystem. In that case there is no way to read multiple files sequentially without seeking back and forth.
You are assuming that your process is the only one accessing the storage subsystem. Once there is at least someone else doing the same, every optimisation you come up with goes out of the window.
You are trying to be smarter than the operating system and its own caching and I/O scheduling mechanisms. It's very likely that by trying to second-guess the kernel, i.e. the only one that really knows your system and your usage patterns, you will make things worse.
Don't you think that PostgreSQL or Oracle, for example, would have used a similar technique if they could? When the DB is installed on a proper filesystem they let the kernel do its thing and don't try to second-guess its decisions. Only when the DB is on a raw device do the specialised optimisation algorithms that take physical blocks into account come into play.
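Point 1) above could be sketched roughly as follows: group the files by device and read each group in its own thread. One loud assumption: `st_dev` identifies a filesystem, not a physical drive, so two partitions on the same disk would end up in separate groups. Mapping filesystems to drives is left out here. The function names are made up for this example.

```python
import os
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_device(paths):
    """Group paths by st_dev (a filesystem ID, not necessarily a drive)."""
    groups = defaultdict(list)
    for path in paths:
        groups[os.stat(path).st_dev].append(path)
    return groups


def read_group(paths):
    """Read the given files one after another; return the bytes read."""
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            total += len(f.read())
    return total


def read_parallel_by_device(paths):
    """Read files with one worker thread per device; return total bytes."""
    groups = group_by_device(paths)
    if not groups:
        return 0
    with ThreadPoolExecutor(max_workers=len(groups)) as pool:
        return sum(pool.map(read_group, groups.values()))
```

Within each device the files are still read sequentially, so the threads do not compete for the same disk head.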
You should also take the specific properties of your storage devices into account. Modern SSDs, for example, make traditional seek-time optimisations obsolete.
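One way to take the device type into account on Linux is to look at the `queue/rotational` attribute in sysfs before deciding whether any seek-oriented ordering is worth the effort. The sketch below assumes a Linux sysfs layout under `/sys/dev/block` and returns `None` when it cannot tell, for example on network, virtual or stacked filesystems. The function name `is_rotational` is made up for this example.

```python
import os


def is_rotational(path):
    """Return True/False for the device backing path, or None if unknown."""
    st = os.stat(path)
    dev = "%d:%d" % (os.major(st.st_dev), os.minor(st.st_dev))
    sys_dir = os.path.realpath(os.path.join("/sys/dev/block", dev))
    # A partition has no queue/ directory of its own; its parent disk does.
    for candidate in (sys_dir, os.path.dirname(sys_dir)):
        flag = os.path.join(candidate, "queue", "rotational")
        try:
            with open(flag) as f:
                return f.read().strip() == "1"
        except OSError:
            continue
    return None
```

A caller might sort files by physical location only when this returns `True`, and read them in whatever order is convenient otherwise.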