Efficiently traversing a directory tree with opendir(), readdir() and closedir()

Posted on 2024-08-22 16:00:06

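The C routines opendir(), readdir() and closedir() give me a way to traverse a directory structure. However, the dirent structures returned by readdir() do not seem to give me a useful way to obtain the DIR pointers I need in order to recurse into a directory's subdirectories.

Of course, they give me the names of the files, so I could either append each name to the directory path and then stat() and opendir() it, or I could change the process's current working directory with chdir() and roll it back with chdir("..").

The problem with the first approach is that if the directory path is long enough, the cost of passing a string containing it to opendir() will outweigh the cost of opening the directory itself. To put it more theoretically, the complexity could grow beyond linear time (linear in the total character count of the (relative) filenames in the directory tree).

The second approach has a problem too. Since each process has a single current working directory, all but one thread would have to block in a multithreaded application. Also, I don't know whether the current working directory is merely a convenience (i.e., relative paths are simply appended to it before a filesystem query). If so, this approach would be inefficient as well.

I am open to alternatives to these functions. So how can one traverse a UNIX directory tree efficiently (in time linear in the total character count of the files under it)?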


平生欢 2024-08-29 16:00:06

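Have you tried ftw(), a.k.a. File Tree Walk?

A snippet from man 3 ftw:

int ftw(const char *dir, int (*fn)(const char *file, const struct stat *sb, int flag), int nopenfd);

ftw() walks through the directory tree starting from the indicated directory dir. For each entry found in the tree, it calls fn() with the full pathname of the entry, a pointer to the stat(2) structure for the entry, and an int flag.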
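To make the callback contract concrete, here is a minimal sketch that counts the files and directories under a root with ftw(). The names count_entry, count_tree, n_files and n_dirs are made up for this illustration, and the nopenfd value of 16 is an arbitrary choice. Note that the man page describes ftw() as obsolete in POSIX.1-2008 in favour of nftw(), so treat this as a sketch rather than a recommendation.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical counters for this illustration. ftw() has no user-data
 * argument, so the callback has to report through file-scope state. */
static long n_files;
static long n_dirs;

/* Called by ftw() once for every entry it finds, including the root. */
static int count_entry(const char *path, const struct stat *sb, int flag)
{
    (void)sb;                      /* stat data is not needed here */
    if (flag == FTW_D) {
        n_dirs++;
    } else if (flag == FTW_F) {
        n_files++;
    } else {
        /* FTW_DNR (unreadable directory), FTW_NS (stat failed), ... */
        fprintf(stderr, "skipping %s\n", path);
    }
    return 0;                      /* a non-zero return stops the walk */
}

/* Walk the tree rooted at `root`. Returns 0 on success, -1 on error. */
int count_tree(const char *root, long *files, long *dirs)
{
    assert(root != NULL && files != NULL && dirs != NULL);
    n_files = 0;
    n_dirs = 0;
    if (ftw(root, count_entry, 16) == -1) {
        perror("ftw");
        return -1;
    }
    *files = n_files;
    *dirs = n_dirs;
    return 0;
}
```

Calling count_tree(".", &files, &dirs) from main() walks the current directory. Because the counters are file-scope variables, this sketch is not safe to call from several threads at once.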


智商已欠费 2024-08-29 16:00:06

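You seem to be missing one basic point: directory traversal involves reading data from disk. Even if that data is in the cache, you still end up going through a fair amount of code to get it from the cache into your process. Paths are also generally short; anything over a couple of hundred bytes is unusual. Together, these mean you can quite reasonably build strings for all the paths you need without any real problem. The time spent building the strings is still minor compared to the time spent reading data from disk. That means you can normally ignore the time spent on string manipulation and work exclusively on optimizing disk usage.

My own experience has been that for most directory traversal a breadth-first search is usually preferable: as you traverse the current directory, put the full paths of all subdirectories into something like a priority queue. When you finish traversing the current directory, pull the first item from the queue and traverse it, continuing until the queue is empty. This generally improves cache locality, so it reduces the time spent reading the disk. Depending on the system (disk speed vs. CPU speed, total memory available, etc.), it is nearly always at least as fast as a depth-first traversal, and can easily be up to twice as fast (or so).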
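The breadth-first scheme described above could look roughly like the sketch below. It uses a plain FIFO linked list rather than a priority queue, which is a simplification on my part, and the names struct node, make_node, walk_bfs and print_entry are hypothetical. It also uses lstat() so that symbolic links are reported but not followed, which is an assumption about the desired behaviour.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical queue node holding the full path of one pending directory. */
struct node {
    struct node *next;
    char path[];                       /* flexible array member */
};

/* Build a node for "dir/name", or for "dir" alone when name is NULL. */
static struct node *make_node(const char *dir, const char *name)
{
    size_t len = strlen(dir) + (name ? strlen(name) + 1 : 0) + 1;
    struct node *n = malloc(sizeof *n + len);
    if (!n)
        return NULL;
    n->next = NULL;
    if (name)
        snprintf(n->path, len, "%s/%s", dir, name);
    else
        snprintf(n->path, len, "%s", dir);
    return n;
}

/* Example visitor: print each path, with a trailing '/' on directories. */
void print_entry(const char *path, int is_dir)
{
    printf("%s%s\n", path, is_dir ? "/" : "");
}

/* Visit everything below `root` breadth-first. `visit` may be NULL.
 * Returns the number of entries seen, or -1 if memory runs out. */
long walk_bfs(const char *root, void (*visit)(const char *path, int is_dir))
{
    assert(root != NULL);
    struct node *head = make_node(root, NULL);
    struct node *tail = head;
    long count = 0;
    if (!head)
        return -1;

    while (head) {
        DIR *d = opendir(head->path);
        if (!d) {
            perror(head->path);
        } else {
            struct dirent *ent;
            while ((ent = readdir(d)) != NULL) {
                if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
                    continue;
                struct node *n = make_node(head->path, ent->d_name);
                if (!n) {
                    closedir(d);
                    goto fail;
                }
                struct stat sb;
                int is_dir = lstat(n->path, &sb) == 0 && S_ISDIR(sb.st_mode);
                count++;
                if (visit)
                    visit(n->path, is_dir);
                if (is_dir) {          /* enqueue for a later pass */
                    tail->next = n;
                    tail = n;
                } else {
                    free(n);
                }
            }
            closedir(d);
        }
        struct node *done = head;      /* dequeue the finished directory */
        head = head->next;
        free(done);
    }
    return count;

fail:
    while (head) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
    return -1;
}
```

A call such as walk_bfs(".", print_entry) lists the current directory level by level. Only one directory stream is open at any time, at the cost of keeping the pending paths in memory.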


楠木可依 2024-08-29 16:00:06

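The way to use opendir/readdir/closedir is to make the function recursive! Have a look at the code snippet on Dreamincode.net.

Hope this helps.

Edit: Thanks, R.Sahu. The link has expired, but I retrieved it via the Wayback archive and took the liberty of adding it to a gist. Please remember to check the license accordingly and credit the original author of the source! :)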
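Since the referenced snippet is not reproduced here, the following is a minimal sketch of what a recursive opendir/readdir/closedir walk with path building might look like. It is not the Dreamincode code. The function name list_tree is made up, PATH_MAX is assumed to be defined by limits.h (as it is on Linux), and lstat() is used so that symbolic links are not followed.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Recursively print everything below `path`, indented by `level` spaces.
 * Returns the number of entries found below `path`. */
long list_tree(const char *path, int level)
{
    assert(path != NULL);
    DIR *dir = opendir(path);
    if (!dir) {
        perror(path);
        return 0;
    }

    long count = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
            continue;

        /* Build the child's path by appending its name to the parent's. */
        char child[PATH_MAX];
        int n = snprintf(child, sizeof child, "%s/%s", path, ent->d_name);
        if (n < 0 || (size_t)n >= sizeof child) {
            fprintf(stderr, "path too long: %s/%s\n", path, ent->d_name);
            continue;
        }

        struct stat sb;
        if (lstat(child, &sb) == -1) {
            perror(child);
            continue;
        }

        count++;
        if (S_ISDIR(sb.st_mode)) {
            printf("%*s%s/\n", level, "", ent->d_name);
            count += list_tree(child, level + 1);   /* recurse */
        } else {
            printf("%*s%s\n", level, "", ent->d_name);
        }
    }
    closedir(dir);
    return count;
}
```

Calling list_tree(".", 0) prints the tree under the current directory. Each level of recursion keeps one directory stream open and one PATH_MAX buffer on the stack, so very deep trees can run into the open-file limit.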


娇女薄笑 2024-08-29 16:00:06

Instead of opendir(), you can use a combination of openat(), dirfd() and fdopendir() and construct a recursive function to walk a directory tree:

#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

void dir_recurse (DIR *parent, int level) {
    struct dirent *ent;
    if (!parent) {
        return;
    }
    /* The descriptor behind the stream does not change, so fetch it once. */
    int parent_fd = dirfd(parent);
    if (parent_fd < 0) {
        perror("dirfd");
        return;
    }
    while ((ent = readdir(parent)) != NULL) {
        if ((strcmp(ent->d_name, ".") == 0) ||
            (strcmp(ent->d_name, "..") == 0)) {
            continue;
        }

        /* Open the entry relative to its parent; no path string is built. */
        int fd = openat(parent_fd, ent->d_name, O_RDONLY | O_DIRECTORY);
        if (fd != -1) { /* Directory */
            printf("%*s%s/\n", level, "", ent->d_name);
            DIR *child = fdopendir(fd);
            if (child) {
                dir_recurse(child, level + 1);
                closedir(child); /* also closes fd */
            } else {
                perror("fdopendir");
                close(fd); /* fd is still ours when fdopendir() fails */
            }
        } else if (errno == ENOTDIR) { /* Not a directory */
            printf("%*s%s\n", level, "", ent->d_name);
        } else {
            perror("openat");
        }
    }
}

int main (void) {
    DIR *root = opendir("..");
    if (root) {
        dir_recurse(root, 0);
        closedir(root);
    } else {
        perror("opendir");
    }

    return 0;
}

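Here readdir() is still used to get the next directory entry. If the next entry is a directory, we find the parent directory's fd with dirfd() and pass it, along with the child directory's name, to openat(). The resulting fd refers to the child directory. It is passed to fdopendir(), which returns a DIR * pointer for the child directory; that pointer can then be passed to our dir_recurse(), where it is again valid for use with readdir() calls.

This program recurses over the whole directory tree rooted at "..". Entries are printed, indented by 1 space per directory level. Directories are printed with a trailing /.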


