Breadth-first directory traversal: is O(log n) memory possible?
I'm trying to make an iterator that performs breadth-first traversal of all the files and folders inside a particular folder. I've already done this with depth-first traversal, which returns, for example:
\A
\A\1
\A\1\x
\A\1\y
\A\2
\B
\B\1
etc.
Now I'm trying to write a version that instead returns the results breadth-first (that is, level by level):
\A
\B
\A\1
\A\2
\B\1
\A\1\x
\A\1\y
for the same hierarchy. However, I've hit a stumbling block: assuming I want the entries in the correct order (and specifically not in reverse order), I cannot find any way to do this without ultimately needing O(n) memory, where n is the number of files and folders on the drive. It seems to me that I would need to keep the entire drive hierarchy in memory at some point, whereas with DFS I can entirely ignore all entries I have already enumerated at the same level of the hierarchy.
So my question is: is there a way to traverse the folder breadth-first using less than linear memory?
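For reference, here is a minimal Python sketch of the usual queue-based breadth-first walk that the memory concern above is about. The function name `walk_breadth_first` and the choice to sort entries by name are assumptions made for illustration, not part of the original question. The queue only has to hold directories that have not been listed yet (files are yielded and forgotten), but in the worst case that is still proportional to the number of directories.

```python
import os
import tempfile
from collections import deque


def walk_breadth_first(root):
    """Yield every path under root, level by level (hypothetical helper)."""
    # Directories whose contents have not been listed yet. This queue is
    # the memory cost: it can grow to the size of the widest level.
    pending = deque([root])
    while pending:
        directory = pending.popleft()
        try:
            with os.scandir(directory) as entries:
                # Sorting by name is an assumption, used only to make the
                # order within one directory deterministic.
                children = sorted(entries, key=lambda e: e.name)
        except OSError:
            # The directory vanished or is unreadable; skip it.
            continue
        for entry in children:
            yield entry.path
            if entry.is_dir(follow_symlinks=False):
                pending.append(entry.path)


if __name__ == "__main__":
    # Build the sample hierarchy from the question and print it level by level.
    with tempfile.TemporaryDirectory() as root:
        for d in ("A/1/x", "A/1/y", "A/2", "B/1"):
            os.makedirs(os.path.join(root, *d.split("/")))
        for path in walk_breadth_first(root):
            print(os.path.relpath(path, root))
```

Files never enter the queue, so the retained state is bounded by the number of pending directories rather than by every entry on the drive. That is a constant-factor saving, not a better-than-linear bound.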
Comments (1)
If your platform supports the notion of an inode number, you may be able to store a single number for each directory, indicating the largest inode number you have visited in that specific directory. If you access the inodes in numerical order, keeping track of that single entry is enough to know where the 'next' entry is. It's a small gain, as you'll still need to maintain an inode number for every single directory on the system, but you won't need to care about the contents of the directories.
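A rough Python sketch of that per-directory cursor, under stated assumptions: the helper names `next_entry` and `entries_in_inode_order` are hypothetical, and the directory is rescanned on every step, so the sketch trades time for memory. It only shows how one stored inode number can stand in for "everything already visited" in a directory. It is not a complete breadth-first traversal.

```python
import os
import tempfile


def next_entry(directory, last_inode=-1):
    """Return (inode, name) for the entry with the smallest inode number
    greater than last_inode, or None when the directory is exhausted.

    Assumption: inode numbers are unique within the directory. Two hard
    links to the same file in one directory would share an inode, and
    only one of them would be reported.
    """
    best = None
    with os.scandir(directory) as entries:
        for entry in entries:
            ino = entry.inode()
            if ino > last_inode and (best is None or ino < best[0]):
                best = (ino, entry.name)
    return best


def entries_in_inode_order(directory):
    """Yield entry names in increasing inode order, keeping only one
    number (the cursor) as state between steps."""
    cursor = -1
    while True:
        found = next_entry(directory, cursor)
        if found is None:
            return
        cursor, name = found
        yield name


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for name in ("a", "b", "c"):
            open(os.path.join(d, name), "w").close()
        print(list(entries_in_inode_order(d)))
```

Each call to `next_entry` is a full scan of the directory, so visiting all k entries costs on the order of k² directory reads. Entries created with a smaller inode number than the cursor while the walk is running would be missed.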
Of course, keep in mind that any traversal mechanism is subject to horrible race conditions. You'd need some level of assurance that the filesystem is quiescent, or that your code is resilient to directories and files being deleted, created, moved, etc., while it is running.