What is the Python way to walk a directory tree?
I feel that assigning files and folders and doing the += [item] part is a bit hackish. Any suggestions? I'm using Python 3.2.
from os import listdir
from os.path import isdir, isfile, join

def dir_contents(path):
    files = []
    folders = []
    for item in listdir(path):
        # Join with path so the checks do not depend on the current directory.
        full = join(path, item)
        if isfile(full):
            files += [item]
        elif isdir(full):
            folders += [item]
    return files, folders
While googling for the same info, I found this question.
I am posting here the smallest, clearest code which I found at http://www.pythoncentral.io/how-to-traverse-a-directory-tree-in-python-guide-to-os-walk/ (rather than just posting the URL, in case of link rot).
The page has some useful info and also points to a few other relevant pages.
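The snippet itself did not survive in this copy of the answer. A minimal sketch of the kind of os.walk loop that page describes might look like the following; the function name list_tree is made up here for illustration and is not taken from the linked page.

```python
import os


def list_tree(root):
    """Return (directories, files) found anywhere under root, as full paths."""
    all_dirs, all_files = [], []
    # os.walk yields one (dirpath, dirnames, filenames) tuple per directory.
    for dirpath, dirnames, filenames in os.walk(root):
        all_dirs.extend(os.path.join(dirpath, d) for d in dirnames)
        all_files.extend(os.path.join(dirpath, f) for f in filenames)
    return all_dirs, all_files
```

Calling list_tree(".") would return every directory and file below the current directory.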
I've not tested this extensively yet, but I believe this will expand the os.walk generator, join dirnames to all the file paths, and flatten the resulting list, to give a straight-up list of concrete files in your search path.
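The code this answer refers to is missing from this copy. One way to do what it describes (expand os.walk, join each directory name to its file names, flatten the result) is sketched below; flat_file_list is a hypothetical name, and the use of itertools.chain is an assumption about how the flattening was done.

```python
import itertools
import os


def flat_file_list(root):
    """Return a flat list of full paths for every file under root."""
    per_directory = (
        [os.path.join(dirpath, name) for name in filenames]
        for dirpath, _dirnames, filenames in os.walk(root)
    )
    # chain.from_iterable flattens the per-directory lists into one sequence.
    return list(itertools.chain.from_iterable(per_directory))
```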
I like the structure of the result of os.walk() but prefer pathlib overall. My lazy solution therefore is simply creating a Path from each item returned by os.walk().
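As a rough sketch of that idea (the original snippet is not preserved here, and walk_paths is an invented name), each tuple from os.walk() can be converted to Path objects like this:

```python
import os
from pathlib import Path


def walk_paths(root):
    """Like os.walk, but yield Path objects instead of strings."""
    for dirpath, dirnames, filenames in os.walk(root):
        here = Path(dirpath)
        yield here, [here / d for d in dirnames], [here / f for f in filenames]
```

One trade-off of this sketch: the yielded directory list is a new list, so editing it in place does not prune the walk the way editing os.walk's own dirnames list does.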
Copy and paste code for those who want to deep walk all nested subdirectories, either with a recursive call around os.listdir() or with os.walk().
The difference is that with os.walk() you won't need to walk every directory of each subdirectory manually; the library will do it for you, no matter how many nested directories you have.
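The two code blocks from this answer were lost in this copy. A sketch of the contrast it draws, with made-up function names, could look like this:

```python
import os


def files_by_recursion(root):
    """Collect file paths by calling os.listdir() on each directory by hand."""
    found = []
    for name in os.listdir(root):
        full = os.path.join(root, name)
        if os.path.isdir(full):
            found.extend(files_by_recursion(full))  # descend manually
        else:
            found.append(full)
    return found


def files_by_walk(root):
    """Collect the same paths, letting os.walk() do the descending."""
    return [
        os.path.join(dirpath, name)
        for dirpath, _dirnames, filenames in os.walk(root)
        for name in filenames
    ]
```

On a tree without symbolic links both functions return the same set of paths, possibly in a different order.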
For anyone who's looking for a "Path" way to walk:
First introduced in Python 3.12:
https://docs.python.org/zh-cn/3.13/library/pathlib.html#pathlib.Path.walk
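A small sketch of how Path.walk() might be used follows. The wrapper name walk_tree and the fallback to os.walk() on interpreters older than 3.12 are assumptions of this example, not part of the answer.

```python
import os
from pathlib import Path


def walk_tree(root):
    """Yield (dirpath, dirnames, filenames) with dirpath as a Path."""
    root = Path(root)
    if hasattr(root, "walk"):  # Path.walk() exists on Python 3.12+
        yield from root.walk()
    else:  # fallback for older interpreters (an assumption of this sketch)
        for dirpath, dirnames, filenames in os.walk(root):
            yield Path(dirpath), dirnames, filenames
```

In both branches dirnames and filenames are lists of plain strings, so dirpath / name gives the full Path of each entry.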
Try using the append method.
os.walk and os.scandir are great options; however, I've been using pathlib more and more, and with pathlib you can use the .glob() or .rglob() (recursive glob) methods:
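The original example is missing from this copy; a minimal sketch of the difference between the two methods, using a hypothetical helper python_files and the pattern "*.py" chosen only for illustration:

```python
from pathlib import Path


def python_files(root):
    """Return (direct, recursive) lists of *.py files under root."""
    root = Path(root)
    direct = sorted(root.glob("*.py"))      # only root itself
    recursive = sorted(root.rglob("*.py"))  # root and every subdirectory
    return direct, recursive
```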
Take a look at the os.walk function, which returns the path along with the directories and files it contains. That should considerably shorten your solution.
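For instance, the question's dir_contents could shrink to something like this sketch, which only looks at the first tuple os.walk produces (the one for path itself):

```python
import os


def dir_contents(path):
    """Return (files, folders) directly inside path, without descending."""
    # The first tuple os.walk yields describes path itself.
    # If path is not a readable directory, os.walk yields nothing and
    # next() raises StopIteration.
    _dirpath, folders, files = next(os.walk(path))
    return files, folders
```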
For anyone looking for a solution using pathlib (Python >= 3.4):
However, as mentioned above, this does not preserve the top-down ordering given by os.walk.
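The pathlib code this answer showed is not preserved here. A plausible reconstruction, offered only as a guess, is a recursive generator over iterdir(); walk_files is an invented name.

```python
from pathlib import Path


def walk_files(path):
    """Yield every file below path, descending depth-first with iterdir()."""
    for entry in Path(path).iterdir():
        if entry.is_dir():
            yield from walk_files(entry)
        else:
            yield entry
```

Because a subdirectory is fully explored as soon as it is met, its files can be yielded before the remaining files of its parent, which is the ordering difference from os.walk noted above.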
Since Python >= 3.4 there exists the generator method Path.rglob.
So, to process all paths under some/starting/path, just loop over the generator it returns.
To get all subpaths in a list, do list(path.rglob('*')).
To get just the files with the sql extension, do list(path.rglob('*.sql')).
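Put together as a small sketch (the helper names all_subpaths and sql_files are made up for this example; the rglob calls are the ones quoted in the answer):

```python
from pathlib import Path


def all_subpaths(start):
    """Return every path (files and directories) below start."""
    return list(Path(start).rglob("*"))


def sql_files(start):
    """Return only the paths below start whose name ends in .sql."""
    return list(Path(start).rglob("*.sql"))
```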
If you want to recursively iterate through all the files, including all files in the subfolders, I believe this is the best way.
Another way to walk a directory tree is with the pathlib module:
The pattern ** matches the current directory and all subdirectories, recursively, and the method iterdir then iterates over each directory's contents. Useful when you need more control when traversing the directory tree.
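A sketch of that combination, since the answer's own code is missing from this copy. The name walk_with_glob is hypothetical, and the is_dir() check is an addition of this sketch, because newer Python versions may also return files for a pattern ending in **.

```python
from pathlib import Path


def walk_with_glob(root):
    """Yield (directory, entries) for each directory matched by '**'."""
    for directory in Path(root).glob("**"):
        # Guard against interpreters where '**' also matches files.
        if directory.is_dir():
            yield directory, sorted(directory.iterdir())
```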
Indeed, using += [item] is bad for many reasons...
1. The append method has been made exactly for that (appending one element to the end of a list).
2. You are creating a temporary list of one element just to throw it away. While raw speed should not be your first concern when using Python (otherwise you're using the wrong language), wasting speed for no reason still doesn't seem the right thing.
3. You are using a little asymmetry of the Python language... for list objects, writing a += b is not the same as writing a = a + b, because the former modifies the object in place, while the latter allocates a new list, and this can have different semantics if the object a is also reachable in other ways. In your specific code this doesn't seem to be the case, but it could become a problem later when someone else (or yourself in a few years, which is the same) has to modify the code. Python even has a method, extend, with a less subtle syntax, that is specifically made to handle the case in which you want to modify a list object in place by adding the elements of another list at the end.
Also, as others have noted, it seems that your code is trying to do what os.walk already does...
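The asymmetry in the third point can be seen with a second name bound to the same list. This is an illustrative sketch, not code from the answer:

```python
def demo_aliasing():
    """Return what an alias sees after `a += b` versus `a = a + b`."""
    a = [1, 2]
    alias = a          # second name for the same list object
    a += [3]           # in place: alias sees the change
    after_iadd = list(alias)

    a = a + [4]        # builds a new list and rebinds a; alias is untouched
    after_add = list(alias)

    alias.append(5)    # append: add one element in place
    alias.extend([6])  # extend: add every element of another list in place
    return after_iadd, after_add, alias, a
```

After the call, the alias holds [1, 2, 3, 5, 6] while a holds [1, 2, 3, 4], because a = a + [4] stopped the two names from sharing one object.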
Since Python 3.4 there is a new module, pathlib. So to get all dirs and files one can do:
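The answer's snippet is missing here; a minimal sketch of the idea, assuming it listed one directory level with iterdir() (the function name dirs_and_files is invented):

```python
from pathlib import Path


def dirs_and_files(path):
    """Return (dirs, files) directly inside path as lists of Path objects."""
    entries = list(Path(path).iterdir())
    dirs = [entry for entry in entries if entry.is_dir()]
    files = [entry for entry in entries if entry.is_file()]
    return dirs, files
```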
Instead of the built-in os.walk and os.path.walk, I use something derived from this piece of code I found suggested elsewhere which I had originally linked to but have replaced with inlined source:
It walks the directories recursively and is quite efficient and easy to read.
Here is a version that uses os.scandir and returns a tree structure. Using os.scandir will return os.DirEntry objects, which hold information about the path objects in memory, so most queries about the items can be answered without additional filesystem calls.
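The code from this answer is not preserved in this copy. As a sketch of the approach, assuming a nested dict is an acceptable tree representation (the answer's actual structure is unknown, and build_tree is a made-up name):

```python
import os


def build_tree(path):
    """Return a nested dict: directories map to dicts, files map to None."""
    tree = {}
    with os.scandir(path) as entries:
        for entry in entries:
            # is_dir() usually answers from data cached by scandir itself.
            if entry.is_dir(follow_symlinks=False):
                tree[entry.name] = build_tree(entry.path)
            else:
                tree[entry.name] = None
    return tree
```

Using os.scandir as a context manager needs Python 3.6 or later; on 3.5 the with statement would have to be dropped.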