What is the Python way to walk a directory tree?

Posted on 2024-11-19 01:58:39

I feel that assigning files and folders and doing the += [item] part is a bit hackish. Any suggestions? I'm using Python 3.2.

from os import *
from os.path import *

def dir_contents(path):
    contents = listdir(path)
    files = []
    folders = []
    for i, item in enumerate(contents):
        if isfile(contents[i]):
            files += [item]
        elif isdir(contents[i]):
            folders += [item]
    return files, folders

Comments (18)

清引 2024-11-26 01:58:40

While googling for the same info, I found this question.

I am posting here the smallest, clearest code which I found at http://www.pythoncentral.io/how-to-traverse-a-directory-tree-in-python-guide-to-os-walk/ (rather than just posting the URL, in case of link rot).

The page has some useful info and also points to a few other relevant pages.

# Import the os module, for the os.walk function
import os

# Set the directory you want to start from
rootDir = '.'
for dirName, subdirList, fileList in os.walk(rootDir):
    print('Found directory: %s' % dirName)
    for fname in fileList:
        print('\t%s' % fname)
是伱的 2024-11-26 01:58:40

I've not tested this extensively yet, but I believe this will expand the os.walk generator, join the directory names to all the file paths, and flatten the result, giving a single flat sequence of the concrete files in your search path.

import itertools
import os

def find(input_path):
    return itertools.chain(
        *list(
            list(os.path.join(dirname, fname) for fname in files)
            for dirname, _, files in os.walk(input_path)
        )
    )
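
A lazier variant may be possible with itertools.chain.from_iterable, which avoids building the intermediate lists. This is only a sketch under that assumption, not the answer's code; the name find_lazy is hypothetical.

```python
import itertools
import os


def find_lazy(input_path):
    """Lazily yield the joined path of every file below input_path."""
    # Each inner generator covers the files of one directory;
    # chain.from_iterable flattens them without materialising any list.
    return itertools.chain.from_iterable(
        (os.path.join(dirname, fname) for fname in files)
        for dirname, _, files in os.walk(input_path)
    )
```

Because the result is an iterator, wrap it in list(...) if you need to index it or go over it more than once.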
⊕婉儿 2024-11-26 01:58:40
import pathlib
import time

def prune_empty_dirs(path: pathlib.Path):
    for current_path in list(path.rglob("*"))[::-1]:
        if current_path.is_dir() and not any(current_path.iterdir()):
            current_path.rmdir()
            while current_path.exists():
                time.sleep(0.1)
懷念過去 2024-11-26 01:58:40

I like the structure of the result of os.walk() but prefer pathlib overall. My lazy solution therefore is simply creating a Path from each item returned by os.walk().

import os
import pathlib


def walk(path='bin'):
    for root, dirs, files in os.walk(path):
        root = pathlib.Path(root)
        dirs = [root / d for d in dirs]
        files = [root / f for f in files]
        yield root, dirs, files
夕嗳→ 2024-11-26 01:58:40

Copy and paste code for those who want to deep walk all nested sub directories:

  • using python's recursion call with os.listdir():
import os

count = 0
def deep_walk(mypath):
    global count
    for file in os.listdir(mypath):
        file_path = os.path.join(mypath, file)
        if os.path.isdir(file_path):
            deep_walk(file_path)
        else:
            count += 1
            print(file_path)

mypath="/tmp"
deep_walk(mypath)
print(f"Total file count: {count}")
  • using python's standard library os.walk():
import os

def walk_dir(mypath):
    count = 0
    for root, dirs, files in os.walk(mypath):
        for file in files:
            file_path = os.path.join(root, file)
            count += 1
            print(file_path)
    print(f"Total file count: {count}")

mypath = "/tmp"
walk_dir(mypath)

The difference is that with os.walk() you don't need to walk every subdirectory manually; the library does it for you, no matter how many nested directories you have.

鸩远一方 2024-11-26 01:58:40

For anyone who's looking for a "Path" way to walk:

from pathlib import Path
p=Path("some_path_you_want_to_walk")
for dirName, subdirList, fileList in p.walk():
    print(dirName, subdirList, fileList)

First introduced in Python 3.12:
https://docs.python.org/zh-cn/3.13/library/pathlib.html#pathlib.Path.walk

小巷里的女流氓 2024-11-26 01:58:40

Try using the append method.
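
A minimal sketch of what that might look like for the function in the question. Joining each name with path before testing it is an addition that is not in the original code; without it, the file and directory checks resolve names against the current working directory.

```python
import os


def dir_contents(path):
    """Split the entries directly inside path into files and folders."""
    files = []
    folders = []
    for item in os.listdir(path):
        # Test the entry relative to path, not the current working directory.
        full_path = os.path.join(path, item)
        if os.path.isfile(full_path):
            files.append(item)
        elif os.path.isdir(full_path):
            folders.append(item)
    return files, folders
```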

与风相奔跑 2024-11-26 01:58:39

os.walk and os.scandir are great options. However, I've been using pathlib more and more, and with pathlib you can use the .glob() or .rglob() (recursive glob) methods:

from pathlib import Path

root_directory = Path(".")
for path_object in root_directory.rglob('*'):
    if path_object.is_file():
        print(f"hi, I'm a file: {path_object}")
    elif path_object.is_dir():
        print(f"hi, I'm a dir: {path_object}")


我一向站在原地 2024-11-26 01:58:39

Take a look at the os.walk function which returns the path along with the directories and files it contains. That should considerably shorten your solution.
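
For the single-directory listing in the question, one way this could look is to take only the first triple that os.walk yields. This is a sketch rather than the answerer's own code, and the empty fallback for a directory that cannot be listed is an assumption.

```python
import os


def dir_contents(path):
    """Return (files, folders) for the entries directly inside path."""
    # The first item yielded by os.walk describes path itself as
    # (dirpath, dirnames, filenames); next() stops the walk right there.
    # os.walk yields nothing when path cannot be listed, hence the default.
    _, folders, files = next(os.walk(path), (path, [], []))
    return files, folders
```

Dropping the next() call and looping over os.walk(path) instead gives the same triples for every nested directory.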

不再让梦枯萎 2024-11-26 01:58:39

For anyone looking for a solution using pathlib (python >= 3.4)

from pathlib import Path

def walk(path): 
    for p in Path(path).iterdir(): 
        if p.is_dir(): 
            yield from walk(p)
            continue
        yield p.resolve()

# recursively traverse all files from current directory
for p in walk(Path('.')): 
    print(p)

# the function returns a generator so if you need a list you need to build one
all_files = list(walk(Path('.'))) 

However, as mentioned above, this does not preserve the top-down ordering given by os.walk

秋意浓 2024-11-26 01:58:39

Since Python 3.4 there is the generator method Path.rglob.
So, to process all paths under some/starting/path, just do something like this:

from pathlib import Path

path = Path('some/starting/path') 
for subpath in path.rglob('*'):
    print(subpath)  # do something with subpath

To get all subpaths in a list do list(path.rglob('*')).
To get just the files with sql extension, do list(path.rglob('*.sql')).
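
Note that rglob('*.sql') also matches a directory whose name ends in .sql. If only regular files are wanted, a filter along these lines might help; files_with_suffix is a hypothetical helper name, not part of pathlib.

```python
from pathlib import Path


def files_with_suffix(root, suffix):
    """Return the sorted regular files under root whose name ends with suffix."""
    # rglob matches any path with that name pattern; is_file() drops directories.
    return sorted(p for p in Path(root).rglob(f"*{suffix}") if p.is_file())
```

For example, files_with_suffix('some/starting/path', '.sql') returns a list of Path objects.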

幸福%小乖 2024-11-26 01:58:39

If you want to recursively iterate through all the files, including all files in the subfolders, I believe this is the best way.

import os

def get_files(root):
    for fd, subfds, fns in os.walk(root):
        for fn in fns:
            yield os.path.join(fd, fn)

## now this will print all full paths

start_dir = '.'  # directory to start from (an assumption; use your own path)
for fn in get_files(start_dir):
    print(fn)
素罗衫 2024-11-26 01:58:39

Another way to walk a directory tree, using the pathlib module:

from pathlib import Path

for directory in Path('.').glob('**'):
    if not directory.is_dir():
        continue  # guard: some Python versions may also yield files for '**'
    for item in directory.iterdir():
        print(item)

The pattern ** matches the current directory and all subdirectories, recursively, and the method iterdir then iterates over each directory's contents. This is useful when you need more control while traversing the directory tree.

耳钉梦 2024-11-26 01:58:39
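# uses listdir, isfile and join from the question's star imports
def dir_contents(path):
    files, folders = [], []
    for p in listdir(path):
        # join with path so the test does not depend on the current working directory
        if isfile(join(path, p)): files.append(p)
        else: folders.append(p)
    return files, folders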
墨小沫ゞ 2024-11-26 01:58:39

Indeed using

items += [item]

is bad for many reasons...

  1. The append method has been made exactly for that (appending one element to the end of a list)

  2. You are creating a temporary list of one element just to throw it away. While raw speed should not be your first concern when using Python (otherwise you're using the wrong language), wasting speed for no reason still doesn't seem like the right thing to do.

  3. You are relying on a small asymmetry of the Python language: for list objects, writing a += b is not the same as writing a = a + b, because the former modifies the object in place while the latter allocates a new list. This can have different semantics if the object a is also reachable in other ways. In your specific code this doesn't seem to be the case, but it could become a problem later when someone else (or you in a few years, which is the same thing) has to modify the code. Python even has a method, extend, with a less subtle syntax that is made specifically for modifying a list object in place by adding the elements of another list at its end.

Also, as others have noted, it seems that your code is trying to do what os.walk already does...
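
The difference described in point 3 can be seen in a short sketch (the variable names are illustrative only):

```python
# In place: += extends the existing list, so every name bound to it sees the change.
a = [1, 2]
alias = a
a += [3]

# Rebinding: b + [3] builds a new list and the old one is left untouched.
b = [1, 2]
other = b
b = b + [3]

# The explicit in-place methods say what they do.
c = [1, 2]
c.append(3)        # add one element
c.extend([4, 5])   # add all elements of another list

print(alias)  # [1, 2, 3]
print(other)  # [1, 2]
print(c)      # [1, 2, 3, 4, 5]
```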

乱了心跳 2024-11-26 01:58:39

Since Python 3.4 there is a new module, pathlib. So to get all dirs and files one can do:

from pathlib import Path

dirs = [str(item) for item in Path(path).iterdir() if item.is_dir()]
files = [str(item) for item in Path(path).iterdir() if item.is_file()]
余厌 2024-11-26 01:58:39

Instead of the built-in os.walk and os.path.walk, I use something derived from a piece of code I found suggested elsewhere. I had originally linked to it but have replaced the link with the inlined source:

import os
import stat

class DirectoryStatWalker:
    # a forward iterator that traverses a directory tree, and
    # returns the filename and additional file information

    def __init__(self, directory):
        self.stack = [directory]
        self.files = []
        self.index = 0

    def __getitem__(self, index):
        while 1:
            try:
                file = self.files[self.index]
                self.index = self.index + 1
            except IndexError:
                # pop next directory from stack
                self.directory = self.stack.pop()
                self.files = os.listdir(self.directory)
                self.index = 0
            else:
                # got a filename
                fullname = os.path.join(self.directory, file)
                st = os.stat(fullname)
                mode = st[stat.ST_MODE]
                if stat.S_ISDIR(mode) and not stat.S_ISLNK(mode):
                    self.stack.append(fullname)
                return fullname, st

if __name__ == '__main__':
    for file, st in DirectoryStatWalker("/usr/include"):
        print(file, st[stat.ST_SIZE])

It walks the directories recursively and is quite efficient and easy to read.

缱倦旧时光 2024-11-26 01:58:39

Here is a version that uses os.scandir and returns a tree structure. os.scandir yields os.DirEntry objects, which cache information about each entry (such as its name and, on most platforms, whether it is a file or a directory), so that information can usually be queried without additional filesystem calls.

import os

def treedir(path):
    files = []
    folders = {}
    for entry in os.scandir(path):
        if entry.is_file():
            files.append(entry)
        elif entry.is_dir():
            folders[entry.name] = treedir(entry)
    result = {}
    if files:
        result['files'] = files
    if folders:
        result['folders'] = folders
    return result
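
As a further illustration of working with os.DirEntry objects, here is a sketch that sums file sizes recursively. tree_size is a hypothetical helper, not part of the answer above, and using os.scandir as a context manager assumes Python 3.6 or later.

```python
import os


def tree_size(path):
    """Return the total size in bytes of the regular files under path."""
    total = 0
    with os.scandir(path) as entries:
        for entry in entries:
            # follow_symlinks=False keeps symlinked directories from being walked.
            if entry.is_dir(follow_symlinks=False):
                total += tree_size(entry.path)
            elif entry.is_file(follow_symlinks=False):
                total += entry.stat(follow_symlinks=False).st_size
    return total
```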