Python directory searching and organizing by dict

Posted 2024-08-14 18:38:13

Hey all, I have only recently started working with the file and os parts of Python. I am trying to search a directory and find all of its subdirectories. If a directory contains no folders, all of its files should be added to a list, and everything should be organized in a dict.

So for instance a tree could look like this

  • Starting Path
    • Dir 1
      • Subdir 1
      • Subdir 2
      • Subdir 3
        • subsubdir
          • file.jpg
          • folder1
            • file1.jpg
            • file2.jpg
          • folder2
            • file3.jpg
            • file4.jpg

Even though subsubdir has a file in it, that file should be skipped because subsubdir also contains folders.

I can normally do this with os.listdir and os.path.isdir if I know how many directory levels to expect. To make it dynamic, though, it has to handle any number of folders and subfolders. I have tried os.walk, and it finds all the files easily. The only trouble I am having is creating the nested dicts for the paths that contain files. I need the folder names organized as dict keys, all the way up to the starting path.

So in the end, using the example above, the dict should look like this with the files in it:

dict['dir1']['subdir3']['subsubdir']['folder1'] = ['file1.jpg', 'file2.jpg']

dict['dir1']['subdir3']['subsubdir']['folder2'] = ['file3.jpg', 'file4.jpg']

Would appreciate any help on this or better ideas on organizing the information. Thanks.

Comments (3)

撧情箌佬 2024-08-21 18:38:13

Maybe you want something like:

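import os

def explore(starting_path):
  # alld[''] is the tree for starting_path itself; the '' key matches the
  # empty first component produced by splitting the relative path below.
  alld = {'': {}}

  for dirpath, dirnames, filenames in os.walk(starting_path):
    d = alld
    # Path relative to starting_path, e.g. '' or '/b/z'
    # (assumes starting_path has no trailing separator).
    dirpath = dirpath[len(starting_path):]
    for subd in dirpath.split(os.sep):
      based = d
      d = d[subd]
    if dirnames:
      # Has subfolders: add an entry per subfolder and ignore any files here.
      for dn in dirnames:
        d[dn] = {}
    else:
      # Leaf folder: replace its placeholder dict with the list of files.
      based[subd] = filenames
  return alld['']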

For example, given a /tmp/a such that:

$ ls -FR /tmp/a
b/  c/  d/

/tmp/a/b:
z/

/tmp/a/b/z:

/tmp/a/c:
za  zu

/tmp/a/d:

print(explore('/tmp/a')) emits: {'c': ['za', 'zu'], 'b': {'z': []}, 'd': []} (the key order may vary).

If this isn't exactly what you're after, maybe you can show us specifically what the differences are supposed to be? I suspect they can probably be easily fixed, if need be.
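To try the example above without creating /tmp/a by hand, here is a minimal sketch that rebuilds the same layout in a temporary directory and runs the same walk over it. The os.path.normpath call (so a trailing separator does not break the prefix slicing), the sorted() on file names, and the make_example helper are additions for illustration and are not part of the answer above.

```python
import os
import tempfile

def explore(starting_path):
    """Nested dict of folders; folders without subfolders map to file lists."""
    # Assumption: normalising first keeps the slicing below aligned
    # even when the caller passes a trailing separator.
    starting_path = os.path.normpath(starting_path)
    alld = {'': {}}
    for dirpath, dirnames, filenames in os.walk(starting_path):
        d = alld
        rel = dirpath[len(starting_path):]
        for subd in rel.split(os.sep):
            based = d
            d = d[subd]
        if dirnames:
            for dn in dirnames:
                d[dn] = {}
        else:
            based[subd] = sorted(filenames)
    return alld['']

def make_example(root):
    """Hypothetical helper: recreate the b/z, c (za, zu), d layout under root."""
    for parts in (('b', 'z'), ('c',), ('d',)):
        os.makedirs(os.path.join(root, *parts))
    for name in ('za', 'zu'):
        open(os.path.join(root, 'c', name), 'w').close()

with tempfile.TemporaryDirectory() as root:
    make_example(root)
    result = explore(root + os.sep)  # trailing separator on purpose

print(result)
```

The printed dict has the same content as the output quoted above, though the order of the keys depends on the order in which the file system lists the entries.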

泛泛之交 2024-08-21 18:38:13

I don't know why you would want to do this. You should be able to do your processing directly with os.walk (os.path.walk, which takes a callback, exists only in Python 2), but in case you really need such a structure, you can do something like this (untested):

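import os

def dirfunc(fdict, top, dirname, subdirs, fnames):
    # Skip any directory that contains folders, even if it also has files.
    if subdirs:
        return

    # Folder names from the starting path down to this directory.
    keys = os.path.relpath(dirname, top).split(os.sep)
    tmpdict = fdict
    for k in keys[:-1]:
        tmpdict = tmpdict.setdefault(k, {})

    # Key the file list by the folder's own name, not its full path.
    tmpdict[keys[-1]] = fnames

mydict = {}
directory_to_search = '.'  # placeholder: set this to your starting path
# os.walk takes no callback; iterate over it and call dirfunc per directory.
for dirname, subdirs, fnames in os.walk(directory_to_search):
    dirfunc(mydict, directory_to_search, dirname, subdirs, fnames)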

Also, you should not name your variable dict because it's a Python built-in. It is a very bad idea to rebind the name dict to something other than Python's dict type.
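A small sketch of what goes wrong when the name is rebound. The function name shadowing_demo is made up for this illustration; the rebinding is kept inside a function so it does not affect the rest of the module.

```python
def shadowing_demo():
    dict = {'dir1': {}}           # the local name now hides the built-in type
    try:
        dict([('a', 1)])          # would normally build {'a': 1}
    except TypeError as exc:
        return str(exc)           # the name refers to an instance, not the type
    return None

print(shadowing_demo())
```

Choosing another name, such as tree or mydict, avoids the problem entirely.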

Edit: edited to fix the "double last key" error and to use os.walk.

独享拥抱 2024-08-21 18:38:13

There is a basic problem with the way you want to structure the data. If dir1/subdir1 contains subdirectories and files, should dict['dir1']['subdir1'] be a list or a dictionary? To access further subdirectories with ...['subdir2'] it needs to be a dictionary, but on the other hand dict['dir1']['subdir1'] should return a list of files.

Either you have to build the tree from custom objects that combine these two aspects in some way, or you have to change the tree structure to treat files differently.
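As one possible take on the second option, here is a minimal sketch in which every folder is a node with separate 'files' and 'dirs' entries, so a folder can hold both without ambiguity. The function name build_tree, the two key names, and the temporary demo layout (which mirrors the subsubdir part of the question's tree) are assumptions made for this illustration.

```python
import os
import tempfile

def build_tree(starting_path):
    """Each folder becomes {'files': [...], 'dirs': {name: subtree}}."""
    starting_path = os.path.normpath(starting_path)
    root = {'files': [], 'dirs': {}}
    nodes = {starting_path: root}      # maps each directory path to its node
    for dirpath, dirnames, filenames in os.walk(starting_path):
        node = nodes[dirpath]
        node['files'] = sorted(filenames)
        for dn in dirnames:
            child = {'files': [], 'dirs': {}}
            node['dirs'][dn] = child
            nodes[os.path.join(dirpath, dn)] = child
    return root

# Demo layout: subsubdir holds file.jpg plus folder1 and folder2.
with tempfile.TemporaryDirectory() as top:
    layout = (('folder1', ('file1.jpg', 'file2.jpg')),
              ('folder2', ('file3.jpg', 'file4.jpg')))
    for folder, names in layout:
        os.makedirs(os.path.join(top, 'subsubdir', folder))
        for name in names:
            open(os.path.join(top, 'subsubdir', folder, name), 'w').close()
    open(os.path.join(top, 'subsubdir', 'file.jpg'), 'w').close()
    tree = build_tree(top)

sub = tree['dirs']['subsubdir']
print(sub['files'])
print(sub['dirs']['folder1']['files'])
```

With this shape, the "skip files in folders that have subfolders" rule becomes a decision made when reading the tree (ignore 'files' whenever 'dirs' is non-empty) rather than information lost while building it.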
