Does Python's glob function support variable-depth wildcards?

Posted on 2024-11-27 09:40:36

I'm writing a Python script that uses this awkward glob syntax.

import glob    
F = glob.glob('./www.dmoz.org/Science/Environment/index.html')
F += glob.glob('./www.dmoz.org/Science/Environment/*/index.html')
F += glob.glob('./www.dmoz.org/Science/Environment/*/*/index.html')
F += glob.glob('./www.dmoz.org/Science/Environment/*/*/*/index.html')
F += glob.glob('./www.dmoz.org/Science/Environment/*/*/*/*/index.html')

It seems like there ought to be a way to wrap this in one line:

F = glob.glob('./www.dmoz.org/Science/Environment/[super_wildcard]/index.html')

But I don't know what the appropriate super wildcard would be. Does such a thing exist?

Comments (5)

蘸点软妹酱 2024-12-04 09:40:36

Sorry, it does not. You will probably have to write a few lines of code using os.walk:

import os

for root, dirs, files in os.walk('/starting/path/'):
    for myFile in files:
        if myFile == "index.html":
            # Print the full path of each index.html found at any depth
            print(os.path.join(root, myFile))
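
As a rough sketch of how the os.walk approach could be wrapped so that it returns a list like the F in the question: the helper name find_named_files and the throwaway directory tree below are made up for illustration and are not part of the original answer.

```python
import os
import tempfile


def find_named_files(start_dir, file_name="index.html"):
    """Return the paths of every file called file_name under start_dir, at any depth."""
    matches = []
    for root, dirs, files in os.walk(start_dir):
        if file_name in files:
            matches.append(os.path.join(root, file_name))
    return sorted(matches)


# Build a small temporary tree (hypothetical layout) so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    for sub in ("", "a", os.path.join("a", "b", "c")):
        os.makedirs(os.path.join(tmp, sub), exist_ok=True)
        with open(os.path.join(tmp, sub, "index.html"), "w") as fh:
            fh.write("<html></html>")
    # Store paths relative to the temporary root so they are easy to read.
    found = [os.path.relpath(p, tmp) for p in find_named_files(tmp)]

print(found)
```

Because os.walk visits the starting directory as well as every subdirectory, this picks up index.html at depth zero too, which the question handles with a separate first glob call.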
萌化 2024-12-04 09:40:36

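I don't know exactly when this was added (the documentation lists it as new in Python 3.5), but glob can do this now: with recursive=True, the pattern ** matches zero or more directory levels.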

For example,

F = glob.glob('./www.dmoz.org/Science/Environment/**/index.html', recursive=True)
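
To see what recursive=True changes, here is a small self-contained sketch. The temporary directory layout is invented for the demonstration; only glob.glob, glob.escape and the recursive flag come from the standard library.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical tree: index.html at depth 0, 1 and 2.
    for sub in ("", "a", os.path.join("a", "b")):
        os.makedirs(os.path.join(tmp, sub), exist_ok=True)
        with open(os.path.join(tmp, sub, "index.html"), "w") as fh:
            fh.write("")

    # glob.escape protects any special characters in the temporary path itself.
    pattern = os.path.join(glob.escape(tmp), "**", "index.html")

    # With recursive=True, ** spans zero or more directory levels.
    with_recursive = glob.glob(pattern, recursive=True)
    # Without it, ** behaves like a single *, i.e. exactly one level.
    without_recursive = glob.glob(pattern)

print(len(with_recursive), "matches with recursive=True")
print(len(without_recursive), "matches without it")
```

Forgetting recursive=True does not raise an error; the pattern simply matches a single directory level, which can be an easy mistake to miss.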
烟燃烟灭 2024-12-04 09:40:36

I have just released Formic which implements exactly the wildcard you need - '**' - in an implementation of Apache Ant's FileSet and Globs.

The search can be implemented as follows:

import formic
fileset = formic.FileSet(include="/www.dmoz.org/Science/Environment/**/index.html")
for file_name in fileset.qualified_files():
    print(file_name)  # Do something with file_name

This will search from the current directory. I hope this helps.

一桥轻雨一伞开 2024-12-04 09:40:36

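It's not perfect, but it works for me:

import glob
import os

def find_index(max_depth):
    # Try patterns with 0, 1, ..., max_depth - 1 wildcard levels in turn
    for i in range(max_depth):
        components = ['./www.dmoz.org/Science/Environment'] + ['*'] * i + ['index.html']
        fs_res = glob.glob(os.path.join(*components))
        # Return the first depth that yields exactly one match
        if len(fs_res) == 1:
            return fs_res[0]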
逆流 2024-12-04 09:40:36

10 years later ... pathlib solution

from pathlib import Path
# Path.glob returns a generator, so wrap it in list() to get a list like the question's F
F = list(Path("./www.dmoz.org/Science/Environment/").glob('**/index.html'))

Where [super_wildcard] = **.
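
A minimal sketch of the pathlib variant, run against an invented temporary tree (the directory names and the extra file myindex.html are assumptions for illustration). It also compares Path.rglob, which is equivalent to prefixing the pattern with **/, and shows that a looser file pattern such as *index.html picks up additional names.

```python
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    # Hypothetical tree: index.html at depth 0, 1 and 2.
    for sub in ("", "a", "a/b"):
        (base / sub).mkdir(parents=True, exist_ok=True)
        (base / sub / "index.html").write_text("")
    # A file whose name merely ends in "index.html".
    (base / "a" / "myindex.html").write_text("")

    def rel(paths):
        """Return sorted POSIX-style paths relative to base."""
        return sorted(p.relative_to(base).as_posix() for p in paths)

    via_glob = rel(base.glob("**/index.html"))
    via_rglob = rel(base.rglob("index.html"))
    loose = rel(base.glob("**/*index.html"))

print(via_glob)
print(loose)
```

Both glob and rglob return generators, so wrap them in list() or sorted() if a list is needed.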
