Filtering files by more than one possible file extension using fnmatch.filter

Posted on 2024-10-25 01:19:14

Given the following piece of Python code:

for root, dirs, files in os.walk(directory):
    for filename in fnmatch.filter(files, '*.png'):
        pass

How can I filter for more than one extension? In this special case I want to get all files ending with *.png, *.gif, *.jpg or *.jpeg.

So far I have come up with:

for root, dirs, files in os.walk(directory):
    for extension in ['jpg', 'jpeg', 'gif', 'png']:
        for filename in fnmatch.filter(files, '*.' + extension):
            pass

But I don't think it is very elegant or efficient.

Does anyone have a better idea?

Comments (10)

十年不长 2024-11-01 01:19:14

If you only need to check extensions (i.e. no further wildcards), why don't you simply use basic string operations?

for root, dirs, files in os.walk(directory):
    for filename in files:
        if filename.endswith(('.jpg', '.jpeg', '.gif', '.png')):
            pass
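
One caveat with the snippet above: str.endswith is case-sensitive, so a file named PHOTO.JPG would be skipped. Here is a minimal sketch of a case-insensitive variant, assuming that behaviour is wanted. The names IMAGE_EXTENSIONS, has_image_extension and iter_image_files are illustrative and not part of the original answer.

```python
import os

# Hypothetical constant: the extensions from the question, kept as a tuple
# because str.endswith accepts a tuple of suffixes.
IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.gif', '.png')


def has_image_extension(filename, extensions=IMAGE_EXTENSIONS):
    """Return True if filename ends with one of the extensions, ignoring case."""
    return filename.lower().endswith(extensions)


def iter_image_files(directory):
    """Yield the full path of every matching file below directory."""
    for root, dirs, files in os.walk(directory):
        for filename in files:
            if has_image_extension(filename):
                yield os.path.join(root, filename)
```

Lower-casing the filename once per file keeps the check to a single endswith call, which is the point of the original suggestion.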
孤君无依 2024-11-01 01:19:14

I think your code is actually fine. If you want to touch every filename only once, define your own filtering function:

def is_image_file(filename, extensions=['.jpg', '.jpeg', '.gif', '.png']):
    return any(filename.endswith(e) for e in extensions)

for root, dirs, files in os.walk(directory):
    for filename in filter(is_image_file, files):
        pass
冷弦 2024-11-01 01:19:14

This would perhaps be a better way, because you avoid calling + repeatedly and use a tuple instead of a list:

for root, dirs, files in os.walk(directory):
    for extension in ('*.jpg', '*.jpeg', '*.gif', '*.png'):
        for filename in fnmatch.filter(files, extension):
            pass

A tuple is better because you are not going to modify the extensions once you have created them; you only iterate over them.

不念旧人 2024-11-01 01:19:14

I've been using this with a lot of success.

import fnmatch
import functools
import itertools
import os

# Remove the annotations if you're not on Python3
def find_files(dir_path: str=None, patterns: [str]=None) -> [str]:
    """
    Returns a generator yielding files matching the given patterns
    :type dir_path: str
    :type patterns: [str]
    :rtype : [str]
    :param dir_path: Directory to search for files/directories under. Defaults to current dir.
    :param patterns: Patterns of files to search for. Defaults to ["*"]. Example: ["*.json", "*.xml"]
    """
    path = dir_path or "."
    path_patterns = patterns or ["*"]

    for root_dir, dir_names, file_names in os.walk(path):
        filter_partial = functools.partial(fnmatch.filter, file_names)

        for file_name in itertools.chain(*map(filter_partial, path_patterns)):
            yield os.path.join(root_dir, file_name)

Examples:

for f in find_files(test_directory):
    print(f)

yields:

.\test.json
.\test.xml
.\test.ini
.\test_helpers.py
.\__init__.py

Testing with multiple patterns:

for f in find_files(test_directory, ["*.xml", "*.json", "*.ini"]):
    print(f)

yields:

.\test.json
.\test.xml
.\test.ini
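
Note that find_files above runs fnmatch.filter once per pattern and chains the results, so a file that matches two overlapping patterns (for example "*.json" and "test.*") is yielded twice. Here is a hedged sketch of a variant that tests each file name once. The name find_files_once is made up for illustration and is not from the original answer.

```python
import fnmatch
import os


def find_files_once(dir_path=".", patterns=("*",)):
    """Yield each file under dir_path that matches at least one pattern, once."""
    for root_dir, dir_names, file_names in os.walk(dir_path):
        for file_name in file_names:
            # any() stops at the first matching pattern, so a file that
            # matches several patterns is still yielded only once.
            if any(fnmatch.fnmatch(file_name, p) for p in patterns):
                yield os.path.join(root_dir, file_name)
```

The trade-off is that results come out in directory order instead of being grouped by pattern.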
北座城市 2024-11-01 01:19:14

This isn't really elegant either, but it works:

for root, dirs, files in os.walk(directory):
    for filename in fnmatch.filter(files, '*.png') + fnmatch.filter(files, '*.jpg') + fnmatch.filter(files, '*.jpeg') + fnmatch.filter(files, '*.gif'):
        pass
彩虹直至黑白 2024-11-01 01:19:14

Internally, fnmatch uses regular expressions, and there is a function that makes a regex from an fnmatch pattern: fnmatch.translate. This may also give a little speed-up.

import fnmatch
import os
import re

image_exts = ['jpg', 'jpeg', 'gif', 'png']
image_re = re.compile('|'.join(fnmatch.translate('*.' + e) for e in image_exts))
for root, dirs, files in os.walk(directory):
    for filename in files:
        if image_re.match(filename):
            ...
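
To see what the combined pattern accepts, here is a small sketch that runs it against a few made-up file names. The names list is purely illustrative. Unlike fnmatch.fnmatch, which normalizes case on Windows, the compiled regex is always case-sensitive.

```python
import fnmatch
import re

image_exts = ['jpg', 'jpeg', 'gif', 'png']
# Each translated pattern already anchors the end of the string, so the
# alternatives can simply be joined with '|' and used with re.match.
image_re = re.compile('|'.join(fnmatch.translate('*.' + e) for e in image_exts))

# Hypothetical sample names, chosen to exercise the edge cases.
names = ['a.png', 'b.jpeg', 'c.txt', 'png', 'd.png.bak', 'E.PNG']
matches = [n for n in names if image_re.match(n)]
print(matches)
```

If upper-case extensions should match as well, passing re.IGNORECASE to re.compile is one option.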
半夏半凉 2024-11-01 01:19:14

Here is what I am using to filter files in Apache log directories.
Here I exclude the error files.

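import datetime

now = datetime.datetime.now()
rep_filters = [now.strftime("%Y%m%d")]

def files_filter(liste_fic, filters=rep_filters):
    # Keep names that do not contain 'error' and that contain every filter
    # substring; same result as building the expression as a string and
    # calling eval(), without the risks of eval.
    return (fic for fic in liste_fic
            if 'error' not in fic and all(f in fic for f in filters))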
鸢与 2024-11-01 01:19:14

Please try this:

import fnmatch

# pattern_list = ['*.jpg', '__.*']
def checkFilepatter(filename, pattern_list):
    for pattern in pattern_list:
        if fnmatch.fnmatch(filename, pattern):
            return True
    return False
寄离 2024-11-01 01:19:14

You can use a list comprehension to check if my_file matches any of the file masks defined in patterns:

import fnmatch

my_file = 'my_precious.txt'
patterns = ('*.txt', '*.html', '*.mp3')


if [pat for pat in patterns if fnmatch.fnmatch(my_file, pat)]:
    print('We have a match!')
else:
    print('No match')
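
The list comprehension above builds a full list just to test whether it is empty. As a hedged alternative, the built-in any() with a generator expression stops at the first matching pattern. The helper name matches_any is an illustrative assumption, not part of the original answer.

```python
import fnmatch

my_file = 'my_precious.txt'
patterns = ('*.txt', '*.html', '*.mp3')


def matches_any(filename, patterns):
    """Return True as soon as one pattern matches filename."""
    return any(fnmatch.fnmatch(filename, pat) for pat in patterns)


if matches_any(my_file, patterns):
    print('We have a match!')
else:
    print('No match')
```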
﹏半生如梦愿梦如真 2024-11-01 01:19:14

The clearest solution is:

import os

for root, dirs, files in os.walk(directory):
    for filename in files:
        _, ext = os.path.splitext(filename)
        if ext in ['.jpg', '.jpeg', '.gif', '.png']:
            ...

or, using pathlib,

import pathlib

for path in pathlib.Path(directory).glob('**/*'):
    if path.suffix in ['.jpg', '.jpeg', '.gif', '.png']:
        ...