如何使用Python计算目录中的文件数量

发布于 2024-08-28 17:22:13 字数 86 浏览 13 评论 0原文

如何只计算目录中的文件?这将目录本身算作一个文件:

len(glob.glob('*'))

How do I count only the files in a directory? This counts the directory itself as a file:

len(glob.glob('*'))

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(29

素食主义者 2024-09-04 17:22:13

os.listdir() 比使用 glob.glob 效率稍高。要测试文件名是否是普通文件(而不是目录或其他实体),请使用 os.path.isfile() :

import os, os.path

# simple version for working with CWD
print len([name for name in os.listdir('.') if os.path.isfile(name)])

# path joining version for other paths
DIR = '/tmp'
print len([name for name in os.listdir(DIR) if os.path.isfile(os.path.join(DIR, name))])

os.listdir() will be slightly more efficient than using glob.glob. To test if a filename is an ordinary file (and not a directory or other entity), use os.path.isfile():

import os, os.path

# simple version for working with CWD
print len([name for name in os.listdir('.') if os.path.isfile(name)])

# path joining version for other paths
DIR = '/tmp'
print len([name for name in os.listdir(DIR) if os.path.isfile(os.path.join(DIR, name))])
生来就爱笑 2024-09-04 17:22:13
import os

_, _, files = next(os.walk("/usr/lib"))
file_count = len(files)
import os

_, _, files = next(os.walk("/usr/lib"))
file_count = len(files)
葬心 2024-09-04 17:22:13

对于所有类型的文件,包括子目录 (Python 2):

import os

lst = os.listdir(directory) # your directory path
number_files = len(lst)
print number_files

仅文件(避免子目录):

import os

onlyfiles = next(os.walk(directory))[2] #directory is your directory path as string
print len(onlyfiles)

For all kind of files, subdirectories included (Python 2):

import os

lst = os.listdir(directory) # your directory path
number_files = len(lst)
print number_files

Only files (avoiding subdirectories):

import os

onlyfiles = next(os.walk(directory))[2] #directory is your directory path as string
print len(onlyfiles)
一生独一 2024-09-04 17:22:13

这就是 fnmatch 非常方便的地方:

import fnmatch

print len(fnmatch.filter(os.listdir(dirpath), '*.txt'))

更多详细信息:http://docs.python.org/ 2/library/fnmatch.html

This is where fnmatch comes very handy:

import fnmatch

print len(fnmatch.filter(os.listdir(dirpath), '*.txt'))

More details: http://docs.python.org/2/library/fnmatch.html

梦里泪两行 2024-09-04 17:22:13

如果你想计算目录中的所有文件 - 包括子目录中的文件,最Pythonic的方法是:

import os

file_count = sum(len(files) for _, _, files in os.walk(r'C:\Dropbox'))
print(file_count)

我们使用 sum 比显式添加文件计数(计时待定)更快

If you want to count all files in the directory - including files in subdirectories, the most pythonic way is:

import os

file_count = sum(len(files) for _, _, files in os.walk(r'C:\Dropbox'))
print(file_count)

We use sum that is faster than explicitly adding the file counts (timings pending)

铜锣湾横着走 2024-09-04 17:22:13

使用 pathlib 的答案,而不将整个列表加载到内存中:

from pathlib import Path

path = Path('.')

print(sum(1 for _ in path.glob('*')))  # Files and folders, not recursive
print(sum(1 for _ in path.rglob('*')))  # Files and folders, recursive

print(sum(1 for x in path.glob('*') if x.is_file()))  # Only files, not recursive
print(sum(1 for x in path.rglob('*') if x.is_file()))  # Only files, recursive

An answer with pathlib and without loading the whole list to memory:

from pathlib import Path

path = Path('.')

print(sum(1 for _ in path.glob('*')))  # Files and folders, not recursive
print(sum(1 for _ in path.rglob('*')))  # Files and folders, recursive

print(sum(1 for x in path.glob('*') if x.is_file()))  # Only files, not recursive
print(sum(1 for x in path.rglob('*') if x.is_file()))  # Only files, recursive
〆凄凉。 2024-09-04 17:22:13

简短而简单

import os
directory_path = '/home/xyz/'
No_of_files = len(os.listdir(directory_path))

Short and simple

import os
directory_path = '/home/xyz/'
No_of_files = len(os.listdir(directory_path))
誰認得朕 2024-09-04 17:22:13

我很惊讶没有人提到 os.scandir:

def count_files(dir):
    return len([1 for x in list(os.scandir(dir)) if x.is_file()])

I am surprised that nobody mentioned os.scandir:

def count_files(dir):
    return len([1 for x in list(os.scandir(dir)) if x.is_file()])
千里故人稀 2024-09-04 17:22:13
def directory(path,extension):
  list_dir = []
  list_dir = os.listdir(path)
  count = 0
  for file in list_dir:
    if file.endswith(extension): # eg: '.txt'
      count += 1
  return count
def directory(path,extension):
  list_dir = []
  list_dir = os.listdir(path)
  count = 0
  for file in list_dir:
    if file.endswith(extension): # eg: '.txt'
      count += 1
  return count
情泪▽动烟 2024-09-04 17:22:13
import os
print len(os.listdir(os.getcwd()))
import os
print len(os.listdir(os.getcwd()))
揽清风入怀 2024-09-04 17:22:13

这使用 os.listdir 并适用于任何目录:

import os
directory = 'mydirpath'

number_of_files = len([item for item in os.listdir(directory) if os.path.isfile(os.path.join(directory, item))])

这可以使用生成器进行简化,并通过以下方式加快速度:

import os
isfile = os.path.isfile
join = os.path.join

directory = 'mydirpath'
number_of_files = sum(1 for item in os.listdir(directory) if isfile(join(directory, item)))

This uses os.listdir and works for any directory:

import os
directory = 'mydirpath'

number_of_files = len([item for item in os.listdir(directory) if os.path.isfile(os.path.join(directory, item))])

this can be simplified with a generator and made a little bit faster with:

import os
isfile = os.path.isfile
join = os.path.join

directory = 'mydirpath'
number_of_files = sum(1 for item in os.listdir(directory) if isfile(join(directory, item)))
小嗷兮 2024-09-04 17:22:13

虽然我同意 @DanielStutzbach 提供的答案:os.listdir() 会比使用 glob.glob 稍微高效一些。

但是,如果您确实想计算文件夹中特定文件的数量,则需要额外的精度,您需要使用 len(glob.glob()) 。例如,如果您要计算要使用的文件夹中的所有 pdf:

pdfCounter = len(glob.glob1(myPath,"*.pdf"))

While I agree with the answer provided by @DanielStutzbach: os.listdir() will be slightly more efficient than using glob.glob.

However, an extra precision, if you do want to count the number of specific files in folder, you want to use len(glob.glob()). For instance if you were to count all the pdfs in a folder you want to use:

pdfCounter = len(glob.glob1(myPath,"*.pdf"))
转角预定愛 2024-09-04 17:22:13

这是一个简单的解决方案,可以计算包含子文件夹的目录中的文件数量。它可能会派上用场:

import os
from pathlib import Path

def count_files(rootdir):
    '''counts the number of files in each subfolder in a directory'''
    for path in pathlib.Path(rootdir).iterdir():
        if path.is_dir():
            print("There are " + str(len([name for name in os.listdir(path) \
            if os.path.isfile(os.path.join(path, name))])) + " files in " + \
            str(path.name))
            
 
count_files(data_dir) # data_dir is the directory you want files counted.

您应该得到与此类似的输出(当然,占位符已更改):

There are {number of files} files in {name of sub-folder1}
There are {number of files} files in {name of sub-folder2}

This is an easy solution that counts the number of files in a directory containing sub-folders. It may come in handy:

import os
from pathlib import Path

def count_files(rootdir):
    '''counts the number of files in each subfolder in a directory'''
    for path in pathlib.Path(rootdir).iterdir():
        if path.is_dir():
            print("There are " + str(len([name for name in os.listdir(path) \
            if os.path.isfile(os.path.join(path, name))])) + " files in " + \
            str(path.name))
            
 
count_files(data_dir) # data_dir is the directory you want files counted.

You should get an output similar to this (with the placeholders changed, of course):

There are {number of files} files in {name of sub-folder1}
There are {number of files} files in {name of sub-folder2}
恬淡成诗 2024-09-04 17:22:13
def count_em(valid_path):
   x = 0
   for root, dirs, files in os.walk(valid_path):
       for f in files:
            x = x+1
print "There are", x, "files in this directory."
return x

摘自这篇文章

def count_em(valid_path):
   x = 0
   for root, dirs, files in os.walk(valid_path):
       for f in files:
            x = x+1
print "There are", x, "files in this directory."
return x

Taked from this post

会傲 2024-09-04 17:22:13

一行和递归:

def count_files(path):
    return sum([len(files) for _, _, files in os.walk(path)])

count_files('path/to/dir')

one liner and recursive:

def count_files(path):
    return sum([len(files) for _, _, files in os.walk(path)])

count_files('path/to/dir')
沫尐诺 2024-09-04 17:22:13
import os

def count_files(in_directory):
    joiner= (in_directory + os.path.sep).__add__
    return sum(
        os.path.isfile(filename)
        for filename
        in map(joiner, os.listdir(in_directory))
    )

>>> count_files("/usr/lib")
1797
>>> len(os.listdir("/usr/lib"))
2049
import os

def count_files(in_directory):
    joiner= (in_directory + os.path.sep).__add__
    return sum(
        os.path.isfile(filename)
        for filename
        in map(joiner, os.listdir(in_directory))
    )

>>> count_files("/usr/lib")
1797
>>> len(os.listdir("/usr/lib"))
2049
牵你手 2024-09-04 17:22:13

这是一个我发现很有用的简单的单行命令:

print int(os.popen("ls | wc -l").read())

Here is a simple one-line command that I found useful:

print int(os.popen("ls | wc -l").read())
旧城烟雨 2024-09-04 17:22:13

卢克的代码重新格式化。

import os

print len(os.walk('/usr/lib').next()[2])

Luke's code reformat.

import os

print len(os.walk('/usr/lib').next()[2])
雨落□心尘 2024-09-04 17:22:13

我使用glob.iglob作为类似于

data
└───train
│   └───subfolder1
│   |   │   file111.png
│   |   │   file112.png
│   |   │   ...
│   |
│   └───subfolder2
│       │   file121.png
│       │   file122.png
│       │   ...
└───test
    │   file221.png
    │   file222.png

以下两个选项返回4的目录结构(如预期,ie不计算子文件夹本身

  • len(list (glob.iglob("data/train/*/*.png", recursive=True)))
  • sum(1 for i in glob.iglob("data/train/*/*.png) “))

I used glob.iglob for a directory structure similar to

data
└───train
│   └───subfolder1
│   |   │   file111.png
│   |   │   file112.png
│   |   │   ...
│   |
│   └───subfolder2
│       │   file121.png
│       │   file122.png
│       │   ...
└───test
    │   file221.png
    │   file222.png

Both of the following options return 4 (as expected, i.e. does not count the subfolders themselves)

  • len(list(glob.iglob("data/train/*/*.png", recursive=True)))
  • sum(1 for i in glob.iglob("data/train/*/*.png"))
不必你懂 2024-09-04 17:22:13

这很简单:

print(len([iq for iq in os.scandir('PATH')]))

它只是计算目录中的文件数量,我使用列表理解技术来迭代特定目录,返回所有文件。 “len(returned list)”返回文件的数量。

It is simple:

print(len([iq for iq in os.scandir('PATH')]))

it simply counts number of files in directory , i have used list comprehension technique to iterate through specific directory returning all files in return . "len(returned list)" returns number of files.

岁月打碎记忆 2024-09-04 17:22:13

如果您将使用操作系统的标准 shell,则可以比使用纯 pythonic 方式更快地获得结果。

Windows 示例:

import os
import subprocess

def get_num_files(path):
    cmd = 'DIR \"%s\" /A-D /B /S | FIND /C /V ""' % path
    return int(subprocess.check_output(cmd, shell=True))

If you'll be using the standard shell of the operating system, you can get the result much faster rather than using pure pythonic way.

Example for Windows:

import os
import subprocess

def get_num_files(path):
    cmd = 'DIR \"%s\" /A-D /B /S | FIND /C /V ""' % path
    return int(subprocess.check_output(cmd, shell=True))
你列表最软的妹 2024-09-04 17:22:13

我找到了另一个答案,它可能是正确的接受答案。

for root, dirs, files in os.walk(input_path):    
for name in files:
    if os.path.splitext(name)[1] == '.TXT' or os.path.splitext(name)[1] == '.txt':
        datafiles.append(os.path.join(root,name)) 


print len(files) 

I found another answer which may be correct as accepted answer.

for root, dirs, files in os.walk(input_path):    
for name in files:
    if os.path.splitext(name)[1] == '.TXT' or os.path.splitext(name)[1] == '.txt':
        datafiles.append(os.path.join(root,name)) 


print len(files) 
緦唸λ蓇 2024-09-04 17:22:13

我编写的一个简单实用函数使用 os.scandir() 而不是 os.listdir()。

import os 

def count_files_in_dir(path: str) -> int:
    file_entries = [entry for entry in os.scandir(path) if entry.is_file()]

    return len(file_entries)

主要好处是,消除了对 os.path.is_file() 的需求,并替换为 os.DirEntry 实例的 is_file()还消除了对 os.path.join(DIR, file_name) 的需要,如其他答案所示。

A simple utility function I wrote that makes use of os.scandir() instead of os.listdir().

import os 

def count_files_in_dir(path: str) -> int:
    file_entries = [entry for entry in os.scandir(path) if entry.is_file()]

    return len(file_entries)

The main benefit is that, the need for os.path.is_file() is eliminated and replaced with os.DirEntry instance's is_file() which also removes the need for os.path.join(DIR, file_name) as shown in other answers.

厌味 2024-09-04 17:22:13

更简单的一个:

import os
number_of_files = len(os.listdir(directory))
print(number_of_files)

Simpler one:

import os
number_of_files = len(os.listdir(directory))
print(number_of_files)
不喜欢何必死缠烂打 2024-09-04 17:22:13
import os

total_con=os.listdir('<directory path>')

files=[]

for f_n in total_con:
   if os.path.isfile(f_n):
     files.append(f_n)


print len(files)
import os

total_con=os.listdir('<directory path>')

files=[]

for f_n in total_con:
   if os.path.isfile(f_n):
     files.append(f_n)


print len(files)
白芷 2024-09-04 17:22:13

我这样做了,这返回了文件夹中的文件数量(Attack_Data)...这工作正常。

import os
def fcount(path):
    #Counts the number of files in a directory
    count = 0
    for f in os.listdir(path):
        if os.path.isfile(os.path.join(path, f)):
            count += 1

    return count
path = r"C:\Users\EE EKORO\Desktop\Attack_Data" #Read files in folder
print (fcount(path))

i did this and this returned the number of files in the folder(Attack_Data)...this works fine.

import os
def fcount(path):
    #Counts the number of files in a directory
    count = 0
    for f in os.listdir(path):
        if os.path.isfile(os.path.join(path, f)):
            count += 1

    return count
path = r"C:\Users\EE EKORO\Desktop\Attack_Data" #Read files in folder
print (fcount(path))
又爬满兰若 2024-09-04 17:22:13

我发现有时我不知道是否会收到文件名或文件路径。所以我打印了 os walk 解决方案输出:

def count_number_of_raw_data_point_files(path: Union[str, Path], with_file_prefix: str) -> int:
    import os
    path: Path = force_expanduser(path)

    _, _, files = next(os.walk(path))
    # file_count = len(files)
    filename: str
    count: int = 0
    for filename in files:
        print(f'-->{filename=}')  # e.g. print -->filename='data_point_99.json'
        if with_file_prefix in filename:
            count += 1
    return count

out:

-->filename='data_point_780.json'
-->filename='data_point_781.json'
-->filename='data_point_782.json'
-->filename='data_point_783.json'
-->filename='data_point_784.json'
-->filename='data_point_785.json'
-->filename='data_point_786.json'
-->filename='data_point_787.json'
-->filename='data_point_788.json'
-->filename='data_point_789.json'
-->filename='data_point_79.json'
-->filename='data_point_790.json'
-->filename='data_point_791.json'
-->filename='data_point_792.json'
-->filename='data_point_793.json'
-->filename='data_point_794.json'
-->filename='data_point_795.json'
-->filename='data_point_796.json'
-->filename='data_point_797.json'
-->filename='data_point_798.json'
-->filename='data_point_799.json'
-->filename='data_point_8.json'
-->filename='data_point_80.json'
-->filename='data_point_800.json'
-->filename='data_point_801.json'
-->filename='data_point_802.json'
-->filename='data_point_803.json'
-->filename='data_point_804.json'
-->filename='data_point_805.json'
-->filename='data_point_806.json'
-->filename='data_point_807.json'
-->filename='data_point_808.json'
-->filename='data_point_809.json'
-->filename='data_point_81.json'
-->filename='data_point_810.json'
-->filename='data_point_811.json'
-->filename='data_point_812.json'
-->filename='data_point_813.json'
-->filename='data_point_814.json'
-->filename='data_point_815.json'
-->filename='data_point_816.json'
-->filename='data_point_817.json'
-->filename='data_point_818.json'
-->filename='data_point_819.json'
-->filename='data_point_82.json'
-->filename='data_point_820.json'
-->filename='data_point_821.json'
-->filename='data_point_822.json'
-->filename='data_point_823.json'
-->filename='data_point_824.json'
-->filename='data_point_825.json'
-->filename='data_point_826.json'
-->filename='data_point_827.json'
-->filename='data_point_828.json'
-->filename='data_point_829.json'
-->filename='data_point_83.json'
-->filename='data_point_830.json'
-->filename='data_point_831.json'
-->filename='data_point_832.json'
-->filename='data_point_833.json'
-->filename='data_point_834.json'
-->filename='data_point_835.json'
-->filename='data_point_836.json'
-->filename='data_point_837.json'
-->filename='data_point_838.json'
-->filename='data_point_839.json'
-->filename='data_point_84.json'
-->filename='data_point_840.json'
-->filename='data_point_841.json'
-->filename='data_point_842.json'
-->filename='data_point_843.json'
-->filename='data_point_844.json'
-->filename='data_point_845.json'
-->filename='data_point_846.json'
-->filename='data_point_847.json'
-->filename='data_point_848.json'
-->filename='data_point_849.json'
-->filename='data_point_85.json'
-->filename='data_point_850.json'
-->filename='data_point_851.json'
-->filename='data_point_852.json'
-->filename='data_point_853.json'
-->filename='data_point_86.json'
-->filename='data_point_87.json'
-->filename='data_point_88.json'
-->filename='data_point_89.json'
-->filename='data_point_9.json'
-->filename='data_point_90.json'
-->filename='data_point_91.json'
-->filename='data_point_92.json'
-->filename='data_point_93.json'
-->filename='data_point_94.json'
-->filename='data_point_95.json'
-->filename='data_point_96.json'
-->filename='data_point_97.json'
-->filename='data_point_98.json'
-->filename='data_point_99.json'
854

注意你可能需要排序。

I find that sometimes I don't know if I will receive filenames or the path to the file. So I printed the os walk solution output:

def count_number_of_raw_data_point_files(path: Union[str, Path], with_file_prefix: str) -> int:
    import os
    path: Path = force_expanduser(path)

    _, _, files = next(os.walk(path))
    # file_count = len(files)
    filename: str
    count: int = 0
    for filename in files:
        print(f'-->{filename=}')  # e.g. print -->filename='data_point_99.json'
        if with_file_prefix in filename:
            count += 1
    return count

out:

-->filename='data_point_780.json'
-->filename='data_point_781.json'
-->filename='data_point_782.json'
-->filename='data_point_783.json'
-->filename='data_point_784.json'
-->filename='data_point_785.json'
-->filename='data_point_786.json'
-->filename='data_point_787.json'
-->filename='data_point_788.json'
-->filename='data_point_789.json'
-->filename='data_point_79.json'
-->filename='data_point_790.json'
-->filename='data_point_791.json'
-->filename='data_point_792.json'
-->filename='data_point_793.json'
-->filename='data_point_794.json'
-->filename='data_point_795.json'
-->filename='data_point_796.json'
-->filename='data_point_797.json'
-->filename='data_point_798.json'
-->filename='data_point_799.json'
-->filename='data_point_8.json'
-->filename='data_point_80.json'
-->filename='data_point_800.json'
-->filename='data_point_801.json'
-->filename='data_point_802.json'
-->filename='data_point_803.json'
-->filename='data_point_804.json'
-->filename='data_point_805.json'
-->filename='data_point_806.json'
-->filename='data_point_807.json'
-->filename='data_point_808.json'
-->filename='data_point_809.json'
-->filename='data_point_81.json'
-->filename='data_point_810.json'
-->filename='data_point_811.json'
-->filename='data_point_812.json'
-->filename='data_point_813.json'
-->filename='data_point_814.json'
-->filename='data_point_815.json'
-->filename='data_point_816.json'
-->filename='data_point_817.json'
-->filename='data_point_818.json'
-->filename='data_point_819.json'
-->filename='data_point_82.json'
-->filename='data_point_820.json'
-->filename='data_point_821.json'
-->filename='data_point_822.json'
-->filename='data_point_823.json'
-->filename='data_point_824.json'
-->filename='data_point_825.json'
-->filename='data_point_826.json'
-->filename='data_point_827.json'
-->filename='data_point_828.json'
-->filename='data_point_829.json'
-->filename='data_point_83.json'
-->filename='data_point_830.json'
-->filename='data_point_831.json'
-->filename='data_point_832.json'
-->filename='data_point_833.json'
-->filename='data_point_834.json'
-->filename='data_point_835.json'
-->filename='data_point_836.json'
-->filename='data_point_837.json'
-->filename='data_point_838.json'
-->filename='data_point_839.json'
-->filename='data_point_84.json'
-->filename='data_point_840.json'
-->filename='data_point_841.json'
-->filename='data_point_842.json'
-->filename='data_point_843.json'
-->filename='data_point_844.json'
-->filename='data_point_845.json'
-->filename='data_point_846.json'
-->filename='data_point_847.json'
-->filename='data_point_848.json'
-->filename='data_point_849.json'
-->filename='data_point_85.json'
-->filename='data_point_850.json'
-->filename='data_point_851.json'
-->filename='data_point_852.json'
-->filename='data_point_853.json'
-->filename='data_point_86.json'
-->filename='data_point_87.json'
-->filename='data_point_88.json'
-->filename='data_point_89.json'
-->filename='data_point_9.json'
-->filename='data_point_90.json'
-->filename='data_point_91.json'
-->filename='data_point_92.json'
-->filename='data_point_93.json'
-->filename='data_point_94.json'
-->filename='data_point_95.json'
-->filename='data_point_96.json'
-->filename='data_point_97.json'
-->filename='data_point_98.json'
-->filename='data_point_99.json'
854

note you might have to sort.

起风了 2024-09-04 17:22:13

我想扩展@Mr_and_Mrs_D 的回复:

import os
folder = 'C:/Dropbox'
file_count = sum(len(files) for _, _, files in os.walk(folder))
print(file_count)

这会计算文件夹及其子文件夹中的所有文件。但是,如果您想做一些过滤 - 例如只计算以 .svg 结尾的文件,您可以这样做:

import os
file_count = sum(len([f for f in files if f.endswith('.svg')]) for _, _, files in os.walk(folder))
print(file_count)

您基本上将:

  • len(files)

替换为:

  • len([f for f in files if f.endswith('.svg')])

I would like to extend the reply from @Mr_and_Mrs_D:

import os
folder = 'C:/Dropbox'
file_count = sum(len(files) for _, _, files in os.walk(folder))
print(file_count)

This counts all the files in the folder and its subfolders. However, if you want to do some filtering - like only counting the files ending in .svg, you can do:

import os
file_count = sum(len([f for f in files if f.endswith('.svg')]) for _, _, files in os.walk(folder))
print(file_count)

You basically replace:

  • len(files)

with:

  • len([f for f in files if f.endswith('.svg')])
椵侞 2024-09-04 17:22:13

将其转换为列表,之后您可以使用 len() 函数:

len(list(glob.glob('*')))

Convert it to a list, after that you can make use of the len() function:

len(list(glob.glob('*')))
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文