Fast ls command

Posted 2024-07-04 06:13:20

I've got to get a directory listing that contains about 2 million files, but when I do an ls command on it nothing comes back. I've waited 3 hours. I've tried ls | tee directory.txt, but that seems to hang forever.

I assume the server is doing a lot of inode sorting. Is there any way to speed up the ls command to just get a directory listing of filenames? I don't care about size, dates, permissions, or the like at this time.

凤舞天涯 2024-07-11 06:13:20

This is probably not a helpful answer, but if you don't have find you may be able to make do with tar

$ tar cvf /dev/null .

I am told by people older than me that, "back in the day", single-user and recovery environments were a lot more limited than they are nowadays. That's where this trick comes from.

↘紸啶 2024-07-11 06:13:20

I'm assuming you are using GNU ls?
Try:

\ls

It will bypass the usual alias for ls (typically ls --color=auto).
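
A couple of other common ways to bypass a shell alias (not from the original answer, but standard shell behavior):

command ls
/bin/ls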

何处潇湘 2024-07-11 06:13:20

Things to try:

Check that ls isn't aliased:

alias ls

Perhaps try find instead?

find . \( -type d ! -name . -prune \) -o \( -type f -print \)

Hope this helps.

小猫一只 2024-07-11 06:13:20

Some followup:
You don't mention what OS you're running on, which would help indicate which version of ls you're using. This probably isn't a 'bash' question as much as an ls question. My guess is that you're using GNU ls, which has some features that are useful in some contexts, but that kill you on big directories.

GNU ls tries to arrange its output into neat columns, which means it has to read and lay out all the filenames before printing anything. In a huge directory, this will take some time, and memory.

To 'fix' this, you can try:

ls -1 # no columns at all

Find a BSD ls someplace (http://www.freebsd.org/cgi/cvsweb.cgi/src/bin/ls/) and use that on your big directories.

Use other tools, such as find

眸中客 2024-07-11 06:13:20

There are several ways to get a list of files:

Use this command to get a list without sorting:

ls -U

or send the list of files to a file by using:

ls /Folder/path > ~/Desktop/List.txt

方觉久 2024-07-11 06:13:20

What partition type are you using?

Having millions of small files in one directory it might be a good idea to use JFS or ReiserFS which have better performance with many small sized files.
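
If you're not sure which filesystem the directory lives on, a quick way to check is (a sketch assuming GNU coreutils; the path is a placeholder):

df -T /path/to/dir
stat -f -c %T /path/to/dir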

π浅易 2024-07-11 06:13:20

How about find ./ -type f (which will find all files in the current directory)? Take off the -type f to find everything.

恋你朝朝暮暮 2024-07-11 06:13:20

You should provide information about the operating system and the type of filesystem you are using. On certain flavours of UNIX and certain filesystems you might be able to use the commands ff and ncheck as alternatives.

我们的影子 2024-07-11 06:13:20

I had a directory with timestamps in the file names. I wanted to check the date of the latest file and found find . -type f -maxdepth 1 | sort | tail -n 1 to be about twice as fast as ls -alh.

北方的韩爷 2024-07-11 06:13:20

Lots of other good solutions here, but in the interest of completeness:

echo *
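
Note that echo * prints all the names on one line; a variant that prints one name per line, still without invoking ls (not part of the original answer), is:

printf '%s\n' *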

郁金香雨 2024-07-11 06:13:20

You can also make use of xargs. Just pipe the output of ls through xargs.

ls | xargs

If that doesn't work, and the find examples above aren't working either, try piping them to xargs, as it can reduce the memory usage that might be causing your problems.

素年丶 2024-07-11 06:13:20

If a process "doesn't come back", I recommend strace to analyze how it is interacting with the operating system.

In case of ls:

$ strace ls

you would see that it reads all the directory entries (getdents(2)) before it actually outputs anything, because of the sorting already mentioned here.
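
To narrow the trace to just the directory reads, one option is (a sketch; on modern Linux the syscall is getdents64, while older systems use getdents):

$ strace -e trace=getdents,getdents64 ls > /dev/null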

分分钟 2024-07-11 06:13:20

Using

ls -1 -f 

is about 10 times faster, and it is easy to do (I tested with 1 million files, but my original problem had 6 800 000 000 files).

But in my case I needed to check whether some specific directory contains more than 10 000 files. If there were more than 10 000 files, I was no longer interested in how many there were: I just quit the program so that it runs faster and won't try to read the rest one by one. If there are fewer than 10 000, I print the exact amount. The speed of my program is quite similar to ls -1 -f if you specify a value for the parameter that is bigger than the number of files.

You can use my program find_if_more.pl in the current directory by typing:

find_if_more.pl 999999999

If you are just interested in whether there are more than n files, the script will finish faster than ls -1 -f with a very large number of files.

#!/usr/bin/perl
use strict;
use warnings;

# Count directory entries, bailing out as soon as the count exceeds maxcount.
my ($maxcount) = @ARGV;
my $dir = '.';
my $filecount = 0;

die "Need maxcount\n" if not defined $maxcount;

opendir(my $dh, $dir) or die $!;
while (my $file = readdir($dh)) {
    $filecount = $filecount + 1;
    last if $filecount > $maxcount;   # stop early instead of reading every entry
}
closedir($dh);
print "$filecount\n";
exit 0;

呆° 2024-07-11 06:13:20

This would be the fastest option AFAIK: ls -1 -f.

  • -1 (No columns)
  • -f (No sorting)
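
To capture the listing to a file, as the question wanted (a sketch, not from the original answer):

ls -1 -f > directory.txt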

千仐 2024-07-11 06:13:20

You can redirect output and run the ls process in the background.

ls > myls.txt &

This would allow you to go on about your business while it's running. It wouldn't lock up your shell.

Not sure what options there are for running ls and getting less data back. You could always run man ls to check.
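
One way (not in the original answer) to peek at progress while the background job runs is to count the lines written so far:

wc -l myls.txt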

泪眸﹌ 2024-07-11 06:13:20

Try using:

find . -maxdepth 1 -type f

This will only list the files in the directory; leave out the -type f argument if you want to list both files and directories.

So尛奶瓶 2024-07-11 06:13:20

I have a directory with 4 million files in it and the only way I got ls to spit out files immediately without a lot of churning first was

ls -1U

菩提树下叶撕阳。 2024-07-11 06:13:20

ls -U

will do the ls without sorting.

Another source of slowness is --color. On some Linux machines, there is a convenience alias which adds --color=auto to the ls call, making it look up file attributes for each file found (slow) in order to color the display. This can be avoided with ls -U --color=never or \ls -U.

我ぃ本無心為│何有愛 2024-07-11 06:13:20

This question seemed interesting, and I went through the multiple answers that were posted. To understand their efficiency, I executed them on 2 million files and found the results below.

$ time tar cvf /dev/null . &> /tmp/file-count

real    37m16.553s
user    0m11.525s
sys     0m41.291s

------------------------------------------------------

$ time echo ./* &> /tmp/file-count

real    0m50.808s
user    0m49.291s
sys     0m1.404s

------------------------------------------------------

$ time ls &> /tmp/file-count

real    0m42.167s
user    0m40.323s
sys     0m1.648s

------------------------------------------------------

$ time find . &> /tmp/file-count

real    0m2.738s
user    0m1.044s
sys     0m1.684s

------------------------------------------------------

$ time ls -U &> /tmp/file-count

real    0m2.494s
user    0m0.848s
sys     0m1.452s


------------------------------------------------------

$ time ls -f &> /tmp/file-count

real    0m2.313s
user    0m0.856s
sys     0m1.448s

------------------------------------------------------

To summarize the results:

  1. The ls -f command ran a bit faster than ls -U. Disabling color might have caused this improvement.
  2. The find command came third, averaging 2.738 seconds.
  3. Running just ls took 42.16 seconds. On my system, ls is an alias for ls --color=auto.
  4. Using the shell expansion feature with echo ./* ran for 50.80 seconds.
  5. And the tar-based solution took about 37 minutes.

All tests were done separately while the system was idle.

One important thing to note here is that the file lists were not printed in the terminal; rather, they were redirected to a file, and the file count was calculated later with the wc command. The commands ran far slower when the output was printed on the screen.

Any ideas why this happens ?
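
For reference, a sketch of the counting step described above, using the same output file as in the timings:

wc -l /tmp/file-count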
