How can I find the most recently modified file in a directory (including subdirectories)?
It seems that ls doesn't sort the files correctly when doing a recursive call:
ls -altR . | head -n 3
For a huge tree, it might be hard for sort to keep everything in memory.
%T@ gives you the modification time like a Unix timestamp, sort -n sorts numerically, tail -1 takes the last line (highest timestamp), and cut -f2 -d" " cuts away the first field (the timestamp) from the output.
Edit: Just as -printf is probably GNU-only, ajreal's usage of stat -c is too. Although it is possible to do the same on BSD, the formatting options are different (-f "%m %N", it would seem).
And I missed the plural part; if you want more than the latest file, just bump up the tail argument.
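Assembled from the pieces described above, the pipeline might look like the following sketch. It assumes GNU find (for -printf); the function name newest_file is only illustrative, and cut -f2- is used here rather than -f2 so that paths containing spaces are not truncated.

```shell
# Print the most recently modified regular file under a directory.
# Assumes GNU find (-printf); newest_file is a hypothetical helper name.
newest_file() {
    find "${1:-.}" -type f -printf '%T@ %p\n' |  # "<epoch seconds> <path>" per file
        sort -n |                                 # oldest first, newest last
        tail -n 1 |                               # keep only the newest line
        cut -d ' ' -f 2-                          # drop the timestamp field
}
```

Called as newest_file /some/dir, it prints a single path; raising the tail -n value would list more than one file.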
Following up on @plundra's answer, here's the BSD and OS X version:
Instead of sorting the results and keeping only the last modified ones, you could use awk to print only the one with greatest modification time (in unix time):
This should be a faster way to solve your problem if the number of files is big enough.
I have used the NUL character (i.e. '\0') because, theoretically, a filename may contain any character (including space and newline) except that one.
If you don't have such pathological filenames in your system you can use the newline character as well:
In addition, this works in mawk too.
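A minimal sketch of the newline-delimited variant, assuming GNU find and file names without embedded newlines. The function name newest_file_awk is hypothetical, and this is one way to write the idea rather than the exact command from the answer.

```shell
# Print the newest regular file under a directory without sorting.
# Assumes GNU find (-printf) and no newlines in file names;
# newest_file_awk is a hypothetical helper name.
newest_file_awk() {
    find "${1:-.}" -type f -printf '%T@ %p\n' |
        awk '{
            ts = $1 + 0                                # modification time as a number
            if (NR == 1 || ts > max) {
                max = ts
                newest = substr($0, length($1) + 2)    # text after "<timestamp> "
            }
        }
        END { if (NR > 0) print newest }'
}
```

awk keeps only the running maximum, so nothing has to be held in memory or sorted.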
Shows the latest file with a human-readable timestamp:
Result looks like this:
To show more files, replace -n1 with a higher number.
This seems to work fine, even with subdirectories:
In case of too many files, refine the find.
I had trouble finding the last modified file under Solaris 10. There, find does not have the -printf option and stat is not available. I discovered the following solution, which works well for me:
find . -type f | sed 's/.*/"&"/' | xargs ls -E | awk '{ print $6," ",$7 }' | sort | tail -1
To show the filename as well, use
find . -type f | sed 's/.*/"&"/' | xargs ls -E | awk '{ print $6," ",$7," ",$9 }' | sort | tail -1
Explanation
find . -type f finds and lists all files
sed 's/.*/"&"/' wraps the pathname in quotes to handle whitespace
xargs ls -E sends the quoted paths to ls; the -E option makes sure that a full timestamp (format year-month-day hour-minute-seconds-nanoseconds) is returned
awk '{ print $6," ",$7 }' extracts only date and time
awk '{ print $6," ",$7," ",$9 }' extracts date, time and filename
sort returns the files sorted by date
tail -1 returns only the last modified file
I use something similar all the time, as well as the top-k list of most recently modified files. For large directory trees, it can be much faster to avoid sorting. In the case of just top-1 most recently modified file:
On a directory containing 1.7 million files, I get the most recent one in 3.4s, a speed-up of 7.5x against the 25.5s solution using sort.
Using find, with nice & fast time stamp
Here is how to find and list the latest modified files in a directory with subdirectories. Hidden files are ignored on purpose. The time format can be customised.
Result
Handles spaces in file names perfectly well, not that these should be used!
More
More find galore following the link.
This gives a sorted list:
Reverse the order by placing a '-r' in the sort command. If you only want filenames, insert "awk '{print $11}' |" before '| head'
I find the following shorter and with more interpretable output:
Given the fixed length of the standardised ISO format datetimes, lexicographical sorting is fine and we don't need the -n option on the sort.
If you want to remove the timestamps again, you can use:
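A sketch of what this could look like, assuming GNU find, whose %TF and %TT directives print the date as yyyy-mm-dd and the time as hh:mm:ss with fractional seconds. Both function names are made up for illustration.

```shell
# List files with ISO-style timestamps, oldest first. Plain sort suffices
# because the fixed-width date and time fields order lexicographically.
# Assumes GNU find (-printf); both function names are hypothetical.
list_by_mtime() {
    find "${1:-.}" -type f -printf '%TF %TT %p\n' | sort
}

# Newest path only, with the date and time fields cut away again.
newest_path() {
    list_by_mtime "${1:-.}" | tail -n 1 | cut -d ' ' -f 3-
}
```

The date and time occupy the first two space-separated fields, so cut -f 3- returns the path even if it contains spaces.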
On Ubuntu 13, the following does it, maybe a tad faster, as it reverses the sort and uses 'head' instead of 'tail', reducing the work. To show the 11 newest files in a tree:
find . -type f -printf '%T@ %p\n' | sort -n -r | head -11 | cut -f2- -d" " | sed -e 's,^./,,' | xargs ls -U -l
This gives a complete ls listing without re-sorting and omits the annoying './' that 'find' puts on every file name.
Or, as a bash function:
Still, most of the work was done by plundra's original solution. Thanks plundra.
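One way the function form might look, as a sketch rather than the answer's own code. It assumes GNU find and GNU xargs (for -d and -r); newest_files is a hypothetical name, and the dot in the sed pattern is escaped so that only a literal "./" prefix is removed.

```shell
# Show an ls -l listing of the N newest files below the current directory
# (default 11), newest first. Assumes GNU find and GNU xargs (-d, -r);
# newest_files is a hypothetical helper name.
newest_files() {
    find . -type f -printf '%T@ %p\n' |
        sort -n -r |                    # newest first
        head -n "${1:-11}" |            # keep the first N lines
        cut -d ' ' -f 2- |              # drop the timestamp
        sed -e 's,^\./,,' |             # strip the leading "./"
        xargs -r -d '\n' ls -U -l --    # -U keeps the order given on the command line
}
```

Splitting xargs input on newlines only keeps names with spaces intact; names containing newlines would still break it.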
I faced the same issue: I needed to find the most recent file recursively, and find took around 50 minutes to do it.
Here is a little script to do it faster:
It's a recursive function that gets the most recently modified item of a directory. If this item is a directory, the function is called recursively and searches inside that directory, and so on.
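The script itself is not shown here, so the following is only one possible reading of that description; newest_descend is a hypothetical name. Note that it is a heuristic: a directory's modification time changes when entries are added, removed or renamed in it, not when a file deeper in the tree is edited, so the result can differ from a full find.

```shell
# Follow the most recently modified entry downwards until a file is reached.
# Heuristic only (see the note above); parsing ls output also assumes names
# without newlines. newest_descend is a hypothetical helper name.
newest_descend() (
    dir=${1:-.}
    entry=$(ls -1tA -- "$dir" | head -n 1)      # newest entry, hidden ones included
    if [ -z "$entry" ]; then
        printf '%s\n' "$dir"                    # empty directory: stop here
    elif [ -d "$dir/$entry" ] && [ ! -L "$dir/$entry" ]; then
        newest_descend "$dir/$entry"            # descend into the newest subdirectory
    else
        printf '%s\n' "$dir/$entry"             # newest entry is a file
    fi
)
```

Only one directory per level is listed, which is why this can be much faster than walking the whole tree.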
To search for files in /target_directory and all its sub-directories, that have been modified in the last 60 minutes:
To find the most recently modified files, sorted in the reverse order of update time (i.e., the most recently updated files first):
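Both steps can be combined in a small sketch like the one below. It assumes GNU find (for -printf); recent_files is a hypothetical name and the 60-minute default simply mirrors the example above.

```shell
# List regular files modified within the last N minutes (default 60),
# most recently updated first. Assumes GNU find (-printf);
# recent_files is a hypothetical helper name.
recent_files() {
    find "$1" -type f -mmin "-${2:-60}" -printf '%T@ %p\n' |
        sort -n -r |           # newest first
        cut -d ' ' -f 2-       # drop the timestamp
}
```

For example, recent_files /target_directory 30 would narrow the window to the last half hour.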
If running stat on each file individually is too slow, you can use xargs to speed things up a bit:
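A sketch of the xargs approach, assuming GNU stat (the -c format option, as noted earlier in this thread, is GNU-specific). newest_file_stat is a hypothetical name.

```shell
# Find the newest regular file by letting xargs hand many paths to each
# stat call instead of starting one process per file.
# Assumes GNU stat (-c) and xargs -0 -r; newest_file_stat is a hypothetical name.
newest_file_stat() {
    find "${1:-.}" -type f -print0 |
        xargs -0 -r stat -c '%Y %n' -- |   # "<mtime in epoch seconds> <path>"
        sort -n |
        tail -n 1 |
        cut -d ' ' -f 2-
}
```

The NUL-delimited hand-off protects spaces in names; the later newline-based steps still assume names without newlines.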
This recursively changes the modification time of all directories in the current directory to the newest file in each directory:
I prefer this one, it is shorter:
This simple cli will also work:
You may change the -1 to the number of files you want to list
The following command worked on Solaris:
After using a find-based solution for years, I found myself wanting the ability to exclude directories like .git.
I switched to this rsync-based solution. Put this in ~/bin/findlatest:
Now findlatest . will list the 5 most recently modified files, and findlatest --exclude .git . will list the 5 most recent ones excluding those in .git.
This works by taking advantage of some little-used rsync functionality: "if a single source arg is specified [to rsync] without a destination, the files are listed in an output format similar to ls -l" (rsync man page).
The ability to take rsync args is useful in conjunction with rsync-based backup tools. For instance, I use rsnapshot, and I back up an application directory with this rsnapshot.conf line:
where rsync-excludes lists directories I don't want to back up:
I can now see the latest files that will be backed up with:
I found the command above useful, but in my case I needed to see the date and time of the file as well, and I had an issue with several files that have spaces in their names.
Here is my working solution.
I wrote a pypi/github package for this question because I needed a solution as well.
https://github.com/bucknerns/logtail
Install:
Usage: tails changed files
Usage2: Opens latest changed file in editor
Fast, unlimited files supported, and special-characters safe:
Replace the final ls -l with whatever should process the filename.
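A sketch of a NUL-delimited pipeline in that spirit, not necessarily the command the answer used. It assumes GNU find plus GNU coreutils recent enough for the -z options of sort, head and cut; with_newest is a hypothetical name that takes the processing command as its remaining arguments.

```shell
# Run a command on the newest regular file under a directory, keeping every
# record NUL-terminated so spaces and newlines in names are safe.
# Assumes GNU find, sort -z, head -z, cut -z and xargs -0 -r;
# with_newest is a hypothetical helper name.
with_newest() (
    dir=$1
    shift
    find "$dir" -type f -printf '%T@ %p\0' |
        sort -z -n -r |            # newest record first
        head -z -n 1 |             # keep only that record
        cut -z -d ' ' -f 2- |      # drop the timestamp
        xargs -0 -r "$@"           # hand the path to the given command
)
```

For instance, with_newest . ls -l would list the newest file.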