How to recursively find the latest modified file in a directory?

Posted on 2024-10-09 20:55:31

It seems that ls doesn't sort the files correctly when doing a recursive call:

ls -altR . | head -n 3

How can I find the most recently modified file in a directory (including subdirectories)?

Comments (22)

转瞬即逝 2024-10-16 20:55:31
find . -type f -printf '%T@ %p\n' \
| sort -n | tail -1 | cut -f2- -d" "

For a huge tree, it might be hard for sort to keep everything in memory.

%T@ gives you the modification time as a Unix timestamp, sort -n sorts numerically, tail -1 takes the last line (highest timestamp), and cut -f2- -d" " cuts away the first field (the timestamp) from the output.

Edit: Just as -printf is probably GNU-only, ajreal's usage of stat -c is too. Although it is possible to do the same on BSD, the formatting options are different (-f "%m %N", it would seem).

And I missed the plural part; if you want more than the latest file, just bump up the tail argument.
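
To make the last point concrete, here is a minimal sketch that wraps the pipeline above in a function taking the number of files to show. The name newest_files and its two arguments are made up for illustration; it assumes GNU find (for -printf) and file names without embedded newlines. It sorts in reverse and uses head instead of tail, which selects the same files but lists the newest first.

```shell
# Hypothetical helper: print the N most recently modified files under DIR,
# newest first. Assumes GNU find (-printf) and paths without newlines.
newest_files() {
    dir=${1:-.}      # directory to search, defaults to the current one
    count=${2:-1}    # how many files to print, defaults to 1
    find "$dir" -type f -printf '%T@ %p\n' \
        | sort -rn \
        | head -n "$count" \
        | cut -d' ' -f2-
}
```

For example, newest_files /var/log 5 would list the five most recently modified files under /var/log.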

旧人九事 2024-10-16 20:55:31

Following up on @plundra's answer, here's the BSD and OS X version:

find . -type f -print0 \
| xargs -0 stat -f "%m %N" \
| sort -rn | head -1 | cut -f2- -d" "
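
The GNU and BSD variants can be combined in one function that probes which stat is installed. This is only a sketch: the function name newest_file_portable is hypothetical, the probe simply checks whether stat accepts the GNU-style -c option, and it assumes file names contain no newlines.

```shell
# Hypothetical helper: newest file under DIR using whichever stat is present.
newest_file_portable() {
    dir=${1:-.}
    if stat -c '%Y' . >/dev/null 2>&1; then
        fmt_flag=-c; fmt='%Y %n'    # GNU coreutils stat: mtime in seconds, name
    else
        fmt_flag=-f; fmt='%m %N'    # BSD / OS X stat, as in the answer above
    fi
    # "-exec ... {} +" batches many files per stat call, much like xargs.
    find "$dir" -type f -exec stat "$fmt_flag" "$fmt" {} + \
        | sort -rn | head -n 1 | cut -d' ' -f2-
}
```

Note that stat reports whole seconds here, so files modified within the same second are ordered arbitrarily.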
静谧幽蓝 2024-10-16 20:55:31

Instead of sorting the results and keeping only the most recently modified one, you could use awk to print only the one with the greatest modification time (in Unix time):

find . -type f -printf "%T@\0%p\0" | awk '
    {
        if ($0>max) {
            max=$0; 
            getline mostrecent
        } else 
            getline
    } 
    END{print mostrecent}' RS='\0'

This should be a faster way to solve your problem if the number of files is big enough.

I have used the NUL character (i.e. '\0') because, theoretically, a filename may contain any character (including space and newline) except that one.

If you don't have such pathological filenames in your system you can use the newline character as well:

find . -type f -printf "%T@\n%p\n" | awk '
    {
        if ($0>max) {
            max=$0; 
            getline mostrecent
        } else 
            getline
    } 
    END{print mostrecent}' RS='\n'

In addition, this works in mawk too.
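
If the NUL handling is not needed, the same single-pass idea can be written with one line per file and an explicit numeric comparison. A rough sketch, assuming GNU find and no newlines in file names; the function name newest_file_awk is invented here. Converting the timestamp with + 0 uses awk's floating point numbers, so sub-microsecond differences between timestamps may be lost.

```shell
# Hypothetical helper: single pass over "timestamp path" lines, no sorting.
newest_file_awk() {
    find "${1:-.}" -type f -printf '%T@ %p\n' \
        | awk '{
                  t = $1 + 0                    # force a numeric timestamp
                  if (NR == 1 || t > max) {
                      max = t
                      sub(/^[^ ]+ /, "")        # drop the timestamp, keep the path
                      newest = $0
                  }
               }
               END { if (NR) print newest }'
}
```

Because only the current maximum is kept, memory use stays constant regardless of how many files find reports.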

海风掠过北极光 2024-10-16 20:55:31

Shows the latest file with human readable timestamp:

find . -type f -printf '%TY-%Tm-%Td %TH:%TM: %Tz %p\n'| sort -n | tail -n1

Result looks like this:

2015-10-06 11:30: +0200 ./foo/bar.txt

To show more files, replace -n1 with a higher number

知你几分 2024-10-16 20:55:31

This seems to work fine, even with subdirectories:

find . -type f | xargs ls -ltr | tail -n 1

In case of too many files, refine the find.

仅一夜美梦 2024-10-16 20:55:31

I had trouble finding the last modified file under Solaris 10. There, find does not have the -printf option and stat is not available. I discovered the following solution, which works well for me:

find . -type f | sed 's/.*/"&"/' | xargs ls -E | awk '{ print $6," ",$7 }' | sort | tail -1

To show the filename as well, use:

find . -type f | sed 's/.*/"&"/' | xargs ls -E | awk '{ print $6," ",$7," ",$9 }' | sort | tail -1

Explanation

  • find . -type f finds and lists all files
  • sed 's/.*/"&"/' wraps the pathname in quotes to handle whitespaces
  • xargs ls -E sends the quoted path to ls, the -E option makes sure that a full timestamp (format year-month-day hour-minute-seconds-nanoseconds) is returned
  • awk '{ print $6," ",$7 }' extracts only date and time
  • awk '{ print $6," ",$7," ",$9 }' extracts date, time and filename
  • sort returns the files sorted by date
  • tail -1 returns only the last modified file
开始看清了 2024-10-16 20:55:31

I use something similar all the time, as well as the top-k list of most recently modified files. For large directory trees, it can be much faster to avoid sorting. In the case of just top-1 most recently modified file:

find . -type f -printf '%T@ %p\n' | perl -ne '@a=split(/\s+/, $_, 2); ($t,$f)=@a if $a[0]>$t; print $f if eof()'

On a directory containing 1.7 million files, I get the most recent one in 3.4s, a speed-up of 7.5x against the 25.5s solution using sort.

不离久伴 2024-10-16 20:55:31

Using find — with nice & fast time stamp

Here is how to find and list the latest modified files in a directory with subdirectories. Hidden files are ignored on purpose. The time format can be customised.

$ find . -type f -not -path '*/\.*' -printf '%TY-%Tm-%Td %TH:%TM %Ta %p\n' |sort -nr |head -n 10

Result

Handles spaces in file names perfectly well — not that these should be used!

2017-01-25 18:23 Wed ./indenting/Shifting blocks visually.mht
2016-12-11 12:33 Sun ./tabs/Converting tabs to spaces.mht
2016-12-02 01:46 Fri ./advocacy/2016.Vim or Emacs - Which text editor do you prefer?.mht
2016-11-09 17:05 Wed ./Word count - Vim Tips Wiki.mht

晨曦慕雪 2024-10-16 20:55:31

This gives a sorted list:

find . -type f -ls 2>/dev/null | sort -M -k8,10 | head -n5

Reverse the order by placing a '-r' in the sort command. If you only want filenames, insert "awk '{print $11}' |" before '| head'

萌辣 2024-10-16 20:55:31

I find the following shorter and with more interpretable output:

find . -type f -printf '%TF %TT %p\n' | sort | tail -1

Given the fixed length of the standardised ISO format datetimes, lexicographical sorting is fine and we don't need the -n option on the sort.

If you want to remove the timestamps again, you can use:

find . -type f -printf '%TFT%TT %p\n' | sort | tail -1 | cut -f2- -d' '
复古式 2024-10-16 20:55:31

On Ubuntu 13, the following does it, maybe a tad faster, as it reverses the sort and uses 'head' instead of 'tail', reducing the work. To show the 11 newest files in a tree:

find . -type f -printf '%T@ %p\n' | sort -n -r | head -11 | cut -f2- -d" " | sed -e 's,^\./,,' | xargs ls -U -l

This gives a complete ls listing without re-sorting and omits the annoying './' that 'find' puts on every file name.

Or, as a bash function:

treecent () {
  local numl
  if [[ 0 -eq $# ]] ; then
    numl=11   # Or whatever default you want.
  else
    numl=$1
  fi
  find . -type f -printf '%T@ %p\n' | sort -n -r | head -${numl} |  cut -f2- -d" " | sed -e 's,^\./,,' | xargs ls -U -l
}

Still, most of the work was done by plundra's original solution. Thanks plundra.

调妓 2024-10-16 20:55:31

I faced the same issue. I needed to find the most recent file recursively. find took around 50 minutes to finish.

Here is a little script to do it faster:

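#!/bin/sh

CURRENT_DIR='.'

zob () {
    # Newest entry (file or directory) directly inside CURRENT_DIR.
    FILE=$(ls -Art1 "${CURRENT_DIR}" | tail -n 1)
    # If that entry is a directory, descend into it and repeat.
    if [ -n "${FILE}" ] && [ -d "${CURRENT_DIR}/${FILE}" ]; then
        CURRENT_DIR="${CURRENT_DIR}/${FILE}"
        zob
    fi
    echo "${CURRENT_DIR}/${FILE}"
    exit
}
zob

It's a recursive function that gets the most recently modified item of a directory. If this item is a directory, the function is called recursively and searches that directory, and so on. Note that a directory's modification time only changes when entries are added, removed, or renamed in it, so this shortcut can miss a file that was modified in place inside an older directory.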

感性 2024-10-16 20:55:31

To search for files in /target_directory and all its sub-directories, that have been modified in the last 60 minutes:

$ find /target_directory -type f -mmin -60

To find the most recently modified files, sorted in the reverse order of update time (i.e., the most recently updated files first):

$ find /etc -type f -printf '%TY-%Tm-%Td %TT %p\n' | sort -r
清引 2024-10-16 20:55:31

If running stat on each file individually is too slow, you can use xargs to speed things up a bit:

find . -type f -print0 | xargs -0 stat -f "%m %N" | sort -n | tail -1 | cut -f2- -d" " 
南巷近海 2024-10-16 20:55:31

This sets the modification time of each directory in the current directory to that of the newest file found (recursively) inside it:

for dir in */; do find "$dir" -type f -printf '%T@ "%p"\n' | sort -n | tail -1 | cut -f2- -d" " | xargs -I {} touch -r {} "$dir"; done
随梦而飞# 2024-10-16 20:55:31

I prefer this one, it is shorter:

find . -type f -print0|xargs -0 ls -drt|tail -n 1
南街女流氓 2024-10-16 20:55:31

This simple cli will also work:

ls -1t | head -1

You can change the -1 in head -1 to the number of files you want to list. Note that this only looks at the current directory, not its subdirectories.

脱离于你 2024-10-16 20:55:31

The following command worked on Solaris:

find . -name "*zip" -type f | xargs ls -ltr | tail -1 
浅笑依然 2024-10-16 20:55:31

After using a find-based solution for years, I found myself wanting the ability to exclude directories like .git.

I switched to this rsync-based solution. Put this in ~/bin/findlatest:

#!/bin/sh
# Finds most recently modified files.
rsync -rL --list-only "$@" | grep -v '^d' | sort -k3,4r | head -5

Now findlatest . will list the 5 most recently modified files, and findlatest --exclude .git . will list the 5 most recent ones excluding those in .git.

This works by taking advantage of some little-used rsync functionality: "if a single source arg is specified [to rsync] without a destination, the files are listed in an output format similar to ls -l" (rsync man page).

The ability to take rsync args is useful in conjunction with rsync-based backup tools. For instance, I use rsnapshot, and I back up an application directory with this rsnapshot.conf line:

backup  /var/atlassian/application-data/jira/current/   home    +rsync_long_args=--archive --filter="merge /opt/atlassian/jira/current/backups/rsync-excludes"

where rsync-excludes lists directories I don't want to back up:

- log/
- logs/
- analytics-logs/
- tmp/
- monitor/*.rrd4j

I can now see the latest files that will be backed up with:

findlatest /var/atlassian/application-data/jira/current/ --filter="merge /opt/atlassian/jira/current/backups/rsync-excludes"
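
For comparison, a find-only way to skip a directory such as .git is the -prune action. This is an alternative sketch, not the rsync approach described above: the function name newest_excluding and its arguments are hypothetical, and it assumes GNU find and file names without newlines.

```shell
# Hypothetical helper: 5 newest files under DIR, skipping anything named SKIP.
newest_excluding() {
    dir=$1     # directory to search
    skip=$2    # name to prune, e.g. .git
    # Entries named $skip are pruned (not descended into, not printed);
    # every other regular file is printed with its timestamp.
    find "$dir" -name "$skip" -prune -o -type f -printf '%T@ %p\n' \
        | sort -rn | head -n 5 | cut -d' ' -f2-
}
```

For example, newest_excluding . .git lists the five most recently modified files outside any .git directory.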
暮色兮凉城 2024-10-16 20:55:31

I found the command above useful, but in my case I also needed to see the date and time of the file, and I had an issue with several files that have spaces in their names.
Here is my working solution.

find . -type f -printf '%T@ %p\n' | sort -n | tail -1 | cut -f2- -d" " | sed 's/.*/"&"/' | xargs ls -l
娇妻 2024-10-16 20:55:31

I wrote a PyPI/GitHub package for this question because I needed a solution as well.

https://github.com/bucknerns/logtail

Install:

pip install logtail

Usage: tails changed files

logtail <log dir> [<glob match: default=*.log>]

Usage2: Opens latest changed file in editor

editlatest <log dir> [<glob match: default=*.log>]
秋意浓 2024-10-16 20:55:31

Fast, unlimited files supported, and special-characters safe:

find . -type f -printf '%T@ %p\0' \
| sort -zn | tail -zn1 | cut -zc 23- | xargs --null ls -l

Replace the final ls -l with whatever should process the filename.
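
The cut -zc 23- step relies on the timestamp printed by %T@ being exactly 22 characters wide, including the trailing space. A variant that splits on the first space instead might look like the sketch below; it assumes a GNU coreutils version whose sort, head and cut accept -z, and the function name newest_file_nul is made up for this example.

```shell
# Hypothetical helper: emit the newest file under DIR as one NUL-terminated
# path, so names with spaces or newlines survive intact.
newest_file_nul() {
    find "${1:-.}" -type f -printf '%T@ %p\0' \
        | sort -z -rn \
        | head -z -n 1 \
        | cut -z -d ' ' -f 2-
}
```

The output is meant to be consumed by a NUL-aware tool, for instance newest_file_nul . | xargs -0 ls -l.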
