Python FTP: get the most recent file by date

Posted on 2024-12-28 12:09:30

I am using ftplib to connect to an FTP site. I want to find the most recently uploaded file and download it. I can connect to the FTP server and list the files, and I have put the listing lines into a list and converted the date field. Is there a function or module that can pick the most recent date and output the whole line from the list?

#!/usr/bin/env python3

import ftplib
import os
import socket
import sys
import time  # needed for time.strptime below


HOST = 'test'


def main():
    try:
        f = ftplib.FTP(HOST)
    except (socket.error, socket.gaierror):
        print('cannot reach %s' % HOST)
        return
    print("Connected to ftp server")

    try:
        f.login('anonymous', '[email protected]')
    except ftplib.error_perm:
        print('cannot login anonymously')
        f.quit()
        return
    print("logged on to the ftp server")

    # Collect the raw LIST output, one line per directory entry.
    data = []
    f.dir(data.append)
    for line in data:
        # The first two columns hold the date and the time.
        datestr = ' '.join(line.split()[0:2])
        # %I (12-hour clock) is required for %p (AM/PM) to take effect.
        orig_date = time.strptime(datestr, '%d-%m-%y %I:%M%p')

    f.quit()
    return


if __name__ == '__main__':
    main()

RESOLVED:

# Fragment from inside main(); it also needs "import time" at the top of the script.
data = []
f.dir(data.append)
datelist = []
filelist = []
for line in data:
    col = line.split()
    datestr = ' '.join(col[0:2])
    # %I (12-hour clock) is required for %p (AM/PM) to take effect.
    date = time.strptime(datestr, '%m-%d-%y %I:%M%p')
    datelist.append(date)
    filelist.append(col[3])  # breaks if the file name contains spaces

combo = zip(datelist, filelist)
who = dict(combo)

for key in sorted(who, reverse=True):
    print("%s: %s" % (key, who[key]))
    filename = who[key]
    print("file to download is %s" % filename)
    try:
        # The with block closes the local file even if the transfer fails.
        with open(filename, 'wb') as out:
            f.retrbinary('RETR %s' % filename, out.write)
    except ftplib.error_perm:
        print("Error: cannot read file %s" % filename)
        os.unlink(filename)
    else:
        print("***Downloaded*** %s " % filename)
    return  # stop after the first (newest) entry

f.quit()
return

One problem: is it possible to retrieve the first element from the dictionary? What I did here is let the for loop run only once and exit, which gives me the first sorted value. That works, but I don't think it is good practice.
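
One way to avoid the single-pass loop, as a minimal sketch: keep the parsed dates and file names as pairs and let max() pick the newest. The sample rows and the newest() helper are made up for illustration, and the format string assumes the MM-DD-YY listing shown above.

```python
import time

# Hypothetical (date string, file name) pairs standing in for parsed listing lines.
rows = [
    ("12-24-12 01:23AM", "a.zip"),
    ("01-16-12 06:35PM", "b.zip"),
    ("12-15-12 10:41PM", "c.zip"),
]


def newest(rows, fmt="%m-%d-%y %I:%M%p"):
    """Return the (struct_time, name) pair with the most recent date."""
    parsed = [(time.strptime(date, fmt), name) for date, name in rows]
    # struct_time compares field by field (year, month, day, ...),
    # so max() on the first tuple element picks the latest timestamp.
    return max(parsed, key=lambda pair: pair[0])


latest_time, latest_name = newest(rows)
print(latest_name)
```

Keeping pairs instead of a dict also avoids losing a file when two entries share the same timestamp, since dict keys must be unique.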

Comments (5)

同展鸳鸯锦 2025-01-04 12:09:30

For those looking for a full solution for finding the latest file in a folder:

MLSD

If your FTP server supports the MLSD command, the solution is easy:

entries = list(ftp.mlsd())
entries.sort(key = lambda entry: entry[1]['modify'], reverse = True)
latest_name = entries[0][0]
print(latest_name)
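
MLSD listings can also contain directories and the "." and ".." entries, which the snippet above would sort together with the files. The following is a sketch only, assuming the server reports the type and modify facts; latest_file() and the sample entries are made up for illustration.

```python
def latest_file(entries):
    """Pick the newest regular file from (name, facts) pairs as yielded by FTP.mlsd()."""
    files = [
        (name, facts) for name, facts in entries
        if facts.get("type") == "file" and "modify" in facts
    ]
    if not files:
        return None
    # 'modify' is a YYYYMMDDHHMMSS[.sss] string, so string order is time order.
    return max(files, key=lambda item: item[1]["modify"])[0]


# Made-up entries in the shape mlsd() produces; a real call might look like
# latest_file(ftp.mlsd(facts=["type", "modify"])).
sample = [
    (".", {"type": "cdir", "modify": "20180701000000"}),
    ("file1.zip", {"type": "file", "modify": "20180327000000"}),
    ("file2.zip", {"type": "file", "modify": "20180618153100"}),
]
print(latest_file(sample))
```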

LIST

If you need to rely on the obsolete LIST command, you have to parse the proprietary listing it returns.

A common *nix listing looks like this:

-rw-r--r-- 1 user group           4467 Mar 27  2018 file1.zip
-rw-r--r-- 1 user group         124529 Jun 18 15:31 file2.zip

With a listing like this, this code will do:

from dateutil import parser

# ...

lines = []
ftp.dir("", lines.append)

latest_time = None
latest_name = None

for line in lines:
    tokens = line.split(maxsplit = 8)  # 9 fields; the last one keeps spaces in the file name
    time_str = tokens[5] + " " + tokens[6] + " " + tokens[7]
    time = parser.parse(time_str)
    if (latest_time is None) or (time > latest_time):
        latest_name = tokens[8]
        latest_time = time

print(latest_name)

This is a rather fragile approach.
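
Part of the fragility is that recent entries show a time while older ones show a year. The sketch below handles both forms with the standard library only. It assumes the nine-column layout shown above and English month abbreviations; parse_unix_list_line() and latest_entry() are illustrative names, not part of ftplib.

```python
from datetime import datetime


def parse_unix_list_line(line, now):
    """Return (modified, name) for one Unix-style LIST line, or None if it does not fit."""
    tokens = line.split(maxsplit=8)
    if len(tokens) < 9:
        return None  # e.g. a leading "total 12" line
    month, day, year_or_time, name = tokens[5], tokens[6], tokens[7], tokens[8]
    if ":" in year_or_time:
        # Recent entries carry a time but no year: assume the current year and
        # step back one year if that would place the timestamp in the future.
        stamp = datetime.strptime(
            f"{now.year} {month} {day} {year_or_time}", "%Y %b %d %H:%M"
        )
        if stamp > now:
            stamp = stamp.replace(year=now.year - 1)
    else:
        stamp = datetime.strptime(f"{year_or_time} {month} {day}", "%Y %b %d")
    return stamp, name


def latest_entry(lines, now):
    """Return the newest (modified, name) pair among the parseable lines."""
    entries = [e for e in (parse_unix_list_line(l, now) for l in lines) if e]
    return max(entries, key=lambda e: e[0]) if entries else None


sample = [
    "-rw-r--r-- 1 user group           4467 Mar 27  2018 file1.zip",
    "-rw-r--r-- 1 user group         124529 Jun 18 15:31 file2.zip",
]
# "now" is passed in explicitly so the year guess is reproducible.
print(latest_entry(sample, datetime(2018, 7, 1)))
```

In real use, `now` would be `datetime.now()`, and the year guess can still be off if the server's clock or time zone differs from the client's.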


MDTM

A more reliable, but far less efficient, approach is to use the MDTM command to retrieve the timestamps of individual files/folders:

names = ftp.nlst()

latest_time = None
latest_name = None

for name in names:
    time = ftp.voidcmd("MDTM " + name)
    if (latest_time is None) or (time > latest_time):
        latest_name = name
        latest_time = time

print(latest_name)

For an alternative version of the code, see the answer by @Paulo.
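
The loop above compares the raw reply strings, which works because a successful MDTM reply has the fixed form "213 YYYYMMDDHHMMSS". If an actual datetime is needed, a small helper can convert the reply. This is a sketch; parse_mdtm_reply() is a made-up name, not an ftplib function.

```python
from datetime import datetime


def parse_mdtm_reply(reply):
    """Convert a reply such as '213 20180327123456' to a naive datetime.

    RFC 3659 defines MDTM time values as UTC.
    """
    code, _, value = reply.partition(" ")
    if code != "213":
        raise ValueError(f"unexpected MDTM reply: {reply!r}")
    # Some servers append fractional seconds ('.123'); drop them.
    return datetime.strptime(value.strip().split(".")[0], "%Y%m%d%H%M%S")


# Possible use with a connected ftplib.FTP object named ftp:
#   latest_name = max(ftp.nlst(),
#                     key=lambda n: parse_mdtm_reply(ftp.voidcmd("MDTM " + n)))
print(parse_mdtm_reply("213 20180327123456"))
```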


Non-standard -t switch

Some FTP servers support a proprietary, non-standard -t switch for the NLST (or LIST) command.

lines = ftp.nlst("-t")

latest_name = lines[0]  # -t sorts newest first, as "ls -t" does

See How to get files in FTP folder sorted by modification time.


Downloading the found file

Whichever approach you use, once you have latest_name, you download it like any other file:

with open(latest_name, 'wb') as f:
    ftp.retrbinary('RETR '+ latest_name, f.write)
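
If the transfer fails partway, the snippet above leaves an empty or partial local file behind. A hedged variant that cleans up on error might look like this; download() is an illustrative helper, and catching ftplib.all_errors is one possible choice rather than the only one.

```python
import ftplib
import os


def download(ftp, remote_name, local_name=None):
    """Download remote_name via RETR; remove the partial local file on failure."""
    local_name = local_name or os.path.basename(remote_name)
    try:
        with open(local_name, "wb") as out:
            ftp.retrbinary("RETR " + remote_name, out.write)
    except ftplib.all_errors:
        # Do not leave a truncated file behind, then let the caller see the error.
        if os.path.exists(local_name):
            os.remove(local_name)
        raise
    return local_name


# Possible use with a connected ftplib.FTP object named ftp:
#   download(ftp, latest_name)
```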


£噩梦荏苒 2025-01-04 12:09:30

Why not use the following dir option?

ftp.dir('-t',data.append)

With this option, the file listing is ordered by time from newest to oldest. Then just take the first file in the list and download it.

残月升风 2025-01-04 12:09:30

With NLST, as shown in Martin Prikryl's response, you can use the built-in sorted function:

from ftplib import FTP

ftp = FTP(host="127.0.0.1", user="u", passwd="p")
ftp.cwd("/data")
file_name = sorted(ftp.nlst(), key=lambda x: ftp.voidcmd(f"MDTM {x}"))[-1]

耳钉梦 2025-01-04 12:09:30

If you have all the dates as time.struct_time values (strptime will give you this) in a list, then all you have to do is sort the list.

Here's an example:

#!/usr/bin/python

import time

dates = [
    "Jan 16 18:35 2012",
    "Aug 16 21:14 2012",
    "Dec 05 22:27 2012",
    "Jan 22 19:42 2012",
    "Jan 24 00:49 2012",
    "Dec 15 22:41 2012",
    "Dec 13 01:41 2012",
    "Dec 24 01:23 2012",
    "Jan 21 00:35 2012",
    "Jan 16 18:35 2012",
]

def main():
    datelist = []
    for date in dates:
        date = time.strptime(date, '%b %d %H:%M %Y')
        datelist.append(date)

    print(datelist)
    datelist.sort()  # struct_time values compare chronologically
    print(datelist)

if __name__ == '__main__':
    main()
温柔嚣张 2025-01-04 12:09:30

I don't know how your FTP server is set up, but your example did not work for me. I changed some lines related to the date sorting part:

    import sys
    import ftplib  # needed for ftplib.error_perm below
    from ftplib import FTP
    import os
    import socket
    import time

    # Connects to the ftp; ftpHost, yourUserName and yourPassword are placeholders.
    ftp = FTP(ftpHost)
    ftp.login(yourUserName, yourPassword)
    data = []
    datelist = []
    filelist = []
    ftp.dir(data.append)
    for line in data:
      col = line.split()
      # Unix-style listing: columns 5-7 hold month, day and time.
      datestr = ' '.join(col[5:8])
      # The listing has no year, so strptime defaults to 1900; entries that
      # show a year instead of a time will not match this format.
      date = time.strptime(datestr, '%b %d %H:%M')
      datelist.append(date)
      filelist.append(col[8])
    combo = zip(datelist, filelist)
    who = dict(combo)
    # Downloads every file, newest first.
    for key in sorted(who, reverse=True):
      print("%s: %s" % (key, who[key]))
      filename = who[key]
      print("file to download is %s" % filename)
      try:
        with open(filename, 'wb') as out:
          ftp.retrbinary('RETR %s' % filename, out.write)
      except ftplib.error_perm:
        print("Error: cannot read file %s" % filename)
        os.unlink(filename)
      else:
        print("***Downloaded*** %s " % filename)
    ftp.quit()