Traversing an FTP listing

Posted on 2024-08-13 20:03:36

I am trying to get the names of all directories on an FTP server and store them in hierarchical order in a multidimensional list or dict.

So for example, a server that contains the following structure:

/www/
    mysite.com
        images
            png
            jpg

At the end of the script, I would get a list such as:

['/www/',
  ['mysite.com',
    ['images',
      ['png'],
      ['jpg']
    ]
  ]
]

I have tried using a recursive function like so:

def traverse(dir):
    FTP.dir(dir, traverse)

FTP.dir passes lines in this format to the callback:

drwxr-xr-x    5 leavesc1 leavesc1     4096 Nov 29 20:52 mysite.com

So slicing with line[56:] gives me just the directory name (mysite.com), which I use in the recursive function.
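
A fixed offset such as line[56:] stops working as soon as the owner, group, or size columns change width. Below is a minimal sketch of a less brittle split, assuming the server emits Unix `ls -l` style lines with nine whitespace-separated fields like the sample above. The helper name parse_list_line is hypothetical, and servers that use other listing formats are not handled.

```python
def parse_list_line(line):
    """Split one Unix-style LIST line into (is_dir, name).

    Assumes the 9-field `ls -l` layout shown in the question; other
    server formats (for example DOS-style listings) are not handled.
    """
    # maxsplit=8 keeps names that contain spaces intact in the last field.
    fields = line.split(None, 8)
    if len(fields) < 9:
        raise ValueError('unrecognised LIST line: %r' % line)
    # The first character of the permission string is 'd' for directories.
    return fields[0].startswith('d'), fields[8]


sample = 'drwxr-xr-x    5 leavesc1 leavesc1     4096 Nov 29 20:52 mysite.com'
print(parse_list_line(sample))
```

Splitting on whitespace rather than on a column number tolerates different column widths, but it still depends on the server's output format.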

But I cannot get it to work, despite trying many different approaches. I also get lots of FTP errors: sometimes the directory cannot be found (a logic issue on my side), and sometimes the server returns unexpected errors that leave no log, so I cannot debug them.

Bottom-line question:
How do I get a hierarchical directory listing from an FTP server?

Comments (5)

街角迷惘 2024-08-20 20:03:36

Here is a naive and slow implementation. It is slow because it tries to CWD to each directory entry to determine if it is a directory or a file, but this works. One could optimize it by parsing LIST command output, but this is strongly server-implementation dependent.

import ftplib

def traverse(ftp, depth=0):
    """
    Return a recursive listing of an FTP server's contents (starting
    from the current directory).

    The listing is returned as a nested dictionary: each key maps to the
    contents of that subdirectory, or to None if the entry is a file.

    @param ftp: ftplib.FTP object
    """
    if depth > 10:
        return ['depth > 10']
    level = {}
    try:
        names = ftp.nlst()
    except ftplib.error_perm:
        # Some servers answer 550 when the directory is empty.
        return level
    for entry in (path for path in names if path not in ('.', '..')):
        try:
            ftp.cwd(entry)
        except ftplib.error_perm:
            # CWD failed, so treat the entry as a file.
            level[entry] = None
            continue
        level[entry] = traverse(ftp, depth + 1)
        ftp.cwd('..')
    return level

def main():
    ftp = ftplib.FTP("localhost")  # passing a host connects immediately
    ftp.login()
    ftp.set_pasv(True)

    print(traverse(ftp))

if __name__ == '__main__':
    main()
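
The question asks for a nested list rather than a dictionary. As a rough sketch, the dictionary returned by traverse could be converted like this; the helper name to_nested_list is hypothetical, and the sample data is made up to mirror the structure in the question.

```python
def to_nested_list(name, tree):
    """Convert a {name: subtree-or-None} dict into [name, child, ...].

    Entries whose value is not a dict (files, stored as None) are
    skipped, since the question asks only for directories.
    """
    node = [name]
    for child, subtree in sorted(tree.items()):
        if isinstance(subtree, dict):
            node.append(to_nested_list(child, subtree))
    return node


# Made-up sample shaped like the output of traverse().
listing = {'mysite.com': {'images': {'png': {}, 'jpg': {}, 'logo.gif': None}}}
print(to_nested_list('/www/', listing))
```

Children are sorted by name here so that the result is deterministic; drop the sorted() call to keep the server's order.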
箜明 2024-08-20 20:03:36

Here's a first draft of a Python 3 script that worked for me. It's much faster than calling cwd() on every entry. Pass in the server, port, directory, username, and password as arguments. Returning the output as a list is left as an exercise for the reader.

import ftplib
import sys

def ftp_walk(ftp, dir):
    dirs = []
    nondirs = []
    for name, facts in ftp.mlsd(dir):
        kind = facts.get('type')
        if kind in ('cdir', 'pdir'):
            # Skip the entries for the listed directory itself and its parent.
            continue
        if kind == 'dir':
            dirs.append(name)
        else:
            nondirs.append(name)
    if nondirs:
        print()
        print('{}:'.format(dir))
        print('\n'.join(sorted(nondirs)))
    for subdir in sorted(dirs):
        ftp_walk(ftp, '{}/{}'.format(dir, subdir))

# Usage: script.py SERVER PORT DIRECTORY USERNAME PASSWORD
ftp = ftplib.FTP()
ftp.connect(sys.argv[1], int(sys.argv[2]))
ftp.login(sys.argv[4], sys.argv[5])
ftp_walk(ftp, sys.argv[3])
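
For the hierarchical structure the question asks for, the same MLSD approach can return a nested dictionary instead of printing. This is only a sketch: the function name mlsd_tree is hypothetical, it assumes the server supports MLSD, and it records directories only.

```python
def mlsd_tree(ftp, path):
    """Return {dirname: subtree} for every directory below `path`.

    `ftp` is expected to be a connected ftplib.FTP object (or anything
    with a compatible mlsd() method).
    """
    tree = {}
    for name, facts in ftp.mlsd(path):
        # Only 'dir' entries are followed; 'cdir' and 'pdir' (the listed
        # directory itself and its parent) and files are ignored.
        if facts.get('type') == 'dir':
            child = '{}/{}'.format(path.rstrip('/'), name)
            tree[name] = mlsd_tree(ftp, child)
    return tree
```

With a logged-in connection, calling mlsd_tree(ftp, '/www') on the layout from the question would be expected to yield a dictionary keyed by 'mysite.com', with 'images' and its subdirectories nested underneath.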
懷念過去 2024-08-20 20:03:36

You're not going to like this, but "it depends on the server" or, more accurately, "it depends on the output format of the server".

Different servers can be configured to produce different listing output, so your initial proposal is bound to fail in the general case.

The "naive and slow implementation" above will cause enough errors that some FTP servers will cut you off (which is probably what happened after about 7 of them...).

So尛奶瓶 2024-08-20 20:03:36

If the server supports the MLSD command, then use the “a directory and its descendants” code from that answer.
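
One way to find out whether a server advertises MLSD is to ask it with the FEAT command. The sketch below is an assumption-laden illustration, not part of the original answer: the helper name supports_mlsd is hypothetical, and it relies on servers listing an MLST (or MLSD) feature line in their FEAT reply, which not every server does even when MLSD works.

```python
import ftplib


def supports_mlsd(ftp):
    """Return True if the server's FEAT reply advertises MLST/MLSD."""
    try:
        reply = ftp.sendcmd('FEAT')
    except ftplib.error_perm:
        # Servers that do not implement FEAT reject the command.
        return False
    features = [line.strip().upper() for line in reply.splitlines()]
    return any(f.startswith(('MLST', 'MLSD')) for f in features)
```

If the check returns False, falling back to NLST plus CWD probing, as in the slower answer, remains an option.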

静若繁花 2024-08-20 20:03:36

If we are using Python, look at:

http://docs.python.org/library/os.path.html (os.path.walk)

If there already is a good module for this, don't reinvent the wheel. Can't believe the post two spots above got two ups, anyway, enjoy.
