Resume FTP download after timeout

Posted on 2024-11-27 21:57:06

I'm downloading files from a flaky FTP server that often times out during file transfer and I was wondering if there was a way to reconnect and resume the download. I'm using Python's ftplib. Here is the code that I am using:

#! /usr/bin/python

import ftplib
import os
import socket
import sys

#--------------------------------#
# Define parameters for ftp site #
#--------------------------------#
site           = 'a.really.unstable.server'
user           = 'anonymous'
password       = '[email protected]'
root_ftp_dir   = '/directory1/'
root_local_dir = '/directory2/'

#---------------------------------------------------------------
# Tuple of order numbers to download. Each web request generates
# an order number
#---------------------------------------------------------------
order_num = ('1','2','3','4')

#----------------------------------------------------------------#
# Loop through each order. Connect to server on each loop. There #
# might be a time out for the connection therefore reconnect for #
# every new ordernumber                                          #
#----------------------------------------------------------------#
# First change local directory
os.chdir(root_local_dir)

# Begin loop through 
for order in order_num:
    
    print 'Begin Processing order number %s' %order
    
    # Connect to FTP site
    try:
        ftp = ftplib.FTP( host=site, timeout=1200 )
    except (socket.error, socket.gaierror), e:
        print 'ERROR: Unable to reach "%s"' %site
        sys.exit()
    
    # Login
    try:
        ftp.login(user,password)
    except ftplib.error_perm:
        print 'ERROR: Unable to login'
        ftp.quit()
        sys.exit()
     
    # Change remote directory to location of order
    try:
        ftp.cwd(root_ftp_dir+order)
    except ftplib.error_perm:
        print 'Unable to CD to "%s"' %(root_ftp_dir+order)
        sys.exit()

    # Get a list of files
    try:
        filelist = ftp.nlst()
    except ftplib.error_perm:
        print 'Unable to get file list from "%s"' %order
        sys.exit()
    
    #---------------------------------#
    # Loop through files and download #
    #---------------------------------#
    for each_file in filelist:
        
        file_local = open(each_file,'wb')
        
        try:
            ftp.retrbinary('RETR %s' %each_file, file_local.write)
            file_local.close()
        except ftplib.error_perm:
            print 'ERROR: cannot read file "%s"' %each_file
            os.unlink(each_file)
        
    ftp.quit()
    
    print 'Finished Processing order number %s' %order
    
sys.exit()

The error that I get:

socket.error: [Errno 110] Connection timed out

Any help is greatly appreciated.

Comments (3)

注定孤独终老 2024-12-04 21:57:07

To do this, you would have to keep the interrupted download, then figure out which parts of the file you are missing, download those parts and then connect them together. I'm not sure how to do this, but there is a download manager for Firefox and Chrome called DownThemAll that does this. Although the code is not written in python (I think it's JavaScript), you could look at the code and see how it does this.

DownThemAll - http://www.downthemall.net/
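With ftplib, the "figure out which part is missing" step can be as simple as looking at the size of the partial file on disk. The following is a minimal sketch, not a tested solution for the server in the question: `resume_offset` and `fetch_remaining` are hypothetical helper names, and it assumes the server honours the `REST` command for binary stream transfers (many do, but not all).

```python
import os


def resume_offset(local_path):
    """Return how many bytes are already on disk (0 if the file is absent)."""
    return os.path.getsize(local_path) if os.path.exists(local_path) else 0


def fetch_remaining(ftp, remote_name, local_path):
    """Append the missing tail of remote_name to local_path.

    `ftp` is a connected, logged-in ftplib.FTP instance (or any object
    with a compatible retrbinary() method). Returns the offset used.
    """
    offset = resume_offset(local_path)
    # Append mode keeps the bytes that were already downloaded.
    with open(local_path, 'ab') as f:
        # With rest=None the transfer starts at byte 0; otherwise ftplib
        # sends "REST <offset>" before "RETR" so the server skips ahead.
        ftp.retrbinary('RETR ' + remote_name, f.write, rest=offset or None)
    return offset
```

For this to work, the partial file has to be kept after a failed transfer rather than deleted, and the helper is called again on a fresh connection after each timeout.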

梦纸 2024-12-04 21:57:06

Resuming a download through FTP using only standard facilities (see RFC959) requires use of the block transmission mode (section 3.4.2), which can be set using the MODE B command. Block mode is not part of the minimum implementation the specification requires (section 5.1 only mandates stream mode), so not all FTP server software implements it.

In the block transmission mode, as opposed to the stream transmission mode, the server sends the file in chunks, each of which has a marker. This marker may be re-submitted to the server to restart a failed transfer (section 3.5).

The specification says:

[...] a restart procedure is provided to protect users from gross system failures (including failures of a host, an FTP-process, or the underlying network).

However, AFAIK, the specification does not define a required lifetime for markers. It only says the following:

The marker information has meaning only to the sender, but must consist of printable characters in the default or negotiated language of the control connection (ASCII or EBCDIC). The marker could represent a bit-count, a record-count, or any other information by which a system may identify a data checkpoint. The receiver of data, if it implements the restart procedure, would then mark the corresponding position of this marker in the receiving system, and return this information to the user.

It should be safe to assume that servers implementing this feature will provide markers that are valid between FTP sessions, but your mileage may vary.
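Beyond RFC 959, RFC 3659 later defined `REST` for stream mode, where the marker is simply a byte offset; servers that support it list `REST STREAM` in their `FEAT` reply. A rough sketch of checking for that with ftplib follows. The function names are hypothetical, and treating a rejected `FEAT` as "no support" is an assumption made for simplicity, since such a server might still accept `REST`.

```python
import ftplib


def advertises_rest_stream(feat_reply):
    """Return True if a FEAT reply text lists the 'REST STREAM' feature."""
    for line in feat_reply.splitlines():
        if line.strip().upper() == 'REST STREAM':
            return True
    return False


def server_supports_resume(ftp):
    """Ask a connected ftplib.FTP-like object whether it lists REST STREAM."""
    try:
        # sendcmd() returns the full (possibly multi-line) reply as a string.
        reply = ftp.sendcmd('FEAT')
    except ftplib.error_perm:
        # 5xx reply: FEAT is not implemented, so support is unknown.
        # This sketch conservatively reports False in that case.
        return False
    return advertises_rest_stream(reply)
```

If the check succeeds, byte-offset resume in ordinary stream mode is available and block mode is not needed.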

绾颜 2024-12-04 21:57:06

A simple example for implementing a resumable FTP download using Python ftplib:

import time
from ftplib import FTP

# host, user and passwd are assumed to be defined elsewhere.
ftp = None
finished = False

with open('bigfile', 'wb') as f:
    while not finished:
        if ftp is None:
            print("Connecting...")
            ftp = FTP(host, user, passwd)

        try:
            # Bytes written so far; used as the restart offset.
            rest = f.tell()
            if rest == 0:
                rest = None
                print("Starting new transfer...")
            else:
                print(f"Resuming transfer from {rest}...")
            ftp.retrbinary('RETR bigfile', f.write, rest=rest)
            print("Done")
            finished = True
        except Exception as e:
            # Drop the connection object so the next pass reconnects.
            ftp = None
            sec = 5
            print(f"Transfer failed: {e}, will retry in {sec} seconds...")
            time.sleep(sec)

More fine-grained exception handling is advisable.
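One possible way to make the handling finer-grained is to retry only on errors that a reconnect can plausibly fix. The sketch below is an illustration under assumptions, not part of the original answer: `is_retryable` is a hypothetical helper, and which errors count as transient is a judgment call for your server.

```python
import ftplib


def is_retryable(exc):
    """Decide whether reconnecting and resuming is worth attempting."""
    # Permanent 5xx replies (missing file, bad login) will not go away.
    if isinstance(exc, ftplib.error_perm):
        return False
    # Transient 4xx replies, a connection closed by the server (EOFError),
    # and socket-level failures such as timeouts (subclasses of OSError).
    return isinstance(exc, (ftplib.error_temp, EOFError, OSError))
```

In the loop above, the `except` branch could call `is_retryable(e)` and re-raise when it returns False, so that a missing file does not cause endless retries.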

Similarly for uploads:
Handling disconnects in Python ftplib FTP transfers file upload
