Python HTTPConnection file send with httplib: retrieving progress

Posted 2024-11-16 11:09:09

In a Django app I'm using a third-party Python script that lets users upload files to blip.tv through httplib.HTTPConnection.send on an EC2 instance. Since these files are generally large, I will process the upload asynchronously with a message queue (RabbitMQ/Celery) and give the user feedback on the progress in the frontend.

The HTTPConnection setup and the send happen in this part of the script:

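import httplib
import urlparse

# Excerpt from inside the script's upload function (hence the bare return).
host, selector = urlparts = urlparse.urlsplit(url)[1:3]
h = httplib.HTTPConnection(host)
h.putrequest("POST", selector)
h.putheader("content-type", content_type)
h.putheader("content-length", str(len(data)))
h.endheaders()
h.send(data)  # blocks until the whole body has been written
response = h.getresponse()
return response.status, response.reason, response.read()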

getresponse() returns only after the file transfer has completed. How do I write out the progress along the way (assuming something with stdout.write) so that I can write this value to the cache framework for display (djangosnippets 678/679)? Alternatively, if there is a better practice for this, I'm all ears!
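One way to get intermediate progress out of a single blocking send is to hand the body over in chunks and report after each one. The following is a minimal sketch, not part of the original script: the helper name send_with_progress, its parameters, and the chunk size are all made up for illustration. It takes the send callable as an argument, so h.send from the snippet above could be passed in. Note that it only measures bytes handed to the socket, not bytes the server has acknowledged.

```python
def send_with_progress(send, data, on_progress, chunk_size=8192):
    """Pass data to send() in chunks and report progress after each chunk.

    send        -- any callable that accepts a bytes chunk (e.g. h.send)
    on_progress -- hypothetical callback, called as on_progress(sent, total)
    """
    total = len(data)
    sent = 0
    while sent < total:
        chunk = data[sent:sent + chunk_size]
        send(chunk)
        sent += len(chunk)
        on_progress(sent, total)
    return sent


def report(sent, total):
    # Stand-in for writing the value to stdout or to a cache.
    print("sent %d of %d bytes (%.1f%%)" % (sent, total, 100.0 * sent / total))


# Demonstration with a list standing in for the connection.
received = []
send_with_progress(received.append, b"x" * 20000, report)
```

In the script, the call would replace h.send(data), for example send_with_progress(h.send, data, report), with endheaders() before it and getresponse() after it unchanged.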

EDIT:

I have since switched to urllib2 and used a tip from this question to override the file's read() in order to get the upload progress. I'm also using poster to generate the multipart-encoded request body. Here's the latest code:

import os
import urllib2

from poster.encode import multipart_encode
from poster.streaminghttp import register_openers

def Upload(video_id, username, password, title, description, filename):

    class Progress(object):
        def __init__(self):
            self._seen = 0.0

        def update(self, total, size, name):
            self._seen += size
            pct = (self._seen / total) * 100.0
            print '%s progress: %.2f' % (name, pct)

    class file_with_callback(file):
        def __init__(self, path, mode, callback, *args):
            file.__init__(self, path, mode)
            self.seek(0, os.SEEK_END)
            self._total = self.tell()
            self.seek(0)
            self._callback = callback
            self._args = args

        def __len__(self):
            return self._total

        def read(self, size=-1):
            data = file.read(self, size)
            self._callback(self._total, len(data), *self._args)
            return data

    progress = Progress()
    stream = file_with_callback(filename, 'rb', progress.update, filename)

    datagen, headers = multipart_encode({
        "post": "1",
        "skin": "xmlhttprequest",
        "userlogin": "%s" % username,
        "password": "%s" % password,
        "item_type": "file",
        "title": "%s" % title.encode("utf-8"),
        "description": "%s" % description.encode("utf-8"),
        "file": stream,
    })

    opener = register_openers()

    req = urllib2.Request(UPLOAD_URL, datagen, headers)
    response = urllib2.urlopen(req)
    return response.read()

This works only for file paths, though, and not for an InMemoryUploadedFile that comes from a form input (request.FILES). I suppose that is because file_with_callback tries to open a path, while the upload is already held in memory as an object. I get a TypeError on the line "stream = file_with_callback(filename, 'rb', progress.update, filename)":

coercing to Unicode: need string or buffer, InMemoryUploadedFile found

How can I get the same progress reporting with a user-uploaded file? Also, will reading out the progress like this consume a lot of memory? Perhaps an upload counterpart of this download-progress solution for urllib2 would work better, but I don't know how to implement it. Any help is very welcome.
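The TypeError comes from handing an already-open upload object to a class that expects a path. One possible approach is to wrap the object instead of subclassing file. The sketch below makes several assumptions: the class name ProgressReader is invented here, the wrapped object is assumed to expose read(), seek() and tell(), and whether poster accepts such a wrapper as its "file" value has not been verified. It calls the callback with the same argument order as Progress.update in the question (total, size, name).

```python
import io
import os


class ProgressReader(object):
    """Wrap a seekable file-like object and report each read via a callback."""

    def __init__(self, fileobj, callback, name=None):
        self._fileobj = fileobj
        self._callback = callback
        self.name = name or getattr(fileobj, "name", "upload")
        # Measure the size by seeking to the end, then rewind to the start.
        fileobj.seek(0, os.SEEK_END)
        self._total = fileobj.tell()
        fileobj.seek(0)

    def __len__(self):
        return self._total

    def read(self, size=-1):
        data = self._fileobj.read(size)
        # Same argument order as Progress.update: total, size, name.
        self._callback(self._total, len(data), self.name)
        return data

    def seek(self, *args):
        return self._fileobj.seek(*args)

    def tell(self):
        return self._fileobj.tell()


def show(total, size, name):
    print("%s: read %d bytes of %d" % (name, size, total))


# io.BytesIO stands in for the uploaded file object from request.FILES.
uploaded = io.BytesIO(b"x" * 10000)
reader = ProgressReader(uploaded, show, name="demo.bin")
while reader.read(4096):
    pass
```

On memory: the wrapper itself keeps only one chunk at a time, so the reporting adds little overhead. The upload is held in memory by the InMemoryUploadedFile regardless of how it is read.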

国粹 2024-11-23 11:09:09

It turns out that the poster library has a callback hook in multipart_encode that can be used to get the progress out (upload or download). Good stuff...

Although I suppose I have technically answered this question, I'm sure there are other ways to skin this cat, so I'll post more if I find other methods or details.

Here's the code:

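import urllib2

from poster.encode import multipart_encode
from poster.streaminghttp import register_openers

def prog_callback(param, current, total):
    # Called by poster while the body is generated: bytes so far, bytes in total.
    pct = 100 - ((total - current) * 100) / total
    print "Progress: %s " % pct

# The rest is an excerpt from inside the upload function (hence the bare return).
# poster uploads a value as a file only if it is file-like (has read()),
# so filename presumably holds an open file or the uploaded file object here.
datagen, headers = multipart_encode({
    "post": "1",
    "skin": "xmlhttprequest",
    "userlogin": "%s" % username,
    "password": "%s" % password,
    "item_type": "file",
    "title": "%s" % title.encode("utf-8"),
    "description": "%s" % description.encode("utf-8"),
    "file": filename,
}, cb=prog_callback)

opener = register_openers()

req = urllib2.Request(UPLOAD_URL, datagen, headers)
response = urllib2.urlopen(req)
return response.read()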
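To get the percentage from the callback to the frontend, as the question asks, the callback could write to a cache that a polling view reads. This is only a sketch under assumptions: make_cache_callback, the key "upload:42" and the DictCache stand-in are hypothetical names. The only thing assumed about the real cache is a set(key, value, timeout) method, which is the shape of Django's cache.set. The callback keeps the (param, current, total) signature used above.

```python
def make_cache_callback(cache, key, timeout=3600):
    """Build a (param, current, total) callback that stores a percentage."""
    state = {"last": None}

    def callback(param, current, total):
        pct = int(current * 100 // total) if total else 100
        # Write only when the whole-number percentage changes, to limit writes.
        if pct != state["last"]:
            state["last"] = pct
            cache.set(key, pct, timeout)

    return callback


class DictCache(object):
    """Tiny in-memory stand-in for a cache backend, for demonstration only."""

    def __init__(self):
        self.data = {}
        self.writes = 0

    def set(self, key, value, timeout=None):
        self.data[key] = value
        self.writes += 1

    def get(self, key, default=None):
        return self.data.get(key, default)


cache = DictCache()
cb = make_cache_callback(cache, "upload:42")
for current in (0, 250, 255, 500, 1000):
    cb(None, current, 1000)
print("stored percentage: %s" % cache.get("upload:42"))
```

With a real cache in place of DictCache, the callback would be passed as cb=make_cache_callback(cache, key) to multipart_encode, and the view polled by the frontend would return cache.get(key).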
