Parallelizing tasks using django celery
I want to speed up a Django view that has to make several web service calls (in my case to the Facebook Graph API). These calls take quite some time: rendering the view takes around 15-16 seconds, most of which is spent fetching data from Facebook. So instead of
    def view(request):
        if new_user:
            # The Facebook Graph API calls are the bottleneck
            profile = facebook.graph_api_call('profile information')
            friends = facebook.graph_api_call('get some more information like friends')
            # some lengthy processing of friends
            # ...
            facebook.graph_api_call('yet some more information')
            # Finally pass the profile to the template to render user details
            return render_to_response('mytemplate', {'profile': profile})
I thought of doing:
    @task
    def get_profile():
        # Runs on a Celery worker instead of in the request thread
        profile = facebook.graph_api_call('profile information')
        return profile

    def view(request):
        if new_user:
            # Queue the profile fetch so it can run while the view keeps working
            profile_task = get_profile.delay()
            friends = facebook.graph_api_call('get some more information like friends')
            # some really lengthy processing of friends
            # ...
            facebook.graph_api_call('yet some more information')
            # Wait for the profile task to complete and keep its result
            profile = profile_task.get()
            # Finally pass the profile to the template to render user details
            return render_to_response('mytemplate', {'profile': profile})
This way my processing of the friends data can proceed without waiting for the profile information to be retrieved. But if the Celery workers are busy, the task may not fetch the profile data immediately, so rendering the view could take even more time than with the previous approach.
A third approach would be to do everything as in approach 2, but cancel the task if it has not been started yet and make a regular function call to get the profile information instead.
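A minimal sketch of the third approach, assuming an in-process thread pool from Python's standard `concurrent.futures` module in place of Celery, so this is an analogy rather than Celery code. `graph_api_call` is a hypothetical stand-in for `facebook.graph_api_call`, and `result_or_inline` and `load_profile` are names made up for illustration. The idea relies on `Future.cancel()` returning `True` only when the job has not started running:

```python
from concurrent.futures import ThreadPoolExecutor


def graph_api_call(query):
    # Hypothetical stand-in for facebook.graph_api_call; returns canned data.
    return {"query": query}


def result_or_inline(future, func, *args):
    """Use the background result if the job already started, else run inline."""
    if future.cancel():
        # cancel() succeeded, so the job never started and never will:
        # do the work here, synchronously.
        return func(*args)
    # The job is running or already finished: wait for its result.
    return future.result()


def load_profile(executor):
    # Start the profile fetch in the background.
    profile_future = executor.submit(graph_api_call, "profile information")
    friends = graph_api_call("get some more information like friends")
    # ... lengthy processing of friends would happen here ...
    profile = result_or_inline(
        profile_future, graph_api_call, "profile information"
    )
    return profile, friends
```

With Celery itself the same decision would need to know whether a worker has already picked up the task, which depends on how task state tracking is configured, so the sketch above only shows the control flow.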
In case someone suggests getting the profile and friend information with a Facebook batch request: that is not possible in my case, because the code above is just a snippet. I am actually fetching the profile in a middleware when a user first visits my app.
I am not sure which of the three approaches above is better. Please also suggest any other way to parallelize the web requests.
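For comparison, one other way to run blocking web requests in parallel without a task queue is a plain thread pool. This is only a sketch under assumptions: `graph_api_call` is again a hypothetical stand-in for `facebook.graph_api_call`, and `fetch_all` is a name made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def graph_api_call(query):
    # Hypothetical stand-in for facebook.graph_api_call;
    # a real call would block on network I/O.
    return {"query": query}


def fetch_all(queries, max_workers=4):
    """Run one blocking call per query in parallel, keyed by query."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map preserves input order, so zip pairs each
        # query with its own result.
        return dict(zip(queries, executor.map(graph_api_call, queries)))
```

A view could then call `fetch_all(['profile information', 'get some more information like friends'])` and read each result by its query string. Because the threads spend their time waiting on the network, the total wait is roughly that of the slowest call rather than the sum of all calls.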
Comments (1)
I'm the author of Django Facebook, and I ran into the same problem. If you want to request the URLs in parallel (and therefore faster), I think the best solution is to use eventlet. We also use Celery, but mainly for running things in the background.