Multiple producers, single consumer using python/mod_wsgi
I have a Pylons web application served by Apache (mod_wsgi, prefork). Because of Apache, there are multiple separate processes running my application code concurrently. Some of the non-critical tasks the application does I want to defer for processing in the background, to improve "live" response times. So I'm thinking of a task queue: many Apache processes add tasks to the queue, and a single separate Python process processes them one by one, removing each task from the queue.
The queue should preferably be persisted to disk, so that queued, unprocessed tasks are not lost in a power outage, server restart, etc. The question is: what would be a reasonable way to implement such a queue?
As for the things I've tried: I started with a simple SQLite database and a single table in it for storing queue items. In load testing, as I increased the level of concurrency, I started getting "database is locked" errors, as expected. The quick-and-dirty fix was to replace SQLite with MySQL; it handles the concurrency well, but feels like overkill for the simple thing I need to do. Queue-related DB operations also show up prominently in my profiling reports.
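For concreteness, here is a minimal sketch of the kind of single-table SQLite queue described above (the database file, table, and column names are hypothetical). Each Apache worker would call enqueue(), and the separate consumer process would call dequeue(); concurrent writers are what trigger the lock errors under load.

    import sqlite3

    # Hypothetical single-table queue, roughly the setup described above.
    conn = sqlite3.connect("queue.db", timeout=30)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS tasks (
            id      INTEGER PRIMARY KEY AUTOINCREMENT,
            payload TEXT NOT NULL
        )
    """)
    conn.commit()

    def enqueue(payload):
        # Called from each Apache worker process; many concurrent
        # writers are what produce "database is locked" errors.
        with conn:
            conn.execute("INSERT INTO tasks (payload) VALUES (?)", (payload,))

    def dequeue():
        # Called by the single consumer process; returns None when the
        # queue is empty, otherwise removes and returns the oldest item.
        with conn:
            row = conn.execute(
                "SELECT id, payload FROM tasks ORDER BY id LIMIT 1"
            ).fetchone()
            if row is None:
                return None
            conn.execute("DELETE FROM tasks WHERE id = ?", (row[0],))
            return row[1]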
2 Answers
A message broker like Apache's ActiveMQ is an ideal solution here.
The pipeline could be as follows: the web-application processes publish task messages to an ActiveMQ queue, and a separate consumer application subscribes to that queue and handles the tasks one by one.
The queue-persistence requirement is met out of the box, since ActiveMQ stores not-yet-consumed messages in persistent storage. It also scales quite well, because the HTTP apps, the consumer apps, and ActiveMQ itself can each be deployed on different machines.
We use something like this in a project of ours written in Python, using STOMP as the underlying communication protocol.
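As a rough illustration (not the answerer's actual code), a producer/consumer pair over STOMP might look like the sketch below. It assumes the third-party stomp.py client and a local ActiveMQ broker accepting STOMP connections on port 61613; the queue name, credentials, and handler are made up, and the listener callback signature varies between stomp.py versions.

    import json
    import time

    import stomp  # third-party STOMP client (assumed)

    QUEUE = "/queue/deferred-tasks"  # hypothetical queue name


    def enqueue_task(task):
        # Producer side: called from the web-application processes.
        conn = stomp.Connection([("localhost", 61613)])
        conn.connect("admin", "admin", wait=True)
        conn.send(destination=QUEUE, body=json.dumps(task))
        conn.disconnect()


    class TaskListener(stomp.ConnectionListener):
        # Consumer side: the broker pushes messages here one by one.
        def on_message(self, frame):  # older stomp.py passes (headers, message)
            task = json.loads(frame.body)
            print("processing", task)  # the actual background work goes here


    if __name__ == "__main__":
        consumer = stomp.Connection([("localhost", 61613)])
        consumer.set_listener("", TaskListener())
        consumer.connect("admin", "admin", wait=True)
        consumer.subscribe(destination=QUEUE, id=1, ack="auto")
        while True:          # keep the consumer process alive
            time.sleep(1)

Because the broker persists unconsumed messages, tasks enqueued this way survive a consumer crash or restart.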
A web server (any web server) is a multi-producer, single-consumer process.
A simple solution is to build a wsgiref or Werkzeug backend server to handle your backend requests.
Since this "backend" server is build using WSGI technology, it's very, very similar to the front-end web server. Except. It doesn't produce HTML responses (JSON is usually simpler). Other than that, it's very straightforward.
You design RESTful transactions for this backend. You use all of the various WSGI features for URI parsing, authorization, authentication, etc. You -- generally -- don't need session management, since RESTful servers don't usually offer sessions.
If you get into serious scalability issues, you simply wrap your backend server in lighttpd or some other web engine to create a multi-threaded backend.
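As a rough sketch of that idea (with a hypothetical port, path, and payload format), a minimal wsgiref backend could look like this. Because wsgiref's simple_server handles requests one at a time, the backend naturally behaves as the single consumer.

    import json
    from wsgiref.simple_server import make_server


    def handle_task(task):
        # The actual non-critical work goes here.
        print("processing", task)


    def application(environ, start_response):
        # A single RESTful endpoint: POST /tasks with a JSON body.
        if environ["REQUEST_METHOD"] == "POST" and environ["PATH_INFO"] == "/tasks":
            length = int(environ.get("CONTENT_LENGTH") or 0)
            task = json.loads(environ["wsgi.input"].read(length))
            handle_task(task)
            body = json.dumps({"status": "done"}).encode("utf-8")
            start_response("200 OK", [("Content-Type", "application/json")])
            return [body]
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]


    if __name__ == "__main__":
        # Requests are served serially, so tasks are processed one by one.
        make_server("127.0.0.1", 8001, application).serve_forever()

Each Apache process would then POST its deferred work as JSON to http://127.0.0.1:8001/tasks (for example with urllib.request) instead of doing the work inline.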