Lightweight notification technique
I need to develop a real-time recent activity feed in Django (with AJAX long-polling), and I'm wondering what the best strategy is for the server side.
Pseudocode:
    def recent_activity_post_save():
        notify_view()

    [in the view]
    while not new_activity():
        sleep(1)
    return HttpResponse(new_activity())
The first thing that comes to mind is querying the DB every second. Not feasible. Other options:
- using the cache as a notification service
- using a specialized tool, like Celery (I'd rather not do it, because it seems like overkill)
What's the best way to go here?
Comments (8)
I would suggest keeping it simple...
Create a database table to store your events, insert into that table when appropriate, then just implement a simple ajax polling technique to hit the server every x seconds on the client side.
I have concerns about the other solutions that propose a push-notification approach or a NoSQL data store. That is a whole lot more complicated than a traditional pull-notification system built with the tools included in the Django framework and, with very rare exceptions, is overkill. Unless you specifically require a strict real-time solution, keep it simple and use the tools that already exist in the framework. To people with objections based on database or network performance, all I have to say is that premature optimization is the root of all evil.
Build a model that contains recent activity data specific to your application; then, whenever your application does something that should log new activity, just insert a row into this table.
Your view would simply be like any other view, pulling the top x rows from this RecentActivity table (optionally based on query parameters and whatever). Then, on the client side, you'd just have a simple ajax poller hitting your view every x seconds. There is no shortage of complicated plugins and technologies you can use, but writing your own isn't that complicated either.
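As a rough illustration of the table-plus-polling idea, here is a minimal sketch that uses Python's built-in sqlite3 module instead of a Django model, so it runs on its own. The table name `recent_activity`, the helper names, and the `since_id` parameter are all assumptions made for this example; in a Django project the same logic would live in a model and an ordinary view.

```python
import sqlite3


def create_store(path=":memory:"):
    """Open a database and create the (hypothetical) recent_activity table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS recent_activity ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "message TEXT NOT NULL)"
    )
    return conn


def log_activity(conn, message):
    """Insert one row; call this wherever the app does something noteworthy."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO recent_activity (message) VALUES (?)", (message,)
        )
    return cur.lastrowid


def poll_activity(conn, since_id=0, limit=10):
    """What the polled view would return: up to `limit` rows newer than `since_id`."""
    rows = conn.execute(
        "SELECT id, message FROM recent_activity "
        "WHERE id > ? ORDER BY id LIMIT ?",
        (since_id, limit),
    ).fetchall()
    return [{"id": row[0], "message": row[1]} for row in rows]
```

The client would remember the highest `id` it has seen and send it back as `since_id` on the next poll, so each request is a cheap indexed lookup that usually returns nothing.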
My opinion is that performance issues aren't really issues until your site is successful enough for them to be an issue. A traditional relational database can scale up fairly well until you start reaching the level of success like Twitter, Google, etc. Most of us aren't at that level :)
Have you considered using Signals? You could send a signal in recent_activity_post_save() and there could be a listener which stores the information in cache.
The view would just refer to the cache to see if there are new notifications. Of course you don't need Signals, but IMHO it would be a bit cleaner that way, as you could add more "notification handlers".
This seems optimal because you don't need to poll the DB (artificial load), and the notifications are "visible" almost immediately (only after the time required to process signals and interact with the cache).
So the pseudocode would look like this:
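A minimal, framework-free sketch of that flow follows. The `Signal` class here is a tiny stand-in written for this example (it is not Django's signal class), the `cache` dict stands in for a shared cache backend, and the version counter and function names are assumptions added for illustration.

```python
import itertools


class Signal:
    """Tiny stand-in for a signal dispatcher (hypothetical, not Django's class)."""

    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        self._receivers.append(receiver)

    def send(self, **kwargs):
        for receiver in self._receivers:
            receiver(**kwargs)


cache = {}  # stand-in for a shared cache; a plain dict only works in one process
_version = itertools.count(1)
new_activity = Signal()


def store_in_cache(activity, **kwargs):
    """Listener: record the latest activity and bump a version counter."""
    cache["latest_activity"] = {"version": next(_version), "activity": activity}


new_activity.connect(store_in_cache)  # more "notification handlers" can be added


def recent_activity_post_save(activity):
    """Called after an activity is saved; it only fires the signal."""
    new_activity.send(activity=activity)


def check_for_updates(last_seen_version=0):
    """What the view would do: consult the cache only, never the database."""
    entry = cache.get("latest_activity")
    if entry is None or entry["version"] <= last_seen_version:
        return None
    return entry
```

With several worker processes, the dict would have to be replaced by a cache that all processes share, otherwise a view running in one process would not see notifications stored by another.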
You could use a comet solution, like the Ape project. This kind of project is designed to send real-time data to the browser, and can make use of the WebSocket support in modern browsers.
You could use a trigger (fired whenever a new post is made). This trigger could write, for example, a new file in a polling directory with the necessary data in it (say the primary key). Your Python code could then just watch that folder for new files without having to touch the database until one appears.
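The Python side of that idea could be sketched roughly as below. The `.new` file suffix, the helper names, and the choice to delete each marker after reading it are assumptions made for this example; `write_marker` merely simulates what the trigger would do.

```python
import os
import tempfile


def make_polling_dir():
    """Create a scratch directory to act as the polling directory."""
    return tempfile.mkdtemp(prefix="activity-")


def write_marker(directory, primary_key):
    """Simulate the trigger: drop a file carrying the new row's primary key."""
    path = os.path.join(directory, f"{primary_key}.new")
    with open(path, "w") as handle:
        handle.write(str(primary_key))
    return path


def collect_new_keys(directory):
    """Read and delete marker files; return the primary keys found, sorted."""
    keys = []
    for name in os.listdir(directory):
        if not name.endswith(".new"):
            continue  # ignore anything that is not a marker file
        path = os.path.join(directory, name)
        with open(path) as handle:
            keys.append(int(handle.read()))
        os.remove(path)  # consume the marker so it is reported only once
    return sorted(keys)
```

The view would call `collect_new_keys` in its wait loop and query the database only when the returned list is non-empty. Deleting markers means a single consumer; several concurrent pollers would need a different bookkeeping scheme.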
If you're after a comet solution then you could use orbited. Let me warn you though that because it's a rather niche solution, it's very hard to find good documentation on how to deploy and use orbited in production environments.
Here's a similar discussion, answering from the server-side perspective: Making moves w/ websockets and python / django ( / twisted? ), the most important answer being this one.
There's also this answer, pointing to a very solid-looking alternative to attempting this from Django.
If you really want this served from your existing Django application, don't do this server side. Holding that HTTP socket hostage to a single browser's connection is a fast way to break your application. Two reasonable alternatives are: explore the various web socket options (like the one above that uses Pyramid to host the service), or look at having the browser send a polling request periodically to the server looking for updates.
You should decide whether you would rather go with a "pull" or a "push" architecture for delivering your messages; see this post on Quora! If you go for a solution that "pushes" the notifications to their receivers, caching/NoSQL-based systems are preferable, as they cope better with the load of many write operations.
Redis, for instance, with its sorted set and list data structures, offers you a lot of options. See e.g. this post (though it's not Python) to get an idea. You could also look into "real" message queues like RabbitMQ.
For the client connection, the other posts here should already have given you some ideas on how to use Twisted and similar frameworks.
And Celery can always be a good tool: you could, e.g., have all the writing to the users' activity streams done in an asynchronous job!
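To make the "push" model and the asynchronous write concrete, here is a small sketch using only the standard library. A `queue.Queue` with one background thread stands in for Celery, and in-memory deques stand in for Redis lists; the `followers` graph, the stream size of 50, and all function names are assumptions for illustration only.

```python
import queue
import threading
from collections import defaultdict, deque

followers = {"alice": ["bob", "carol"]}          # hypothetical follow graph
streams = defaultdict(lambda: deque(maxlen=50))  # per-user activity streams
jobs = queue.Queue()


def fan_out(actor, message):
    """Push model: copy the activity into every follower's stream at write time."""
    for user in followers.get(actor, []):
        streams[user].appendleft(f"{actor}: {message}")


def worker():
    """Background worker, standing in for an asynchronous task runner."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: stop the worker
            jobs.task_done()
            break
        fan_out(*job)
        jobs.task_done()


def publish(actor, message):
    """Called from the request path; it only enqueues and returns immediately."""
    jobs.put((actor, message))


thread = threading.Thread(target=worker, daemon=True)
thread.start()
```

Reading a user's feed is then a cheap lookup of `streams[user]`, at the cost of doing one write per follower whenever something happens, which is the trade-off between push and pull.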
I don't see a need to limit yourself to long-polling if that is not really necessary. There are libraries written to take advantage of the best option available (be that short-polling, long-polling, WebSockets or even a tiny Flash plugin if none of the previous options is available). Node.js has one of the best libraries out there for such a job, called Socket.IO, but luckily there are also two Python implementations available, gevent-socketio and tornadio. The latter is built on top of the Tornado framework, so it is possibly out of the question.
If that suits you, you can combine them with a NoSQL (document) database, which can be faster and more lightweight than a relational database for this kind of workload. There are many, many options, including CouchDB, MongoDB, Redis, ... In my experience, the combination of Socket.IO and a document-based DB has been fast, lightweight and reliable.
Although I've seen in the comments that you've already considered NoSQL, my personal opinion is that, if you need a fast and easy solution and the options above suit you, this is your best bet.