WebSocket server with Twisted and Python doing complex work in the background
I want to write a server that handles WebSocket clients while running MySQL selects via SQLAlchemy and scraping several websites at the same time (Scrapy). The received data has to be processed, saved to the database, and then sent to the WebSocket clients.
My question is how this can be done in Python from a logical point of view. How should I structure the code, and which modules are the best fit for this job? At the moment I'm leaning towards Twisted with threads in which the scraping and select work runs. But can this be done in an easier way? I can only find simple Twisted examples, and this is obviously a more complex job. Are there similar examples? How do I start?
Comments (2)
Cyclone, a Twisted-based "network toolkit" modeled on Facebook/FriendFeed's Tornado server, contains support for WebSockets: https://github.com/fiorix/cyclone/blob/master/cyclone/web.py#L908
Here's example code:
Here's an example of using txwebsocket:
You may run into problems using SQLAlchemy with Twisted; from what I have read, they do not work well together (source), mainly because SQLAlchemy's calls block and would stall the reactor unless they run in a thread. Are you married to SQLAlchemy, or would another, more compatible ORM suffice?
Some Twisted-friendly ORMs include Storm (a fork) and Twistar, and you can always fall back on Twisted's core database abstraction library, twisted.enterprise.adbapi.
There are also async-friendly database libraries for other products, such as txMySQL, txMongo, txRedis, and paisley (CouchDB).
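As a rough illustration of the twisted.enterprise.adbapi fallback, here is a minimal sketch. It assumes SQLite (from the standard library) as a stand-in for MySQL so that it runs without a database server; the table, the sample rows, and the helper names make_demo_db, average_price, and main are made up for the example.

```python
import os
import sqlite3
import tempfile


def make_demo_db(path):
    """Create a tiny SQLite file standing in for the MySQL database."""
    conn = sqlite3.connect(path)
    with conn:  # commits on success
        conn.execute("CREATE TABLE prices (site TEXT, price REAL)")
        conn.executemany(
            "INSERT INTO prices VALUES (?, ?)",
            [("a.example", 10.0), ("b.example", 14.0)],
        )
    conn.close()


def average_price(rows):
    """The 'calculation' step: rows are (site, price) tuples."""
    if not rows:
        return None
    return sum(price for _site, price in rows) / len(rows)


def main():
    # Imported here because importing the reactor installs it as a side effect.
    from twisted.enterprise import adbapi
    from twisted.internet import reactor

    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    make_demo_db(path)

    # adbapi runs the blocking DB-API calls in a thread pool and
    # hands the result back to the reactor thread as a Deferred.
    pool = adbapi.ConnectionPool("sqlite3", path, check_same_thread=False)

    def shutdown(_result):
        pool.close()
        reactor.stop()

    d = pool.runQuery("SELECT site, price FROM prices")
    d.addCallback(average_price)
    d.addCallback(print)
    d.addErrback(lambda failure: failure.printTraceback())
    d.addBoth(shutdown)
    reactor.run()
```

Calling main() starts the reactor, runs the query off the reactor thread, prints the computed value, and stops. For MySQL, the first argument to ConnectionPool would be the name of whichever DB-API module you have installed, with its own connection arguments.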
You could conceivably run both Cyclone (or txwebsockets) and Scrapy as child services of the same MultiService, listening on different ports but packaged within the same Application instance. The services can communicate either through the parent service or through some RPC mechanism (such as JSON-RPC, Perspective Broker, AMP, or XML-RPC), or you can simply write to the database from the Scrapy service and read from it in the WebSocket service. Redis would be great for this, in my opinion.
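The MultiService layout could be sketched roughly as follows. This is only an outline under stated assumptions: Hub is a hypothetical in-process fan-out object playing the role of "communicating through the parent", PushProtocol is a plain TCP stand-in for a real WebSocket handler, and a TimerService that publishes a fixed string stands in for the Scrapy side. The port, interval, and all names are invented for the example.

```python
class Hub:
    """Hypothetical fan-out point shared by the child services."""

    def __init__(self):
        self._subscribers = set()

    def subscribe(self, callback):
        self._subscribers.add(callback)

    def unsubscribe(self, callback):
        self._subscribers.discard(callback)

    def publish(self, message):
        # Iterate over a copy so a subscriber may unsubscribe while we loop.
        for callback in list(self._subscribers):
            callback(message)


def build_application(hub, port=9000, interval=30.0):
    # Twisted imports are local so Hub can be used and tested on its own.
    from twisted.application import internet, service
    from twisted.internet import protocol

    class PushProtocol(protocol.Protocol):
        """Stand-in for a WebSocket handler: forwards hub messages."""

        def connectionMade(self):
            hub.subscribe(self._send)

        def connectionLost(self, reason):
            hub.unsubscribe(self._send)

        def _send(self, message):
            self.transport.write(message.encode("utf-8") + b"\n")

    parent = service.MultiService()

    # Child 1: the client-facing server.
    factory = protocol.Factory.forProtocol(PushProtocol)
    internet.TCPServer(port, factory).setServiceParent(parent)

    # Child 2: placeholder for the scraper, publishing on a timer.
    timer = internet.TimerService(interval, hub.publish, "scrape finished")
    timer.setServiceParent(parent)

    application = service.Application("scrape-and-push")
    parent.setServiceParent(application)
    return application
```

To try it, save the code in a .tac file, add a module-level line `application = build_application(Hub())`, and start it with `twistd -ny yourfile.tac`. Every connected client then receives a line each time the timer fires.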
Ideally you'll want to avoid writing your own WebSockets server, but since you're running Twisted, you might not be able to do that: there are several WebSockets implementations (see this search on PyPI). Unfortunately, none of them are Twisted-based. [Edit: see @JP-Calderone's comment below.]
Twisted should drive the master server, so you probably want to begin by writing something that can be run via twistd (see here if you're new to this). The WebSocket implementation mentioned by @JP-Calderone and Scrapy are both Twisted-based, so they should be reasonably trivial to drive from your master Twisted-based server. SQLAlchemy will be more difficult; I've commented on this before in this question.
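One common way to keep a blocking library such as SQLAlchemy out of the reactor thread is Twisted's deferToThread. The sketch below shows only the pattern: blocking_select is a hypothetical placeholder that sleeps and returns a fixed row instead of running a real SQLAlchemy query, and compute stands in for whatever calculation you need.

```python
import time


def blocking_select(user_id):
    """Placeholder for a blocking SQLAlchemy query (hypothetical)."""
    time.sleep(0.1)  # simulate database latency
    return {"user_id": user_id, "score": 41}


def compute(row):
    """The calculation step; runs back in the reactor thread."""
    return dict(row, score=row["score"] + 1)


def main():
    # Imported here because importing the reactor installs it as a side effect.
    from twisted.internet import reactor
    from twisted.internet.threads import deferToThread

    def start():
        # The blocking call runs in the reactor's thread pool;
        # the callbacks fire in the reactor thread once it returns.
        d = deferToThread(blocking_select, 7)
        d.addCallback(compute)
        d.addCallback(print)
        d.addErrback(lambda failure: failure.printTraceback())
        d.addBoth(lambda _result: reactor.stop())

    reactor.callWhenRunning(start)
    reactor.run()
```

Calling main() runs the placeholder query in a worker thread and prints the computed row. In a real server the final callback would send the result to the connected WebSocket clients instead of printing it.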