A good multithreaded Python web server?
I am looking for a Python web server that is multithreaded rather than multi-process (as is the case with mod_python for Apache). I want it to be multithreaded because I want an in-memory object cache that is shared by the various HTTP threads. My web server does a lot of expensive work and computes some large arrays, which need to be cached in memory for future use to avoid recomputing them. This is not possible in a multi-process web server environment. Storing this information in memcache is also not a good idea: the arrays are large, so every read from memcache would incur deserialization on top of the additional IPC overhead.
I implemented a simple web server using BaseHTTPServer. It gives good performance, but it gets stuck after a few hours. I need a more mature web server. Is it possible to configure Apache to use mod_python under a threaded model so that I can do some object caching?
Comments (11)
CherryPy. Features, as listed from the website:
Consider reconsidering your design. Maintaining that much state in your webserver is probably a bad idea. Multi-process is a much better way to go for stability.
Is there another way to share state between separate processes? What about a service? Database? Index?
It seems unlikely that maintaining a huge array of data in memory and relying on a single multi-threaded process to serve all your requests is the best design or architecture for your app.
Twisted can serve as such a web server. While not multithreaded itself, there is a (not yet released) multithreaded WSGI container present in the current trunk. You can check out the SVN repository and then run:
It's hard to give a definitive answer without knowing what kind of site you are working on and what kind of load you are expecting. Sub-second performance may be a serious requirement or it may not. If you really need to save that last millisecond, then you absolutely need to keep your arrays in memory. However, as others have suggested, it is more than likely that you don't, and you could get by with something else.
Your usage pattern for the data in the array may affect the choices you make. You probably don't need access to the entire data set all at once, so you could break your data up into smaller chunks and put those chunks in the cache instead of one big lump.
Depending on how often your array data needs to be updated, you might choose between memcached, a local DB (Berkeley DB, SQLite, a small MySQL installation, etc.) or a remote DB. I'd say memcached for fairly frequent updates, a local DB for something updated about hourly, and a remote DB for something updated about daily.
One more thing to consider is what happens after a cache miss. If 50 clients all of a sudden get a cache miss and all of them decide at the same time to start regenerating those expensive arrays, your box(es) will quickly be reduced to 8086's. So you have to take into consideration how you will handle that. Many articles out there cover how to recover from cache misses. Hope this is helpful.
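One common way to handle the cache-miss scenario described above, within a single multithreaded process, is to let only one thread regenerate a missing entry while the others wait for its result. The following is a minimal sketch of that idea, not something taken from the answer itself: the class name `SingleFlightCache` and the `compute` callback are hypothetical, and it assumes entries never expire.

```python
import threading


class SingleFlightCache:
    """In-memory cache that lets only one thread rebuild a missing key.

    Hypothetical helper for illustration; entries never expire.
    """

    def __init__(self, compute):
        self._compute = compute         # assumed expensive function: key -> value
        self._values = {}               # key -> cached value
        self._key_locks = {}            # key -> lock held while rebuilding
        self._guard = threading.Lock()  # protects the two dicts above

    def get(self, key):
        with self._guard:
            if key in self._values:
                return self._values[key]
            # Every thread that misses on this key shares the same lock.
            lock = self._key_locks.setdefault(key, threading.Lock())

        # Only one thread per key runs the body below at a time.
        with lock:
            with self._guard:
                if key in self._values:  # filled in while we were waiting
                    return self._values[key]
            value = self._compute(key)   # runs outside the global guard
            with self._guard:
                self._values[key] = value
                self._key_locks.pop(key, None)
            return value
```

With this shape, 50 simultaneous misses on the same key trigger a single computation, while misses on different keys can still be computed in parallel. In a multi-process deployment the same idea needs a lock that is shared across processes, which this sketch does not cover.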
Not multithreaded, but Twisted might serve your needs.
You could instead use a distributed cache that is accessible from each process, memcached being the example that springs to mind.
web.py has made me happy in the past. Consider checking it out.
But it does sound like an architectural redesign might be the proper, though more expensive, solution.
Perhaps you have a problem with your implementation in Python using BaseHTTPServer. There's no reason for it to "get stuck", and implementing a simple threaded server using BaseHTTPServer and threading shouldn't be difficult.
Also, see http://pymotw.com/2/BaseHTTPServer/index.html#module-BaseHTTPServer about implementing a simple multi-threaded server with HTTPServer and ThreadingMixIn.
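As a rough illustration of the HTTPServer plus ThreadingMixIn approach, here is a minimal sketch written for Python 3, where BaseHTTPServer has become `http.server` and the mixin lives in `socketserver`. The names `CACHE`, `expensive_compute`, `get_cached`, `CacheHandler` and `make_server` are made up for this example, and the compute function is only a stand-in for the real array calculation.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from socketserver import ThreadingMixIn

# Hypothetical process-wide cache shared by every request thread.
CACHE = {}
CACHE_LOCK = threading.Lock()


def expensive_compute(key):
    """Stand-in for the costly array computation (an assumption)."""
    return [len(key) * i for i in range(5)]


def get_cached(key):
    """Return the cached value for key, computing it on the first request."""
    # Holding the lock while computing keeps the sketch simple, but it
    # serializes all computations; a per-key lock would scale better.
    with CACHE_LOCK:
        if key not in CACHE:
            CACHE[key] = expensive_compute(key)
        return CACHE[key]


class CacheHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Use the request path as the cache key.
        body = repr(get_cached(self.path)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging in this sketch


class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    """HTTPServer that handles each request in its own thread."""
    daemon_threads = True  # do not block interpreter exit on open requests


def make_server(host="127.0.0.1", port=8000):
    return ThreadedHTTPServer((host, port), CacheHandler)


if __name__ == "__main__":
    make_server().serve_forever()
```

Because all handler threads live in one process, they see the same `CACHE` dictionary, which is the property the question is after. This is a sketch for experimentation rather than a hardened production server.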
I use CherryPy both personally and professionally, and I'm extremely happy with it. I even do the kinds of things you're describing, such as having global object caches, running other threads in the background, etc. And it integrates well with Apache; simply run CherryPy as a standalone server bound to localhost, then use Apache's mod_proxy and mod_rewrite to have Apache transparently forward your requests to CherryPy.
The CherryPy website is http://cherrypy.org/
I actually had the same issue recently. Namely: we wrote a simple server using BaseHTTPServer and found that the fact that it's not multi-threaded was a big drawback.
My solution was to port the server to Pylons (http://pylonshq.com/). The port was fairly easy and one benefit was it's very easy to create a GUI using Pylons so I was able to throw a status page on top of what's basically a daemon process.
I would summarize Pylons this way:
We also run an app with Twisted and are happy with it. Twisted has good performance, but I find Twisted's single-threaded/defer-to-thread programming model fairly complicated. It has lots of advantages, but would not be my choice for a simple app.
Good luck.
Just to point out something different from the usual suspects...
Some years ago, while I was using Zope 2.x, I read about Medusa, as it was the web server used by that platform. It was advertised as working well under heavy load, and it can provide the functionality you asked for.