What web server / module / technique should I use to serve everything from memory?
I have lots of lookup tables from which I'll generate my web responses.
I think IIS with ASP.NET would let me keep static lookup tables in memory, which I could use to serve up my responses very fast.
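Something like this minimal sketch is what I have in mind; the handler name and table contents are made up for illustration, but the pattern (a static field loaded once per worker process and shared by all requests) is the standard one:

```csharp
using System.Collections.Generic;
using System.Web;

// Hypothetical handler: the lookup table lives in a static field, so it
// is built once per worker process and shared by every request after that.
public class LookupHandler : IHttpHandler
{
    // Read-only after initialization, so concurrent reads need no locking.
    private static readonly Dictionary<string, string> Table =
        new Dictionary<string, string>
        {
            { "foo", "<response>foo</response>" },
            { "bar", "<response>bar</response>" }
        };

    public void ProcessRequest(HttpContext context)
    {
        string key = context.Request.QueryString["key"];
        string body;
        context.Response.ContentType = "text/xml";
        context.Response.Write(
            key != null && Table.TryGetValue(key, out body)
                ? body
                : "<response>not found</response>");
    }

    public bool IsReusable { get { return true; } }
}
```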
Are there, however, also non-.NET solutions that can do the same?
I've looked at FastCGI, but I think it starts X processes, each of which can handle Y requests. The processes are, by definition, shielded from each other. I could configure FastCGI to use just one process, but does that have scalability implications?
Anything using PHP or any other interpreted language won't fly, because it is also CGI- or FastCGI-bound, right?
I understand memcache could be an option, though that would require another (local) socket connection, which I'd rather avoid, since keeping everything in memory in-process would be much faster.
The solution can run under Windows or Unix; it doesn't matter much. The only thing that matters is that there will be a lot of requests (100/sec now, growing to 500/sec within a year), and I want to reduce the number of web servers needed to process them.
The current solution uses PHP and memcache (with the occasional hit to the SQL Server backend). Although it is fast (for PHP, anyway), Apache has real problems once 50 requests/sec is passed.
I've put a bounty on this question since I haven't seen enough responses to make a wise choice.
At the moment I'm considering either ASP.NET or FastCGI with C(++).
Answers (2)
It sounds like you should be using an in-memory key-value datastore like Redis. If you intend to have multiple web servers in the future, then you should definitely be using a centralized memory store. Redis is especially ideal in this scenario as it supports advanced data structures like lists, sets, and ordered sets. It's also pretty fast: it can do 110,000 SETs/second and 81,000 GETs/second on an entry-level Linux box. Check the benchmarks. If you go down that route, I have a C# Redis client that can simplify access.
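For illustration, access through a C# client could look roughly like this (a hedged sketch; the package and method names follow the ServiceStack.Redis style and are assumptions, and the key and value are invented):

```csharp
using System;
using ServiceStack.Redis; // assumed package name

class RedisLookupSketch
{
    static void Main()
    {
        // One throwaway connection for the example; a real web app
        // would reuse or pool clients instead.
        using (var redis = new RedisClient("localhost"))
        {
            // Store a precomputed response under a lookup key.
            redis.Set("lookup:foo", "<response>foo</response>");

            // On a request, fetch it straight back from memory.
            string response = redis.Get<string>("lookup:foo");
            Console.WriteLine(response);
        }
    }
}
```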
In order to use shared memory, you need an application server that is 'always running' in the same process. In those cases you can use static classes or the shared 'Application' cache. The most popular application servers are the Java servlet containers (e.g. Tomcat) and ASP.NET.
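As a concrete illustration of the second option, here is a hedged sketch of loading a table into ASP.NET's shared Application state at startup (the key name and table contents are invented):

```csharp
// Global.asax.cs -- Application_Start runs once per application lifetime.
using System;
using System.Collections.Generic;
using System.Web;

public class Global : HttpApplication
{
    protected void Application_Start(object sender, EventArgs e)
    {
        // Build the lookup table once and park it in shared
        // Application state, visible to every request.
        Application["LookupTable"] = new Dictionary<string, string>
        {
            { "foo", "<response>foo</response>" }
        };
    }
}

// Any page or handler can then read it back without touching disk:
//   var table = (Dictionary<string, string>)
//       HttpContext.Current.Application["LookupTable"];
```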
Now, moving to accessing memory rather than disk will yield significant performance savings, and if that performance is important to you, then I don't think you want to be considering an interpreted language. There is always overhead in handling a request: network I/O, protocol parsing, setting up worker threads, etc. The extra cost of an out-of-process shared memory store (on the same host) versus an in-process one is negligible compared with the overall time it takes to complete the request.
First of all, let me try to think with you on your direct questions:
- For the performance that you're aiming at, I would say that demanding shared-memory access to the lookup tables is overkill. For example, the memcache developers say this about expected performance: "On a fast machine with very high speed networking (or local access - ed.), memcached can easily handle 200,000+ requests per second."
- You're currently probably limited by CPU time, since you're generating every page dynamically. If at all possible: cache, cache, cache! Cache your front page and rebuild it just once every minute or five. For logged-in users, cache the user-specific pages they might visit again in their session. For example: where 50 requests a second is not too bad for a dynamic page, a reverse proxy such as varnish can serve thousands of static pages a second on a pretty mediocre server. My best hint would be to look into setting up a reverse proxy using varnish or squid (a sketch of the per-minute rebuild idea follows this list).
- If you still need to generate a lot of pages dynamically, use a PHP accelerator to avoid having to compile the PHP code every time a script is run. According to Wikipedia, that is a 2- to 10-fold performance increase right there.
- mod_php is the fastest way to run PHP.
- Besides using FastCGI, you could write an Apache module and keep your data in a memory space shared with the web server itself. This could be very fast. However, I've never heard of anybody doing this for performance, and it's a very inflexible solution.
- If you move towards more static content or go the FastCGI way: lighttpd is faster than Apache.
- Still not fast enough? In-kernel web servers such as TUX can be very fast.
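To make the per-minute rebuild idea from the caching point above concrete, here is a rough sketch in C# (chosen because the asker is weighing ASP.NET; GeneratePage() is a stand-in for the real page builder):

```csharp
using System;

// Serves a cached copy of an expensive page, rebuilding it at most
// once per minute. GeneratePage() stands in for the real generator.
static class FrontPageCache
{
    private static readonly object Gate = new object();
    private static string _cached;
    private static DateTime _builtAt = DateTime.MinValue;

    public static string Get()
    {
        lock (Gate)
        {
            // Rebuild only when the cached copy has gone stale.
            if (DateTime.UtcNow - _builtAt >= TimeSpan.FromMinutes(1))
            {
                _cached = GeneratePage();
                _builtAt = DateTime.UtcNow;
            }
            return _cached;
        }
    }

    private static string GeneratePage()
    {
        return "<html>...expensively generated front page...</html>";
    }
}
```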
Secondly: you are not the first to encounter this challenge, and fortunately some of the bigger fish are kind enough to share their 'tricks' with us. I guess this goes beyond the scope of your question, but it can be truly inspiring to see how these guys have solved their problems, so I decided to share the material I know of.
Look at this presentation on the Facebook architecture, and this presentation on 'building scalable web-services', which contains some notes on the Flickr design.
Moreover, Facebook lists an impressive toolset that they have developed and contributed to, and they share notes on their architecture. Some of their performance-improving tricks:
- Some performance-improving customizations of memcache, such as memcache-over-UDP.
- HipHop is a PHP-to-optimized-C++ compiler. Facebook engineers claim a 50% reduction in CPU usage.
- Implement computationally intensive services in a 'faster language', and wire everything together using Thrift.