Are there any good server-side tricks for handling more requests if I never have to send any data back?
I want to handle a lot of (> 100k/sec) POST requests from JavaScript clients with some kind of service server. Not much of this data will be stored, but I have to process all of it, so I can't devote the server's entire capacity to merely serving requests. All the processing needs to be done in the same server instance, otherwise I'd need to use a database for synchronization between servers, which would be slower by orders of magnitude.
However, I don't need to send any data back to the clients, and they don't even expect it.
So far my plan has been to create a few proxy server instances that buffer the requests and forward them to the main server in bigger packs.
For example, let's say I need to handle 200k requests/sec and each server can handle 40k. I can split the load between 5 of them. Each one then buffers requests and sends them on to the main server in packs of 100. This results in 2k requests/sec on the main server (however, each message will be 100 times bigger, which probably means around 100-200 kB). I could even send them to the main server using UDP to decrease the amount of resources needed (then I'd need only one socket on the main server, right?).
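The batching idea above can be sketched in a few lines. This is a minimal illustration, not a real proxy: `BatchingProxy` and `forward` are hypothetical names, and the forwarding step stands in for whatever single socket write would carry a pack to the main server.

```python
# Sketch of the proxy-side batching described above (hypothetical names).
# Each proxy buffers incoming request payloads and flushes them in packs of
# BATCH_SIZE, turning many small requests into a few larger messages.

BATCH_SIZE = 100

class BatchingProxy:
    def __init__(self, forward):
        # `forward` is whatever sends a pack to the main server
        # (e.g. one TCP/UDP socket write); injected here for testability.
        self.forward = forward
        self.buffer = []

    def handle_request(self, payload: bytes) -> None:
        self.buffer.append(payload)
        if len(self.buffer) >= BATCH_SIZE:
            self.flush()

    def flush(self) -> None:
        if self.buffer:
            # One big message instead of BATCH_SIZE small ones.
            self.forward(b"".join(self.buffer))
            self.buffer = []

sent_packs = []
proxy = BatchingProxy(sent_packs.append)
for _ in range(250):
    proxy.handle_request(b"x")   # 250 incoming requests...
proxy.flush()                    # ...flush the partial tail
```

With 250 one-byte requests this produces two full packs of 100 bytes and one tail pack of 50, which is exactly the 100:1 reduction in message count the plan relies on.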
I'm just wondering whether there's some other way to speed things up, especially since, as I said, I don't need to send anything back. I also have full control over the JavaScript clients, but unfortunately JavaScript is unable to send data using UDP, which would probably be the solution for me (I don't even care if 0.1% of the data is lost).
Any ideas?
Edit in response to the answers given so far:
The problem isn't the server being too slow at processing events from the queue, or at putting events into the queue itself. In fact, I plan to use the Disruptor pattern (http://code.google.com/p/disruptor/), which has been shown to process up to 6 million requests per second.
The only problem I might have is the need to keep 100, 200, or 300k sockets open at the same time, which none of the mainstream servers can handle. I know some custom solutions are possible (http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-3), but I'm wondering whether there's a way to make even better use of the fact that I don't have to reply to clients.
(For example, some way to embed part of the data in the initial TCP packet and handle TCP packets as if they were UDP. Or some other kind of magic ;))
Comments (4)
Write a single, fast function (probably in C) that receives all requests from a very fast server (like nginx). This function's only job is to store the requests in a very fast queue (like Redis, if you have enough RAM).
In another process (or on another server), pop from the queue and do the real work, processing the requests one by one.
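A minimal sketch of this accept-then-enqueue split, using `collections.deque` as an in-process stand-in for the Redis list (the function names and payloads here are made up for illustration):

```python
from collections import deque

# Stand-in for a Redis list (the frontend would LPUSH, the worker RPOP).
work = deque()

def accept_request(raw: bytes) -> None:
    # The frontend's only job: append and return immediately.
    work.append(raw)

def do_real_work(raw: bytes) -> int:
    # Placeholder for the actual processing step.
    return len(raw)

def worker_drain() -> list:
    # In reality a separate process/server; here a loop popping one by one.
    processed = []
    while work:
        processed.append(do_real_work(work.popleft()))
    return processed

accept_request(b"event-1")
accept_request(b"ev-2")
results = worker_drain()
```

The point of the split is that `accept_request` does no parsing or processing at all, so the front end can keep up with the request rate while the worker catches up at its own pace.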
If you have control of the clients, as you say, then your proxy server doesn't even need to be an HTTP server, because you can assume that all of the requests are valid.
You could implement it as a non-HTTP server that simply sends back a 200, reads the client's requests until it disconnects, and then queues them for processing.
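Here is a rough single-connection sketch of that idea, assuming the clients are trusted as the answer says. The server writes a blind 200 up front, does no HTTP parsing, reads until the client disconnects, and enqueues whatever arrived; the names (`serve_once`, `work_queue`) are invented for the example.

```python
import queue
import socket
import threading

work_queue: "queue.Queue[bytes]" = queue.Queue()

def serve_once(listener: socket.socket) -> None:
    # Accept one client: acknowledge immediately, then read until the
    # client disconnects and enqueue everything received. No HTTP parsing.
    conn, _ = listener.accept()
    with conn:
        conn.sendall(b"HTTP/1.1 200 OK\r\n\r\n")  # blind 200, sent up front
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:          # empty read == client disconnected
                break
            chunks.append(data)
        work_queue.put(b"".join(chunks))

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # ephemeral port, for the sketch only
listener.listen()
port = listener.getsockname()[1]

t = threading.Thread(target=serve_once, args=(listener,))
t.start()

# A trusted client: read the blind ack, send raw payloads, disconnect.
client = socket.create_connection(("127.0.0.1", port))
ack = client.recv(64)
client.sendall(b"payload-1payload-2")
client.close()                    # the disconnect is the "done" signal
t.join()
listener.close()
```

A production version would of course accept in a loop and multiplex many connections (epoll/kqueue or an async framework), but the protocol itself stays this simple because nothing meaningful ever goes back to the client.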
I think what you're describing is an implementation of a message queue. You'll also need something to hand these requests off to whatever queue you use (RabbitMQ is quite good; there are many alternatives).
You'll also need something else running that can do whatever processing you actually want on the requests. You haven't made that very clear, so I'm not too sure exactly what would be right for you. Essentially, the idea is that incoming requests are dumped into the queue as quickly and simply as possible by your web server, which is then free to go back to serving more requests. When the system has spare resources, it uses them to work through the queue; when it's busy, the queue just keeps growing.
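The dump-and-return pattern can be sketched with an in-process `queue.Queue` standing in for the broker (RabbitMQ or similar in the real setup); `web_handler` and the worker below are hypothetical names:

```python
import queue
import threading

# In-process stand-in for the broker (RabbitMQ etc. in a real deployment).
incoming: "queue.Queue" = queue.Queue()

def web_handler(body: bytes) -> str:
    # The web server's whole job: dump the POST body and return at once.
    incoming.put(body)
    return "202 Accepted"        # nothing meaningful goes back to the client

processed = []

def worker() -> None:
    # Drains the queue whenever the system has spare resources; while the
    # web tier is busy, the queue simply grows.
    while True:
        item = incoming.get()
        if item is None:         # shutdown sentinel, for the sketch only
            break
        processed.append(item.upper())   # placeholder for real processing

t = threading.Thread(target=worker)
t.start()
for i in range(5):
    web_handler(b"event-%d" % i)
incoming.put(None)
t.join()
```

The key property is that `web_handler` returns as soon as the enqueue completes, so the request-accepting path never waits on the processing path.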
Not sure what platform you're on, but you might want to look at something like Lighttpd for serving the POSTs. You might (if same-domain restrictions don't shoot you down) get away with running Lighttpd on a subdomain of your application (e.g. post.myapp.com). Failing that, you could put a proper load balancer in front of your web servers (so all requests go to www.myapp.com and the load balancer decides whether to forward them to the web server or the queue processor).
Hope that helps
Consider using MongoDB for persisting your requests; its fire-and-forget write mode can help your servers respond faster.