TNonblockingServer, TThreadedServer, and TThreadPoolServer: which one fits my case best?
Our analytic server is written in C++. It basically queries the underlying storage engine and returns fairly big structured data via Thrift. A typical request takes about 0.05 to 0.6 seconds to finish, depending on the request size.
I noticed that there are a few options in terms of which Thrift server we can use in the C++ code, specifically TNonblockingServer, TThreadedServer, and TThreadPoolServer. It seems like TNonblockingServer is the way to go, since it can support many more concurrent requests while still using a thread pool behind the scenes to crunch through the tasks. It also avoids the cost of constructing/destructing threads.
Facebook's update on thrift: http://www.facebook.com/note.php?note_id=16787213919
Here at Facebook, we're working on a fully asynchronous client and server for C++. This server uses event-driven I/O like the current TNonblockingServer, but its interface to the application code is all based on asynchronous callbacks. This will allow us to write servers that can service thousands of simultaneous requests (each of which requires making calls to other Thrift or Memcache servers) with only a few threads.
Related post on Stack Overflow: Large number of simultaneous connections in Thrift
That being said, you won't necessarily be able to actually do work faster (handlers still execute in a thread pool), but more clients will be able to connect to you at once.
Just wondering, are there any other factors I'm missing here? How should I decide which one fits my needs best?
Requests that take 50-600 milliseconds to complete are pretty long. The time it takes to create or destroy a thread is much less than that, so don't let that factor into your decision at this time. I would choose the one that is easiest to support and that is the least error-prone. You want to minimize the likelihood of subtle concurrency bugs.
This is why it is often easier to write single-threaded transaction handling code that blocks where it needs to, and have many of these running in parallel, than to have a more complex non-blocking model. A blocked thread may slow down an individual transaction, but it does not prevent the server from doing other work while it waits.
If your transaction load increases (i.e. more client transactions) or the requests become faster to process (approaching 1 millisecond per transaction), then transaction overhead becomes more of a factor. The metric to pay attention to is throughput: how many transactions complete per unit time. The absolute duration of a single transaction is less important than the rate at which they are being completed, at least if it stays well below one second.
Someone on GitHub has made a nice comparison:
TThreadedServer
TThreadedServer spawns a new thread for each client connection, and each thread remains alive until the client connection is closed. This means that if there are 1000 concurrent client connections, TThreadedServer needs to run 1000 threads simultaneously.
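The thread-per-connection model can be sketched in plain standard C++ (this is a toy model of the strategy, not Thrift's actual implementation; `handle_connection` stands in for the real request loop):

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Toy model of TThreadedServer's strategy: one dedicated thread per
// client connection, alive until that connection closes.
std::atomic<int> handled{0};

void handle_connection(int /*conn_id*/) {
    // ... read requests, call the handler, write responses ...
    handled.fetch_add(1);
}

int serve_thread_per_connection(int num_connections) {
    std::vector<std::thread> threads;
    for (int c = 0; c < num_connections; ++c)
        threads.emplace_back(handle_connection, c); // 1000 clients -> 1000 threads
    for (auto& t : threads)
        t.join(); // each thread lives as long as its connection
    return handled.load();
}
```

The cost is visible in the sketch: thread count tracks connection count one-for-one, so memory and scheduler load grow with idle connections, not with actual work.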
TNonblockingServer
TNonblockingServer has one thread dedicated to network I/O. The same thread can also process requests, or you can create a separate pool of worker threads for request processing. The server can handle many concurrent connections with a small number of threads, since it doesn't need to spawn a new thread per connection.
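The I/O-thread-plus-worker-pool split can also be sketched in standard C++. This is a toy model of the structure only: the "I/O thread" here just enqueues integers where the real server would decode frames off sockets, and all names are illustrative.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Toy model of TNonblockingServer's split: one I/O thread enqueues
// decoded requests; a small fixed pool of workers processes them.
class WorkerPool {
public:
    explicit WorkerPool(int n) {
        for (int i = 0; i < n; ++i)
            workers_.emplace_back([this] { run(); });
    }
    void submit(int request) {
        { std::lock_guard<std::mutex> lk(m_); q_.push(request); }
        cv_.notify_one();
    }
    int shutdown() { // drain the queue, stop workers, report requests handled
        { std::lock_guard<std::mutex> lk(m_); done_ = true; }
        cv_.notify_all();
        for (auto& w : workers_) w.join();
        return handled_.load();
    }
private:
    void run() {
        for (;;) {
            std::unique_lock<std::mutex> lk(m_);
            cv_.wait(lk, [this] { return done_ || !q_.empty(); });
            if (q_.empty()) return; // shutting down and fully drained
            q_.pop();
            lk.unlock();
            handled_.fetch_add(1); // ... run the request handler here ...
        }
    }
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<int> q_;
    bool done_ = false;
    std::atomic<int> handled_{0};
    std::vector<std::thread> workers_;
};

int serve_with_io_thread(int num_requests, int num_workers) {
    WorkerPool pool(num_workers);
    // The single "I/O thread": many connections, few worker threads.
    std::thread io([&] {
        for (int r = 0; r < num_requests; ++r) pool.submit(r);
    });
    io.join();
    return pool.shutdown();
}
```

Note how the thread count is fixed by the pool size regardless of how many requests arrive, which is exactly the property the comparison highlights.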
TThreadPoolServer (not benchmarked here)
TThreadPoolServer is similar to TThreadedServer; each client connection gets its own dedicated server thread. It's different from TThreadedServer in two ways:
The server thread goes back to the thread pool for reuse after the client closes the connection.
There is a limit on the number of threads. The thread pool won’t grow beyond the limit.
Clients hang if there are no more threads available in the thread pool. It's much more difficult to use compared to the other two servers.
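The "client hangs at the limit" behavior can be sketched with a simple bounded counter in standard C++ (again a toy model, not Thrift code; `BoundedPool` and its methods are illustrative names):

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Toy model of TThreadPoolServer's limit: a fixed number of server
// threads; an incoming client blocks until one becomes free.
class BoundedPool {
public:
    explicit BoundedPool(int limit) : free_(limit) {}
    void acquire() { // a client "hangs" here while all threads are busy
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return free_ > 0; });
        --free_;
    }
    void release() { // the thread returns to the pool on disconnect
        { std::lock_guard<std::mutex> lk(m_); ++free_; }
        cv_.notify_one();
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    int free_;
};

// With a pool of 1, a second client must wait until the first disconnects.
bool second_client_waits() {
    BoundedPool pool(1);
    pool.acquire(); // first client takes the only thread
    std::atomic<bool> served{false};
    std::thread waiter([&] { pool.acquire(); served = true; pool.release(); });
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    bool was_blocked = !served.load(); // second client is still hanging
    pool.release();                    // first client disconnects
    waiter.join();
    return was_blocked && served.load();
}
```

This is the trade-off the comparison points at: the bound protects the server from unbounded thread growth, but pushes the waiting onto clients, so the limit has to be sized for your peak concurrency.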