Serialized task distribution: use threads or epoll?
I have a set of predefined tasks for multiple clients to perform (any client can take any task). When a client connects to the server, the server chooses a task from the uncompleted ones and sends it to the client, so the work on the server side is quite light. The client takes a while to finish the task and send the result back to the server.
Since each task should be sent to only one client, the server should process requests in a serialized way. I have two plans: create a thread for each client connection and have all the threads take turns accessing the task pool, or use epoll to listen on all the connections and handle each client event as it arrives.
Which one is better for the job? Or are there any other ideas? The server will run on a multi-core machine.
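For reference, the first plan (one thread per client, all taking turns on the task pool) could look roughly like the sketch below. It is only an illustration: `TaskPool`, `serve_client`, and `run_demo` are hypothetical names, the "clients" are plain threads with no sockets involved, and a single lock is assumed to be enough to serialize access.

```python
import threading


class TaskPool:
    """Hypothetical pool: each task is handed to at most one caller."""

    def __init__(self, tasks):
        self._pending = list(tasks)
        self._lock = threading.Lock()
        self.results = {}

    def take(self):
        """Return an unassigned task, or None when none are left."""
        with self._lock:
            return self._pending.pop() if self._pending else None

    def complete(self, task, result):
        """Record the result a client sent back for a task."""
        with self._lock:
            self.results[task] = result


def serve_client(pool, client_id, log):
    """Stand-in for one per-connection thread (no real network I/O)."""
    while True:
        task = pool.take()
        if task is None:
            return
        log.append((client_id, task))
        pool.complete(task, f"done by client {client_id}")


def run_demo(num_tasks=100, num_clients=8):
    pool = TaskPool(range(num_tasks))
    log = []
    threads = [
        threading.Thread(target=serve_client, args=(pool, i, log))
        for i in range(num_clients)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pool, log
```

The lock is held only while a task is popped or a result is stored, so the threads spend almost no time waiting on each other; the cost of this plan is mostly the threads themselves.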
Comments (2)
The main question is whether the server has significant processing to do in order to prepare the tasks for the client. If not, there's nothing to be gained from using multiple threads; on the contrary, the context switching will just add overhead. In that case it would be best to use epoll (most likely via some existing library, depending on the programming language you're using).
If there is significant processing on the server side, it may offset the inefficiency of context switching and improve performance by gaining parallelism from multiple cores. The only way to know for sure which solution is best is to do some prototyping and profiling.
Are the clients running on the same machine as the server?
If they are, you'll be able to utilize the multiple cores easily.
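To make the epoll suggestion concrete, here is a minimal single-threaded sketch. It uses Python's standard `selectors` module, which picks epoll on Linux, as one example of "some existing library". The wire format (one newline-terminated task out, one short reply back), the function names, and the assumption that task names are unique strings are all made up for illustration. The threads in `run_demo` only play the role of remote clients; the server itself runs in one thread and needs no lock.

```python
import selectors
import socket
import threading


def serve_tasks(listener, tasks):
    """Give each new connection one task and collect one result per task."""
    sel = selectors.DefaultSelector()  # epoll-backed on Linux
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, data=None)
    pending = list(tasks)
    results = {}
    while len(results) < len(tasks):
        for key, _events in sel.select():
            if key.data is None:  # listening socket: a client connected
                conn, _addr = listener.accept()
                if not pending:
                    conn.close()  # every task is already assigned
                    continue
                task = pending.pop()
                conn.sendall(f"{task}\n".encode())
                sel.register(conn, selectors.EVENT_READ, data=task)
            else:  # client socket: a result (or a disconnect) arrived
                conn, task = key.fileobj, key.data
                reply = conn.recv(4096)  # assumes the reply fits one read
                sel.unregister(conn)
                conn.close()
                if reply:
                    results[task] = reply.decode().strip()
                else:
                    pending.append(task)  # client vanished: reassign later
    sel.close()
    return results


def demo_client(port):
    """Stand-in for a remote client: take a task, send back a result."""
    with socket.create_connection(("127.0.0.1", port)) as s:
        task = s.recv(4096).decode().strip()
        s.sendall(f"result-of-{task}\n".encode())


def run_demo(tasks=("a", "b", "c")):
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))  # let the OS choose a free port
    listener.listen()
    port = listener.getsockname()[1]
    clients = [threading.Thread(target=demo_client, args=(port,)) for _ in tasks]
    for c in clients:
        c.start()
    results = serve_tasks(listener, list(tasks))
    for c in clients:
        c.join()
    listener.close()
    return results
```

Because only one thread ever touches `pending`, handing each task to exactly one client is serialized for free, which is the property the question asks for.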
You can do both. You can have multiple threads running epoll() on the same fd-set, and the operating system will wake up threads as needed. It's also really easy to do, especially if you don't need any sharing: simply fork() fifty times or so, and Linux will context switch if needed, and epoll when not. When I do this, I simply do something like this:
If you do need sharing, then you're going to need locking. This can complicate things here, and these kinds of programming problems are difficult to solve in a general sense. It is my experience that, except in the case of some very simple databases, it is usually simpler to either forgo threads or re-engineer the program so that it does not require locking on the shared structures.
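The fork-then-epoll idea, combined with the advice to avoid shared structures, could be sketched as follows. This is not the answerer's code, which is missing from this copy of the thread. It assumes the task list can be split up front so that each forked worker owns a private slice, and all names here (`worker`, `run_demo`, the line-based protocol) are hypothetical. It needs a Unix-like system for `os.fork`, and `selectors` picks epoll on Linux.

```python
import os
import selectors
import socket


def worker(listener, my_tasks):
    """Runs in a forked child: private task slice, private selector, no locks."""
    sel = selectors.DefaultSelector()  # created after fork, so not shared
    sel.register(listener, selectors.EVENT_READ, data=None)
    pending, done = list(my_tasks), 0
    while done < len(my_tasks):
        for key, _events in sel.select():
            if key.data is None:  # inherited listening socket is readable
                try:
                    conn, _addr = listener.accept()
                except BlockingIOError:
                    continue  # a sibling process accepted this client first
                task = pending.pop()
                conn.sendall(f"{task}\n".encode())
                sel.register(conn, selectors.EVENT_READ, data=task)
                if not pending:
                    sel.unregister(listener)  # nothing left to hand out
            else:
                key.fileobj.recv(4096)  # the result; a real server would store it
                sel.unregister(key.fileobj)
                key.fileobj.close()
                done += 1  # simplification: a disconnect also counts as done


def run_demo(tasks, num_workers=4):
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))  # let the OS choose a free port
    listener.listen()
    listener.setblocking(False)  # required so losing an accept race is harmless
    port = listener.getsockname()[1]
    pids = []
    for i in range(num_workers):
        pid = os.fork()
        if pid == 0:  # child: serve a private slice, then exit
            code = 1
            try:
                worker(listener, tasks[i::num_workers])
                code = 0
            finally:
                os._exit(code)
        pids.append(pid)
    received = []
    for _ in tasks:  # stand-in clients, one connection at a time
        with socket.create_connection(("127.0.0.1", port)) as s:
            task = s.recv(4096).decode().strip()
            received.append(task)
            s.sendall(f"result-of-{task}\n".encode())
    statuses = [os.waitpid(pid, 0)[1] for pid in pids]
    listener.close()
    return received, statuses
```

Since no worker can see another worker's slice, a task cannot be handed out twice and no lock is needed. The trade-off is that the split is static: a worker that runs out of tasks stops accepting clients instead of helping with the remaining ones.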