Event-driven IO and blocking vs. non-blocking
Can someone explain how event-driven IO system calls such as select, poll, and epoll relate to blocking and non-blocking IO?
I don't understand how these concepts are related, if they are related at all.
The select system call is supported on almost all Unixes. It lets userland applications watch a group of descriptors and find out which subset of that group is ready for reading or writing. Its interface is a bit clunky, and the implementation in most kernels is mediocre at best.
epoll is provided only on Linux for the same purpose, but it is a huge improvement over select in both efficiency and programming interface. Other Unixes have their own specialised calls too (for example, kqueue on the BSDs and macOS).
That said, the event-driven IO system calls do not require either blocking or non-blocking descriptors. Blocking is a behaviour that affects system calls like read, write, accept and connect. select and epoll_wait do have blocking timeouts, but those are unrelated to the blocking mode of the descriptors being watched.
Of course, using these event-driven system calls with blocking descriptors is a bit odd, because you expect to be able to read the data immediately, without blocking, once you have been notified that it is available. Relying on a blocking descriptor never blocking after a readiness notification is risky, because race conditions are possible.
Non-blocking, event-driven IO can make server applications vastly more efficient because a thread is not needed for each descriptor (connection). Compare the performance of the Apache web server with that of Nginx or Lighttpd and you'll see the benefit.
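To make the distinction concrete, here is a minimal Python sketch (Python's select and socket modules are thin wrappers over the system calls discussed). It assumes a platform where socket.socketpair() is available, and the variable names are only illustrative. It shows that non-blocking mode is a property of the descriptor, while the timeout passed to select is a separate setting of the wait itself.

```python
import select
import socket

# A connected pair of sockets stands in for a client/server connection.
a, b = socket.socketpair()
b.setblocking(False)  # puts b's descriptor into non-blocking mode

# Nothing has been sent yet, so a non-blocking recv fails immediately
# instead of putting the calling thread to sleep.
try:
    b.recv(1024)
    would_block = False
except BlockingIOError:
    would_block = True

# select's timeout is independent of the descriptor's mode:
# a timeout of 0 polls and returns at once.
readable_before, _, _ = select.select([b], [], [], 0)

# Once data has been written, select reports b as readable.
a.sendall(b"ping")
readable_after, _, _ = select.select([b], [], [], 1.0)
data = b.recv(1024) if readable_after else b""

a.close()
b.close()
```

Passing None as the timeout would make select itself wait indefinitely, regardless of whether b is blocking or non-blocking.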
They're largely unrelated, except that you may want to use non-blocking file descriptors with event-driven IO for the following reasons:
1. Old versions of Linux definitely had kernel bugs where read could block even after select indicated that a socket was readable (this happened with UDP sockets and packets with bad checksums). Current versions of Linux may still have some such bugs; I'm not sure.
2. If there's any possibility that other processes have access to your file descriptors and will read from or write to them, or if your program is multi-threaded and other threads might do so, then there is a race between select reporting that the file descriptor is readable/writable and your program performing IO on it, which could result in blocking.
3. You almost surely want to make a socket non-blocking before calling connect; otherwise you'll block until the connection is made. Use select for writing to determine when the connection has succeeded, and select for errors to determine whether it failed.
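The non-blocking connect pattern can be sketched roughly as follows in Python. The helper name connect_nonblocking, the 5-second default timeout, and the throwaway loopback listener are all assumptions made for illustration. On POSIX systems a failed connect is typically reported as writable rather than as an exceptional condition, so the sketch also reads SO_ERROR to learn the final outcome.

```python
import errno
import os
import select
import socket

def connect_nonblocking(address, timeout=5.0):
    """Hypothetical helper: return a connected non-blocking TCP socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    # connect_ex returns an errno value instead of raising. EINPROGRESS
    # (EWOULDBLOCK on some platforms) means the handshake is still running.
    err = sock.connect_ex(address)
    if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        sock.close()
        raise OSError(err, os.strerror(err))
    # Wait until the socket is writable or flagged as exceptional.
    _, writable, errored = select.select([], [sock], [sock], timeout)
    if not writable and not errored:
        sock.close()
        raise TimeoutError("connect timed out")
    # SO_ERROR holds the result of the connection attempt (0 on success).
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        raise OSError(err, os.strerror(err))
    return sock

# Usage: connect to a throwaway listener on the loopback interface.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
listen_addr = listener.getsockname()

client = connect_nonblocking(listen_addr)
connected_to = client.getpeername()
still_nonblocking = client.getblocking() is False

client.close()
listener.close()
```

In a real event loop the select call would not sit inside the helper; the socket would instead be registered for writability alongside all the other descriptors being watched.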
select and similar functions (you mentioned a few) are usually used to implement the event loop in an event-driven system. That is, instead of calling read() directly on a socket or file, which may block if no data is available, the application calls select() on multiple file descriptors and waits for data to become available on any one of them.
When a file descriptor is reported as ready, you can expect that data is available and that the read() operation will not block.
This is one way of processing data from multiple sources simultaneously without resorting to multiple threads.
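A single-threaded event loop of this kind might look like the following Python sketch. The three socket pairs, the source-N labels, and the 1-second timeout are assumptions chosen to keep the example self-contained; a real server would watch listening and client sockets instead.

```python
import select
import socket

# Three socket pairs stand in for three independent data sources.
pairs = [socket.socketpair() for _ in range(3)]
names = {}
for i, (writer, reader) in enumerate(pairs):
    reader.setblocking(False)
    names[reader] = f"source-{i}"

# Two sources send data; then every writer closes its end.
pairs[0][0].sendall(b"hello")
pairs[2][0].sendall(b"world")
for writer, _ in pairs:
    writer.close()

received = {}
pending = {reader for _, reader in pairs}

# The event loop: one thread waits on all descriptors at once.
while pending:
    readable, _, _ = select.select(list(pending), [], [], 1.0)
    if not readable:
        break  # nothing happened within the timeout; stop the demo
    for sock in readable:
        try:
            chunk = sock.recv(4096)
        except BlockingIOError:
            continue  # readiness was stale; try again on the next pass
        if chunk:
            key = names[sock]
            received[key] = received.get(key, b"") + chunk
        else:
            # An empty read means the peer closed the connection.
            pending.discard(sock)
            sock.close()
```

Keeping the readers non-blocking means that a stale readiness report costs one extra loop iteration instead of stalling every other source.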