What is the difference between poll and select?

Posted on 2024-07-23 10:53:26 · 185 words · 5 views · 0 comments

I am referring to the POSIX standard select and poll system calls in the C API.

情绪失控 2024-07-30 10:53:26

The select() call has you create three bitmasks to mark which sockets and file descriptors you want to watch for reading, writing, and errors, and then the operating system marks which ones in fact have had some kind of activity; poll() has you create a list of descriptor IDs, and the operating system marks each of them with the kind of event that occurred.

The select() method is rather clunky and inefficient.

There are typically more than a thousand potential file descriptors available to a process. If a long-running process has only a few descriptors open, but at least one of them has been assigned a high number, then the bitmask passed to select() has to be large enough to accommodate that highest descriptor. Whole ranges of hundreds of bits will therefore be unset, and the operating system has to loop across them on every select() call just to discover that they are unset.
Once select() returns, the caller has to loop over all three bitmasks to determine what events took place. In very many typical applications only one or two file descriptors will get new traffic at any given moment, yet all three bitmasks must be read all the way to the end to discover which descriptors those are.
Because the operating system signals you about activity by rewriting the bitmasks, they are ruined and are no longer marked with the list of file descriptors you want to listen to. You either have to rebuild the whole bitmask from some other list that you keep in memory, or you have to keep a duplicate copy of each bitmask and memcpy() the block of data over on top of the ruined bitmasks after each select() call.
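The bitmask bookkeeping described above can be sketched in C. This is a minimal illustration, not a reference implementation: the helper names `wait_readable_select` and `count_ready` are made up for this example, and it assumes the watched descriptors are all below FD_SETSIZE.

```c
#include <assert.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: waits up to timeout_ms for any descriptor in
 * `master` to become readable. `master` is never handed to select()
 * directly, because select() overwrites the set it is given. */
int wait_readable_select(fd_set *master, int maxfd, int timeout_ms,
                         fd_set *ready_out)
{
    struct timeval tv;

    /* Descriptors at or above FD_SETSIZE do not fit in an fd_set. */
    assert(maxfd >= 0 && maxfd < FD_SETSIZE);

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* Copy the master set into a scratch set before every call. */
    memcpy(ready_out, master, sizeof(*ready_out));

    /* Returns the number of ready descriptors, 0 on timeout, -1 on error. */
    return select(maxfd + 1, ready_out, NULL, NULL, &tv);
}

/* Hypothetical helper: walks every descriptor number up to maxfd to find
 * out which ones are ready, which is the linear scan the answer mentions. */
int count_ready(fd_set *ready, int maxfd)
{
    int n = 0;
    for (int fd = 0; fd <= maxfd; fd++) {
        if (FD_ISSET(fd, ready))
            n++;
    }
    return n;
}
```

The caller keeps `master` as the list of interest and only ever inspects `ready_out`, so nothing has to be rebuilt between calls; the cost is one copy of the whole set per call.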

So the poll() approach works much better because you can keep re-using the same data structure.
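For comparison, here is a rough sketch of the poll() side in C. The function name `count_readable_poll` is hypothetical; the point it illustrates is that poll() writes only to the `revents` field, so the same array can be passed again without being rebuilt.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: polls the caller's array and returns how many
 * entries reported POLLIN. The fd and events fields are left untouched
 * by the kernel, so the array stays valid for the next call. */
int count_readable_poll(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    assert(fds != NULL);

    int rc = poll(fds, nfds, timeout_ms);
    if (rc <= 0)
        return rc;              /* 0 = timeout, -1 = error (errno is set) */

    int readable = 0;
    for (nfds_t i = 0; i < nfds; i++) {
        if (fds[i].revents & POLLIN)
            readable++;
    }
    return readable;
}
```

The scan over the array is still linear, but it covers only the descriptors the caller actually registered rather than every descriptor number up to the highest one.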

In fact, poll() has inspired yet another mechanism in modern Linux kernels: epoll, which improves even more upon the mechanism to allow yet another leap in scalability, as today's servers often want to handle tens of thousands of connections at once. This is a good introduction to the effort:

http://scotdoyle.com/python-epoll-howto.html

This link, meanwhile, has some nice graphs showing the benefits of epoll (you will note that select() is by this point considered so inefficient and old-fashioned that it does not even get a line on these graphs!):

http://lse.sourceforge.net/epoll/index.html
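To give a feel for how epoll differs, here is a small Linux-only sketch in C. The wrapper names `epoll_watch_read` and `epoll_next_ready` are invented for this example; the underlying calls are `epoll_ctl()` and `epoll_wait()`. Interest is registered once with the kernel, and each wait returns only the descriptors that are ready.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical wrapper: registers fd for read readiness on an existing
 * epoll instance. Returns 0 on success, -1 on error. */
int epoll_watch_read(int epfd, int fd)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };

    assert(epfd >= 0 && fd >= 0);
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Hypothetical wrapper: waits for a single event and returns the ready
 * descriptor, or -1 on timeout or error. */
int epoll_next_ready(int epfd, int timeout_ms)
{
    struct epoll_event ev;
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);

    return (n == 1) ? ev.data.fd : -1;
}
```

A real server would typically pass a larger event array to `epoll_wait()` and loop over the returned count; the single-event form is used here only to keep the sketch short.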

Update: Here is another Stack Overflow question, whose answer gives even more detail about the differences:

Caveats of select/poll vs. epoll reactors in Twisted

谢绝鈎搭 2024-07-30 10:53:26

I think that this answers your question:

From Richard Stevens ([email protected]):
The basic difference is that select()'s fd_set is a bit mask and
therefore has some fixed size. It would be possible for the kernel to
not limit this size when the kernel is compiled, allowing the
application to define FD_SETSIZE to whatever it wants (as the comments
in the system header imply today) but it takes more work. 4.4BSD's
kernel and the Solaris library function both have this limit. But I
see that BSD/OS 2.1 has now been coded to avoid this limit, so it's
doable, just a small matter of programming. :-) Someone should file a
Solaris bug report on this, and see if it ever gets fixed.
With poll(), however, the user must allocate an array of pollfd
structures, and pass the number of entries in this array, so there's
no fundamental limit. As Casper notes, fewer systems have poll() than
select, so the latter is more portable. Also, with original
implementations (SVR3) you could not set the descriptor to -1 to tell
the kernel to ignore an entry in the pollfd structure, which made it
hard to remove entries from the array; SVR4 gets around this.
Personally, I always use select() and rarely poll(), because I port my
code to BSD environments too. Someone could write an implementation
of poll() that uses select(), for these environments, but I've never
seen one. Both select() and poll() are being standardized by POSIX
1003.1g.
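Two points from the quoted email can be illustrated in C: the fixed size of an fd_set, and the trick of setting a pollfd entry's descriptor to -1 so that the kernel ignores it. The helper names `fd_fits_in_fd_set` and `pollfd_disable` are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <poll.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical guard: an fd_set can only describe descriptors in the
 * range 0 .. FD_SETSIZE - 1, so check before calling FD_SET(). */
int fd_fits_in_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}

/* Hypothetical helper: retires one slot of a pollfd array in place.
 * poll() ignores entries whose fd is negative and reports revents as 0
 * for them, so the array does not have to be compacted. */
void pollfd_disable(struct pollfd *slot)
{
    assert(slot != NULL);
    slot->fd = -1;
    slot->revents = 0;
}
```

Because the caller passes the array length to poll() explicitly, the array can be as large as the process's descriptor limit allows, whereas the guard above has to reject anything at or beyond FD_SETSIZE.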

October 2017 Update:

The email referenced above is at least as old as 2001; the poll() call is now (2017) supported across all modern operating systems, including BSD. In fact, some people believe that select() should be deprecated. Opinions aside, portability issues around poll() are no longer a concern on modern systems. Furthermore, epoll has since been developed on Linux (you can read the man page), and continues to rise in popularity.

For modern development you probably don't want to use select(), although there's nothing explicitly wrong with it. poll(), and its more modern evolution epoll, provide the same features (and more) as select() without suffering from the limitations therein.

深海蓝天 2024-07-30 10:53:26

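Both are slow and mostly the same, but they differ in size limits and in some features.

When you write a loop around select, you need to copy the descriptor set on every iteration. poll fixes this problem and leads to cleaner code. Another difference is that poll can handle more than 1024 file descriptors (FDs) by default. poll can also report different kinds of events per descriptor, which makes a program more readable than juggling several separate variables for the same job. Operations in both poll and select are linear and slow, because every watched descriptor has to be checked.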