How to signal select() to return immediately?
I have a worker thread that is listening to a TCP socket for incoming traffic, and buffering the received data for the main thread to access (let's call this socket A). However, the worker thread also has to do some regular operations (say, once per second), even if there is no data coming in. Therefore, I use select() with a timeout, so that I don't need to keep polling. (Note that calling receive() on a non-blocking socket and then sleeping for a second is not good: the incoming data should be immediately available for the main thread, even though the main thread might not always be able to process it right away, hence the need for buffering.)
Now, I also need to be able to signal the worker thread to do some other stuff immediately; from the main thread, I need to make the worker thread's select() return right away. For now, I have solved this as follows (approach basically adopted from here and here):
At program startup, the worker thread creates for this purpose an additional socket of the datagram (UDP) type, and binds it to some random port (let's call this socket B). Likewise, the main thread creates a datagram socket for sending. In its call to select(), the worker thread now lists both A and B in the fd_set. When the main thread needs to signal, it sendto()'s a couple of bytes to the corresponding port on localhost. Back in the worker thread, if B remains in the fd_set after select() returns, then recvfrom() is called and the bytes received are simply ignored.
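For illustration, the wake-up mechanism described above might look roughly like the following POSIX-style sketch. The helper names (make_wakeup_receiver, send_wakeup, wait_for_event) and the result flags are hypothetical and not taken from any real code base; a Windows build would additionally need WSAStartup(), SOCKET handles and closesocket().

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

#include <algorithm>
#include <cstdint>

// Hypothetical helper: create socket B, a UDP socket bound to an OS-chosen
// port on the loopback interface. Returns the descriptor (or -1 on failure)
// and stores the chosen port, in host byte order, in *port.
int make_wakeup_receiver(std::uint16_t* port) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) return -1;
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;  // 0 lets the OS pick a free port
    socklen_t len = sizeof(addr);
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0 ||
        getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

// Main-thread side: send a single byte to socket B on localhost.
bool send_wakeup(int sender_fd, std::uint16_t port) {
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    const char byte = 1;
    return sendto(sender_fd, &byte, 1, 0,
                  reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) == 1;
}

// Bit flags returned by wait_for_event (hypothetical names).
enum WaitResult { kTimeout = 0, kDataOnA = 1, kWakeup = 2, kError = 4 };

// Worker-thread side: wait until A is readable, B receives a wake-up
// datagram, or timeout_ms elapses. The wake-up bytes are read and discarded.
int wait_for_event(int fd_a, int fd_b, int timeout_ms) {
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd_a, &readfds);
    FD_SET(fd_b, &readfds);
    timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    int n = select(std::max(fd_a, fd_b) + 1, &readfds, nullptr, nullptr, &tv);
    if (n < 0) return kError;
    int result = kTimeout;
    if (FD_ISSET(fd_a, &readfds)) result |= kDataOnA;
    if (FD_ISSET(fd_b, &readfds)) {
        char buf[16];
        recvfrom(fd_b, buf, sizeof(buf), 0, nullptr, nullptr);  // ignore payload
        result |= kWakeup;
    }
    return result;
}
```

The worker would call wait_for_event(A, B, 1000) in its loop, while the main thread keeps its own datagram socket and calls send_wakeup() with the port reported by make_wakeup_receiver().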
This seems to work very well, but I can't say I like the solution, mainly because it requires binding an extra port for B, and also because it adds several socket API calls which may fail, I guess, and I don't really feel like figuring out the appropriate action for each of those cases.
I think ideally, I would like to call some function which takes A as input, and does nothing except make select() return right away. However, I don't know of such a function. (I guess I could for example shutdown() the socket, but the side effects are not really acceptable :)
If this is not possible, the second best option would be creating a B which is much dummier than a real UDP socket, and doesn't really require allocating any limited resources (beyond a reasonable amount of memory). I guess Unix domain sockets would do exactly this, but: the solution should not be much less cross-platform than what I currently have, though some moderate amount of #ifdef stuff is fine. (I am mainly targeting Windows and Linux, and writing C++, by the way.)
Please don't suggest refactoring to get rid of the two separate threads. This design is necessary because the main thread may be blocked for extended periods (e.g., doing some intensive computation, and I can't start periodically calling receive() from the innermost loop of calculation), and in the meanwhile, someone needs to buffer the incoming data (and due to reasons beyond what I can control, it cannot be the sender).
While writing this, I realized that someone is definitely going to reply simply "Boost.Asio", so I just had my first look at it... I couldn't find an obvious solution, though. Do note that I also cannot (easily) affect how socket A is created, but I should be able to let other objects wrap it, if necessary.
You are almost there. Use the "self-pipe" trick. Open a pipe, add its read end to the read fd_set you pass to select(), and write to it from the main thread to unblock the worker thread. It is portable across POSIX systems.
I have seen a variant of a similar technique for Windows in one system (in fact used together with the method above, separated by #ifdef WIN32). Unblocking can be achieved by adding a dummy (unbound) datagram socket to the fd_set and then closing it. The downside is that, of course, you have to re-open it every time.
However, in the aforementioned system, both of these methods are used rather sparingly, and for unexpected events (e.g., signals, termination requests). The preferred method is still a variable timeout to select(), depending on how soon something is scheduled for the worker thread.
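A minimal sketch of the self-pipe idea for POSIX systems might look like the following. The WakeupPipe class, the wait_for_socket_or_wakeup function and the flag names are made up for this example; it assumes the worker passes in the descriptor of socket A. It does not cover Windows, where select() only accepts sockets.

```cpp
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

#include <algorithm>

// Hypothetical wrapper around an anonymous pipe used purely for wake-ups.
class WakeupPipe {
public:
    WakeupPipe() {
        if (pipe(fds_) != 0) {
            fds_[0] = fds_[1] = -1;
            return;
        }
        // Non-blocking on both ends: notify() never stalls on a full pipe
        // and drain() never blocks on an empty one.
        for (int fd : fds_) fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_NONBLOCK);
    }
    ~WakeupPipe() {
        if (fds_[0] >= 0) {
            close(fds_[0]);
            close(fds_[1]);
        }
    }
    WakeupPipe(const WakeupPipe&) = delete;
    WakeupPipe& operator=(const WakeupPipe&) = delete;

    bool ok() const { return fds_[0] >= 0; }
    int read_fd() const { return fds_[0]; }

    // Called from the main thread: one byte makes the read end readable.
    void notify() const {
        const char byte = 1;
        ssize_t n = write(fds_[1], &byte, 1);
        (void)n;  // a full pipe already guarantees a pending wake-up
    }

    // Called from the worker thread: discard all pending wake-up bytes.
    void drain() const {
        char buf[64];
        while (read(fds_[0], buf, sizeof(buf)) > 0) {
        }
    }

private:
    int fds_[2];
};

// Bit flags for the return value below (hypothetical names).
enum : int { kSocketReadable = 1, kWoken = 2 };

// Wait until socket_fd is readable, the pipe is notified, or timeout_ms
// elapses. Returns -1 on error, 0 on timeout, otherwise a combination of flags.
int wait_for_socket_or_wakeup(int socket_fd, const WakeupPipe& wake, int timeout_ms) {
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(socket_fd, &readfds);
    FD_SET(wake.read_fd(), &readfds);
    timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    int n = select(std::max(socket_fd, wake.read_fd()) + 1, &readfds, nullptr, nullptr, &tv);
    if (n < 0) return -1;
    int result = 0;
    if (FD_ISSET(socket_fd, &readfds)) result |= kSocketReadable;
    if (FD_ISSET(wake.read_fd(), &readfds)) {
        wake.drain();
        result |= kWoken;
    }
    return result;
}
```

The main thread would share one WakeupPipe with the worker and call notify() whenever the worker should run its extra task; several notifications that arrive before the worker wakes up collapse into a single wake-up because drain() empties the pipe.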
Using a pipe rather than a socket is a bit cleaner, as there is no possibility for another process to get hold of it and mess things up.
Using a UDP socket definitely creates the potential for stray packets to come in and interfere.
An anonymous pipe will never be available to any other process (unless you give it to it).
You could also use signals, but in a multithreaded program you'll want to make sure that all threads except for the one you want have that signal masked.
On Unix it is straightforward using a pipe. If you are on Windows and want to keep using select() so that your code stays compatible with Unix, the trick of creating an unbound UDP socket and closing it works well and is easy. But you have to make it thread-safe.
The only way I found to make this thread-safe is to close and recreate the socket in the same thread that is running select(). Of course, this is difficult if the thread is blocked in select(). This is where the Windows call QueueUserAPC comes in. While a thread is blocked in select(), it can handle Asynchronous Procedure Calls, and you can schedule one from a different thread using QueueUserAPC. Windows interrupts the select(), executes your function in the same thread, and continues with the select(). In your APC function you can now close the socket and recreate it. This is guaranteed to be thread-safe, and you will never lose a signal.
To put it simply: keep the socket handle in a global variable, then close that global socket, and select() will return immediately: closesocket(g_socket);