About recv and the read buffer - C Berkeley Sockets
I am using Berkeley sockets and TCP (SOCK_STREAM sockets).
The process is:
- I connect to a remote address.
- I send a message to it.
- I receive a message from it.
Imagine I am using the following buffer:
char recv_buffer[3000];
recv(socket, recv_buffer, 3000, 0);
Questions are:
- How can I know whether the read buffer is empty after the first call to recv? If it is not empty I have to call recv again, but if I do that when it is empty, the call blocks for a long time.
- How can I know how many bytes I have read into recv_buffer? I can't use strlen because the message I receive can contain null bytes.
Thanks.
Comments (4)
You can use the select or poll system calls along with your socket descriptor to tell whether there is data waiting to be read from the socket.
However, usually there should be an agreed-upon protocol that both sender and receiver follow, so that both parties know how much data is to be transferred. For example, the sender might first send a 2-byte integer indicating the number of bytes it will send. The receiver then reads this 2-byte integer first, so that it knows how many more bytes to read from the socket.
Regardless, as Tony pointed out below, a robust application should combine length information in the header with polling the socket for additional data before each call to recv (or use a non-blocking socket). This prevents your application from blocking when, for example, you know (from the header) that there should still be 100 bytes remaining to read, but the peer fails to send the data for whatever reason (perhaps the peer computer was unexpectedly shut off), which would otherwise cause your recv call to block.
The recv system call returns the number of bytes read, or -1 if an error occurred; see the man page for recv(2).
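As a rough illustration of polling before recv, here is a minimal sketch. The helper name recv_with_timeout and the -2 "timed out" return code are assumptions made for this example, not part of the sockets API. The value it returns is the byte count asked about in the question, so strlen is never needed.

```c
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: wait up to timeout_ms for data on fd, then read
 * at most cap bytes. Returns the number of bytes read (0 means the peer
 * closed the connection), -1 on error, or -2 if nothing arrived before
 * the timeout expired. */
ssize_t recv_with_timeout(int fd, void *buf, size_t cap, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    int ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0)
        return -1;              /* poll itself failed; errno is set */
    if (ready == 0)
        return -2;              /* timed out, no data to read */

    /* Data (or EOF, or a socket error) is pending, so this call
     * should not block. */
    return recv(fd, buf, cap, 0);
}
```

A caller would pass recv_buffer and 3000 from the question and treat the positive return value as the number of valid bytes in the buffer.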
Even the first time (after accepting a client), the recv can block and fail if the client connection has been lost. You must either:
- use select or poll (BSD sockets) or some OS-specific equivalent, which can tell you whether there is data available on specific socket descriptors (as well as exception conditions, and buffer space you can write more output to)
- make the socket non-blocking, so that recv will only return whatever is immediately available (possibly nothing)
- create a thread that you can afford to have block recv-ing data, knowing other threads will be doing the other work you're concerned to continue with
recv() returns the number of bytes read, or -1 on error.
Note that TCP is a byte stream protocol, which means that you're only guaranteed to be able to read and write bytes from it in the correct order, but the message boundaries are not guaranteed to be preserved. So, even if the sender has made a large single write to their socket, it can be fragmented en route and arrive in several smaller blocks, or several smaller send()/write()s can be consolidated and retrieved by one recv()/read().
For that reason, make sure you loop calling recv until you either get all the data you need (i.e. a complete logical message you can process) or an error. You should be prepared/able to handle getting part/all of subsequent sends from your client (if you don't have a protocol where each side only sends after getting a complete message from the other, and are not using headers with message lengths). Note that doing recvs for the message header (with length) then the body can result in a lot more calls to recv(), with a potential adverse effect on performance.
These reliability issues are often ignored. They manifest less often on a single host or a reliable and fast LAN, with fewer routers and switches involved, and with fewer or non-concurrent messages. Then they may break under load and over more complex networks.
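The "loop until you have a complete message" advice can be sketched as follows. This assumes a hypothetical framing in which each message is a 2-byte big-endian length followed by that many payload bytes; the helper names recv_exact and recv_message are invented for this example. Note that it makes separate recv calls for header and body, which is the extra-call cost mentioned above.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Keep calling recv until exactly len bytes have arrived.
 * Returns 0 on success, -1 on error or if the peer closed early. */
int recv_exact(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n < 0 && errno == EINTR)
            continue;           /* interrupted by a signal, retry */
        if (n <= 0)
            return -1;          /* error, or connection closed */
        p += n;                 /* a short read is normal on TCP */
        len -= (size_t)n;
    }
    return 0;
}

/* Assumed framing: 2-byte big-endian length, then the payload.
 * Returns the payload length, or -1 on error or if it exceeds cap. */
long recv_message(int fd, void *buf, size_t cap)
{
    uint16_t netlen;
    if (recv_exact(fd, &netlen, sizeof netlen) != 0)
        return -1;

    size_t len = ntohs(netlen);
    if (len > cap)
        return -1;              /* message would not fit in buf */
    if (recv_exact(fd, buf, len) != 0)
        return -1;
    return (long)len;
}
```

Because recv_exact never asks for more than the current message needs, bytes belonging to the next message stay in the socket buffer for the next call.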
If recv() returns fewer than 3000 bytes, then you can assume that the read buffer was empty at that moment, although more data may still arrive later. If it returns 3000 bytes in your 3000 byte buffer, then you'd better know whether to continue. Most protocols include some variation on TLV - type, length, value. Each message contains an indicator of the type of message, some length (possibly implied by the type if the length is fixed), and the value. If, on reading through the data you did receive, you find that the last unit is incomplete, you can assume there is more to be read. You can also make the socket into a non-blocking socket; then recv() will fail with EAGAIN or EWOULDBLOCK if there is no data ready for reading.
The recv() function returns the number of bytes read.
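A minimal sketch of the non-blocking approach, using fcntl with O_NONBLOCK. The helper names set_nonblocking and try_recv, and the -2 "nothing available yet" return code, are assumptions made for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Switch fd to non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Read whatever is available right now on a non-blocking socket.
 * Returns the number of bytes read, 0 if the peer closed the
 * connection, -1 on a real error, or -2 if no data is ready yet. */
ssize_t try_recv(int fd, void *buf, size_t cap)
{
    ssize_t n = recv(fd, buf, cap, 0);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;              /* would have blocked: buffer is empty */
    return n;
}
```

Checking both EAGAIN and EWOULDBLOCK keeps the code portable to systems where the two constants differ.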
ioctl() with the FIONREAD option tells you how much data can currently be read without blocking.