Receiving data of unknown size with Berkeley sockets

Posted 2024-12-10 07:27:32 | 206 words | 3 views | 0 comments

I have some C++ code in which I use recv() from Berkeley sockets to receive data from a remote host. The issue is that I do not know the size of the data (it is variable), so I probably need some kind of timeout option to make this work.

Since I'm new to socket programming, I was wondering how, for example, a web client handles responses from a server (e.g. a server sends HTML data to the client). Does it use some kind of timeout, since it doesn't know how big the page is? The same question applies to an FTP client.

Comments (1)

谎言 2024-12-17 07:27:32

When your data is of variable length, it is typically framed within another container. That is to say, a header precedes the actual data block and tells the receiver how much data it should accept.

For example, HTTP uses line breaks (CRLF) to delimit its header lines. If the message body has a variable length, the header includes a "Content-Length:" field that indicates exactly how many bytes to read once the entire header has been received (the header ends at a blank line, i.e. when you read two consecutive line breaks).
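
To make the HTTP case concrete, here is a minimal sketch of how a client could find the end of the header and pull out the Content-Length value from the bytes it has buffered so far. The function names `header_end` and `content_length` are made up for illustration, and the parsing is deliberately simplified (no overflow check, no handling of repeated headers), so treat it as an outline rather than a real HTTP parser.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the offset just past the blank line that ends
// the header, or std::string::npos if the header is not complete yet.
std::size_t header_end(const std::string& buf) {
    std::size_t pos = buf.find("\r\n\r\n");
    return pos == std::string::npos ? std::string::npos : pos + 4;
}

// Hypothetical helper: scans the header line by line for "Content-Length:"
// (case-insensitive). Returns true and stores the value if it is found.
bool content_length(const std::string& headers, std::size_t& length) {
    const std::string name = "content-length:";
    std::size_t line = 0;
    while (line < headers.size()) {
        std::size_t eol = headers.find("\r\n", line);
        if (eol == std::string::npos) eol = headers.size();
        bool match = (eol - line >= name.size());
        for (std::size_t i = 0; match && i < name.size(); ++i) {
            unsigned char c = static_cast<unsigned char>(headers[line + i]);
            match = (std::tolower(c) == name[i]);
        }
        if (match) {
            std::size_t i = line + name.size();
            while (i < eol && (headers[i] == ' ' || headers[i] == '\t')) ++i;
            if (i == eol || !std::isdigit(static_cast<unsigned char>(headers[i])))
                return false;  // field present but no number after it
            std::size_t value = 0;
            for (; i < eol && std::isdigit(static_cast<unsigned char>(headers[i])); ++i)
                value = value * 10 + static_cast<std::size_t>(headers[i] - '0');
            assert(value <= headers.max_size());  // sketch only: no real overflow handling
            length = value;
            return true;
        }
        line = eol + 2;  // skip past the CRLF to the next header line
    }
    return false;
}
```

The idea is to keep appending whatever recv() returns to a buffer, call `header_end` after each read, and once it succeeds, read exactly `length` more bytes of body (minus any body bytes that already arrived in the same buffer).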

It is perfectly fine to read 4 bytes from the socket, work out how much data follows, then do another receive and read the rest. Just be careful: when you ask for 4 bytes, the socket might give you anywhere from 1 to 4 bytes, so anything less than 4 means you need to go back and ask for the remaining bytes. This is a very common mistake. In a dev environment you will almost always get 4 bytes when asking for 4, but once you deploy your app, somewhere on some machine you will get random crashes because the network behaves differently there.
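
The "go back and ask for the rest" loop can be sketched roughly as follows. This assumes a wire format that is not specified in the question: a 4-byte big-endian length followed by that many payload bytes. The names `recv_exact` and `recv_message` and the `max_len` cap are hypothetical, chosen only for this example.

```cpp
#include <arpa/inet.h>   // ntohl, htonl
#include <sys/socket.h>  // recv
#include <sys/types.h>   // ssize_t
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: keeps calling recv() until exactly `len` bytes have
// arrived. Returns false if the peer closes the connection or an error
// occurs before that.
bool recv_exact(int fd, void* buf, std::size_t len) {
    assert(buf != nullptr || len == 0);
    char* p = static_cast<char*>(buf);
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n > 0) {
            p += n;                              // partial read: advance
            len -= static_cast<std::size_t>(n);  // and ask for the remainder
        } else if (n == 0) {
            return false;  // connection closed before the full block arrived
        } else if (errno != EINTR) {
            return false;  // genuine error; EINTR just means "try again"
        }
    }
    return true;
}

// Assumed framing: 4-byte big-endian length, then the payload.
// `max_len` is an arbitrary safety cap so a bad header cannot force a huge allocation.
bool recv_message(int fd, std::string& out, std::uint32_t max_len = 1u << 20) {
    std::uint32_t net_len = 0;
    if (!recv_exact(fd, &net_len, sizeof net_len)) return false;
    std::uint32_t len = ntohl(net_len);
    if (len > max_len) return false;
    out.resize(len);
    return len == 0 || recv_exact(fd, &out[0], len);
}
```

The sender would write `htonl(payload_size)` followed by the payload. Because `recv_exact` loops on short reads, it does not matter how the network splits those bytes across packets.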

Generally, it is a bad approach to rely on timeouts to determine when you have reached the end of the data. With a timeout, you might get things "reliably" working in a well-controlled dev environment, but it is a very flaky solution. Any CPU, disk, or network hiccup might cause your app to stop receiving prematurely. You are also limiting your data throughput and responsiveness, since your app sleeps for some time interval instead of doing work.
