Cache unused data for each connection?

Posted on 2025-01-07 21:05:06

I'm writing a C program composed of one dispatcher thread and N worker threads, whose responsibilities are described below:

Dispatcher thread:
listens on a TCP port;
calls epoll_wait() repeatedly on that port;
when a connection is established, accepts it and passes the new file descriptor (i.e. what accept() returns) to one of the N worker threads.

Worker thread:
on each new connection, calls read() repeatedly until no more data is received;
passes all the received data to a decode function, which decodes it into a message structure (e.g. an RTSP message).

What I wonder is: if the data a worker thread has read is incomplete, should I cache it? That is, should I maintain a global list that caches the unused data (received, but not yet a full message, so not consumed) for each connection?

Comments (4)

赠我空喜 2025-01-14 21:05:07

If you use one worker per socket, I guess there is no problem: you just block until you have the whole message. I'm assuming this is not your case.

If you use a worker to handle several sockets in a non-blocking manner, you could use this approach:

  1. Start by reading the data into a buffer of a predetermined size. (Try to match the buffer size to the maximum possible message length; this will save you copies.)

  2. Determine the total message length (from your protocol's header) and calculate how much more you need to read to finish the whole message. At this point you may already have read "too much", so you should allocate another buffer for the "next" message; if you want to be more generic, you could keep n such buffers (based on the minimal message length and the size of the buffer you read into).
    You could also choose to always read only the header first and continue from there (this ensures you never read too much), but it is more wasteful (you need two reads per message).

  3. If the message has been fully read, process it. Otherwise, keep the buffer and the number of bytes still to be read for this message, and loop again through the sockets (your epoll).

  4. The next time you handle the same socket, check whether you currently have a partial message and continue reading into the same buffer from where you stopped last time. You need to read the next x bytes here, and you need to be prepared to get fewer than you expect.
    Here you could also add an optimization: read everything available on this socket in one shot (not only the remaining x bytes), which saves you some system calls. If you do that, you'll need to use vectored reads (readv() or similar).

If you skip the optimizations, it is pretty simple to handle.
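
To make steps 1, 3 and 4 concrete, here is a minimal sketch, not a definitive implementation. It assumes a message is complete once the blank line (CRLF CRLF) ending an RTSP-style header block has arrived and ignores message bodies for brevity. The names `struct conn`, `CONN_BUF_SIZE`, `set_nonblocking`, `conn_fill`, `complete_header_len` and `conn_consume` are all made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

#define CONN_BUF_SIZE 4096          /* assumed upper bound on one message */

/* Hypothetical per-connection state: one instance per accepted fd. */
struct conn {
    int    fd;
    size_t used;                    /* bytes currently held in buf */
    char   buf[CONN_BUF_SIZE];
};

/* Put the descriptor into non-blocking mode; returns 0 on success. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Append whatever is available right now to the connection's buffer.
 * Returns 1 if new bytes arrived, 0 if nothing was available (EAGAIN),
 * -1 on EOF, on error, or if the buffer was already full on entry.
 * Bytes buffered earlier stay in buf in every case. */
static int conn_fill(struct conn *c)
{
    int got = 0;

    while (c->used < sizeof c->buf) {
        ssize_t n = read(c->fd, c->buf + c->used, sizeof c->buf - c->used);
        if (n > 0) {
            c->used += (size_t)n;
            got = 1;
        } else if (n == 0) {
            return -1;              /* peer closed the connection */
        } else if (errno == EINTR) {
            continue;               /* interrupted, retry */
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            return got;             /* socket drained for now */
        } else {
            return -1;              /* real error */
        }
    }
    return got ? 1 : -1;            /* buffer full */
}

/* Length of the first complete header block (including the final
 * CRLF CRLF), or 0 if the block is still incomplete. */
static size_t complete_header_len(const char *buf, size_t used)
{
    for (size_t i = 0; i + 4 <= used; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return i + 4;
    }
    return 0;
}

/* Drop n processed bytes and keep the left-over for the next message. */
static void conn_consume(struct conn *c, size_t n)
{
    assert(n <= c->used);
    memmove(c->buf, c->buf + n, c->used - n);
    c->used -= n;
}
```

A worker would call `conn_fill()` when epoll reports the fd readable, then loop on `complete_header_len()`: while it returns non-zero, decode that many bytes and `conn_consume()` them; when it returns 0, just go back to waiting and the partial data stays with the connection.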

ヤ经典坏疍 2025-01-14 21:05:07

Well, what you could do is maintain a fixed-size buffer that is used to receive the message. The entire message and the buffer should be the same size.
Each time you receive a message on a socket descriptor, you can check whether the size matches. If not, you can:

  1. Either discard the message and request a retransmission (which is the easy case),
  2. Or trace the packet, find out where it got cut off, and have only the remaining part of the message retransmitted.

Hope this helps.

放我走吧 2025-01-14 21:05:07

Whether the data needs to be cached depends on the data length, the number of connections, and the amount of memory available.
For example, suppose we use HTTP: a normal HTTP header should be less than 4096 bytes. If the client uses the POST method, we can parse "Content-Length", and if Content-Length is too large, we can cache the POST data in a temporary file.
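
As a rough illustration of that decision, the sketch below parses "Content-Length" from a NUL-terminated header block and picks where the body should go. The 64 KiB threshold and the names `parse_content_length`, `choose_body_storage`, `MAX_IN_MEMORY_BODY` and `enum body_storage` are my own assumptions, and treating a missing header as "no body" is a simplification (it ignores chunked transfer, for instance).

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>                /* strncasecmp (POSIX) */

#define MAX_IN_MEMORY_BODY (64UL * 1024UL)   /* assumed threshold */

/* Find "Content-Length" in a NUL-terminated header block.
 * Returns 0 and stores the value in *out, or -1 if the header is
 * absent or malformed. */
static int parse_content_length(const char *headers, unsigned long *out)
{
    static const char name[] = "Content-Length:";
    const size_t name_len = sizeof name - 1;
    const char *line = headers;

    while (line != NULL && *line != '\0') {
        if (strncasecmp(line, name, name_len) == 0) {
            const char *p = line + name_len;
            char *end;
            unsigned long v;

            while (*p == ' ' || *p == '\t')
                p++;                            /* skip leading whitespace */
            if (!isdigit((unsigned char)*p))
                return -1;                      /* rejects "-1", "abc", "" */
            errno = 0;
            v = strtoul(p, &end, 10);
            while (*end == ' ' || *end == '\t')
                end++;                          /* skip trailing whitespace */
            if (errno == ERANGE ||
                (*end != '\r' && *end != '\n' && *end != '\0'))
                return -1;
            *out = v;
            return 0;
        }
        /* Advance to the start of the next header line. */
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return -1;
}

enum body_storage { BODY_NONE, BODY_IN_MEMORY, BODY_TEMP_FILE };

/* Decide where the body goes: small bodies stay in the connection's
 * memory buffer, large ones are spilled to a temporary file. */
static enum body_storage choose_body_storage(const char *headers)
{
    unsigned long len;

    if (parse_content_length(headers, &len) != 0 || len == 0)
        return BODY_NONE;
    return len <= MAX_IN_MEMORY_BODY ? BODY_IN_MEMORY : BODY_TEMP_FILE;
}
```

RTSP uses a Content-Length header too, so the same check could apply to the asker's messages once the header block has been received in full.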

成熟的代价 2025-01-14 21:05:07

Global list? Why would you need such a thing? The buffer/buffer-array/buffer-linkedList/buffer-whatever should be a member of the socket object, or be referenced to/from it. If the data has to be parsed and split up into some sort of Application Protocol Unit (APU) then, yes, the 'left-over' data has to be isolated so that it can form part of the next APU. Either copy it, or allow each buffer to have a 'start index' that does not necessarily have to be 0.
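
A minimal sketch of the 'start index' variant, under my own assumptions: the buffer lives inside a per-connection struct, consuming data only advances `start`, and the left-over bytes are copied to the front only when the tail runs out of room. `struct rxbuf`, `struct connection`, `RX_CAP` and the `rx_*` helpers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RX_CAP 8192                 /* assumed buffer capacity */

/* Receive buffer whose valid data is data[start .. end). */
struct rxbuf {
    size_t start;                   /* first unconsumed byte */
    size_t end;                     /* one past the last received byte */
    char   data[RX_CAP];
};

/* Hypothetical socket object: the buffer is a member, not a global. */
struct connection {
    int          fd;
    struct rxbuf rx;
};

/* Number of received bytes not yet consumed by the decoder. */
static size_t rx_pending(const struct rxbuf *b)
{
    return b->end - b->start;
}

/* Mark n bytes as consumed without moving any data. */
static void rx_consume(struct rxbuf *b, size_t n)
{
    assert(n <= rx_pending(b));
    b->start += n;
    if (b->start == b->end)
        b->start = b->end = 0;      /* empty: rewinding costs nothing */
}

/* Ensure room at the tail, copying the left-over data to the front
 * only when the tail is too short. Returns the writable byte count. */
static size_t rx_reserve(struct rxbuf *b, size_t want)
{
    if (RX_CAP - b->end < want && b->start > 0) {
        memmove(b->data, b->data + b->start, rx_pending(b));
        b->end -= b->start;
        b->start = 0;
    }
    return RX_CAP - b->end;
}

/* Append up to n bytes; returns how many were actually stored. */
static size_t rx_append(struct rxbuf *b, const void *src, size_t n)
{
    size_t room = rx_reserve(b, n);

    if (n > room)
        n = room;
    memcpy(b->data + b->end, src, n);
    b->end += n;
    return n;
}
```

In a real worker you would probably call `rx_reserve()` and then `read()` straight into `data + end` instead of going through `rx_append()`; the append helper just keeps the sketch easy to exercise without a socket.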
