How should I handle incomplete packet buffers?
I am writing a client for a server that typically sends data as strings of 500 bytes or fewer. However, the data will occasionally exceed that, and for all the client knows, a single set of data could contain 200,000 bytes (on initialization or significant events). I would rather not have each client running with a 50 MB socket buffer (if that is even possible).
Each set of data is delimited by a null (\0) character. What kind of structure should I look at for storing partially sent data sets?
For example, the server may send ABCDEFGHIJKLMNOPQRSTUV\0WXYZ\0123!\0. I would want to process ABCDEFGHIJKLMNOPQRSTUV, WXYZ, and 123! independently. Also, the server could send ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890LOL123HAHATHISISREALLYLONG without the terminating character. I would want that data set stored somewhere for later appending and processing.
Also, I'm using asynchronous socket methods (BeginSend, EndSend, BeginReceive, EndReceive) if that matters.
Currently I'm debating between List<Byte> and StringBuilder. Any comparison of the two for this situation would be very helpful.
Comments (4)
Read the data from the socket into a buffer. When you get the terminating character, turn what you have into a message and send it on its way to the rest of your code.
Also, remember that TCP is a byte stream, not a sequence of packets. So you should never assume that everything sent at one time will arrive in a single read.
As far as buffers go, you should probably only need one per connection at most. I'd probably start with the max size that you reasonably expect to receive, and if that fills, create a new buffer of a larger size. A typical strategy is to double the size when you run out, to avoid churning through too many allocations.
If you have multiple incoming connections, you may want to do something like create a pool of buffers, and just return "big" ones to the pool when done with them.
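The buffering strategy above could be sketched roughly as follows. This is only an illustration, not code from the answer: the class name NullDelimitedFramer and its Append method are hypothetical, and it assumes you call Append from your EndReceive callback with the receive array and the byte count that EndReceive returned.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not from the thread): accumulates received bytes and
// hands back complete '\0'-terminated messages. Use one instance per connection.
public sealed class NullDelimitedFramer
{
    private byte[] _buffer;
    private int _length; // number of valid bytes currently held in _buffer

    public NullDelimitedFramer(int initialCapacity = 512)
    {
        _buffer = new byte[initialCapacity];
    }

    public int Capacity => _buffer.Length;

    // Call with the bytes delivered by a single receive.
    public List<byte[]> Append(byte[] data, int count)
    {
        // Grow by doubling until the new bytes fit.
        if (_length + count > _buffer.Length)
        {
            int newSize = Math.Max(_buffer.Length, 1);
            while (newSize < _length + count) newSize *= 2;
            Array.Resize(ref _buffer, newSize);
        }
        Buffer.BlockCopy(data, 0, _buffer, _length, count);
        int scanFrom = _length; // bytes before this point hold no terminator
        _length += count;

        var messages = new List<byte[]>();
        int start = 0;
        for (int i = scanFrom; i < _length; i++)
        {
            if (_buffer[i] != 0) continue;
            var message = new byte[i - start];
            Buffer.BlockCopy(_buffer, start, message, 0, message.Length);
            messages.Add(message);
            start = i + 1;
        }

        // Keep any trailing partial message at the front for the next call.
        _length -= start;
        Buffer.BlockCopy(_buffer, start, _buffer, 0, _length);
        return messages;
    }
}
```

With this shape, a read that ends mid-message simply returns fewer (or zero) messages, and the leftover bytes are completed by a later call. The buffer never shrinks in this sketch; returning oversized buffers to a pool, as suggested above, is left out.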
You could just use a List<byte> as your buffer, so the .NET framework takes care of automatically expanding it as needed. When you find a null terminator you can use List.RemoveRange() to remove that message from the buffer and pass it to the next layer up.
You'd probably want to add a check and throw an exception if it exceeds a certain length, rather than just wait until the client runs out of memory.
(This is very similar to Ben S's answer, but I think a byte array is a bit more robust than a StringBuilder in the face of encoding issues. Decoding bytes to a string is best done higher up, once you have a complete message.)
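A minimal sketch of that idea, with the length check included, might look like this. The class name ListFramer, the maxPendingBytes parameter, and the choice of InvalidDataException are assumptions made for illustration; only List<byte>, IndexOf, GetRange, and RemoveRange are the actual framework pieces.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (not from the thread): List<byte> as the pending buffer.
public sealed class ListFramer
{
    private readonly List<byte> _pending = new List<byte>();
    private readonly int _maxPendingBytes; // assumed limit, chosen by the caller

    public ListFramer(int maxPendingBytes)
    {
        _maxPendingBytes = maxPendingBytes;
    }

    // Call with the bytes delivered by a single receive.
    public List<byte[]> Append(byte[] data, int count)
    {
        for (int i = 0; i < count; i++) _pending.Add(data[i]);

        var messages = new List<byte[]>();
        int end;
        while ((end = _pending.IndexOf((byte)0)) >= 0)
        {
            // Copy the message out, then drop it and its terminator.
            messages.Add(_pending.GetRange(0, end).ToArray());
            _pending.RemoveRange(0, end + 1);
        }

        // Only the unterminated remainder is checked here; a stricter
        // version could also reject oversized complete messages.
        if (_pending.Count > _maxPendingBytes)
            throw new InvalidDataException("Unterminated message exceeds the allowed size.");

        return messages;
    }
}
```

Note that RemoveRange at the front of the list shifts the remaining elements down, which is cheap for the typical small messages but worth keeping in mind for very large ones. The messages stay as byte[] so that decoding can happen in the next layer up, as the answer recommends.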
I would just use a StringBuilder and read in one character at a time, copying and emptying the builder whenever I hit a null terminator.
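For comparison, the StringBuilder approach could be sketched as below. The class name StringBuilderFramer is hypothetical, and the sketch assumes the server sends single-byte ASCII text; with a multi-byte encoding, a character split across two reads would be decoded incorrectly by this per-byte cast.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper (not from the thread): builds messages one character at a time.
public sealed class StringBuilderFramer
{
    private readonly StringBuilder _current = new StringBuilder();

    // Call with the bytes delivered by a single receive.
    // ASSUMPTION: every byte is exactly one ASCII character.
    public List<string> Append(byte[] data, int count)
    {
        var messages = new List<string>();
        for (int i = 0; i < count; i++)
        {
            if (data[i] == 0)
            {
                // Terminator reached: copy the text out and empty the builder.
                messages.Add(_current.ToString());
                _current.Clear();
            }
            else
            {
                _current.Append((char)data[i]);
            }
        }
        return messages;
    }
}
```

Any text received after the last terminator stays in the builder until a later call completes it.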
I wrote this answer regarding Java sockets, but the concept is the same:
What's the best way to monitor a socket for new data and then process that data?