Sending large chunks of data over Boost TCP?

Posted on 2024-08-08 14:45:42

I have to send mesh data via TCP from one computer to another... These meshes can be rather large. I'm having a tough time thinking about what the best way to send them over TCP will be as I don't know much about network programming.

Here is my basic class structure that I need to fit into buffers to be sent via TCP:

class PrimitiveCollection
{
    std::vector<Primitive*> primitives;
};

class Primitive 
{
    PRIMTYPES primType; // PRIMTYPES is just an enum with values for fan, strip, etc...
    unsigned int numVertices;
    std::vector<Vertex*> vertices;
};


class Vertex
{
    float X;
    float Y;
    float Z;
    float XNormal;
    float ZNormal;
};

I'm using the Boost library and their TCP stuff... it is fairly easy to use. You can just fill a buffer and send it off via TCP.
However, of course this buffer can only be so big and I could have up to 2 megabytes of data to send.

So what would be the best way to get the above class structure into the buffers needed and sent over the network? I would also need to deserialize on the receiving end.

Any guidance in this would be much appreciated.

EDIT: I realize after reading this again that this really is a more general problem that is not specific to Boost... It's more a problem of chunking the data and sending it. However, I'm still interested to see if Boost has anything that can abstract this away somewhat.

一城柳絮吹成雪 2024-08-15 14:45:42

Have you tried it with Boost's TCP? I don't see why 2 MB would be a problem to transfer. I'm assuming we're talking about a LAN running at 100 Mbps or 1 Gbps, computers with plenty of RAM, and that you don't have to have > 20 ms response times? If your goal is just to get all 2 MB from one computer to another, just send it; TCP will handle chunking it up for you.

I have a TCP latency-checking tool that I wrote with Boost. It sends buffers of various sizes; I routinely test up to 20 MB, and those seem to get through without problems.

I guess what I'm trying to say is don't spend your time developing a solution unless you know you have a problem :-)
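
To illustrate what "just send it" means in practice, here is a minimal sketch in plain C++. A single low-level write on a stream socket may accept only part of a buffer, so the usual pattern is to loop until everything has been handed over; in Boost.Asio the free function boost::asio::write does this looping for you, whereas socket.write_some may return early. The names write_all and FakeSocket are made up for this example, and FakeSocket only imitates a socket that takes at most 1460 bytes per call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Calls write_some(ptr, len) repeatedly until all `size` bytes are accepted.
// write_some is a hypothetical callable that returns how many bytes it took
// (at least 1, at most len), like a partial write on a stream socket.
template <typename WriteSome>
std::size_t write_all(WriteSome&& write_some, const char* data, std::size_t size)
{
    std::size_t sent = 0;
    while (sent < size) {
        std::size_t n = write_some(data + sent, size - sent);
        assert(n > 0 && n <= size - sent); // a real socket would report an error here
        sent += n;
    }
    return sent;
}

// Stand-in for a socket (assumption): accepts at most 1460 bytes per call
// and records everything it was given.
struct FakeSocket {
    std::vector<char> received;
    std::size_t calls = 0;

    std::size_t operator()(const char* p, std::size_t len)
    {
        std::size_t n = len < 1460 ? len : 1460;
        received.insert(received.end(), p, p + n);
        ++calls;
        return n;
    }
};
```

Given a 2 MB std::vector<char> named mesh and a FakeSocket named sock, write_all(sock, mesh.data(), mesh.size()) returns only once every byte has been handed over, however many small pieces that takes. The caller never has to chunk the data itself.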

--------- Solution Implementation --------

Now that I've had a few minutes on my hands, I went through and made a quick implementation of what you were talking about: https://github.com/teeks99/data-chunker

There are three big parts:

The serializer/deserializer. Boost has its own, but it's not much better than rolling your own, so I did.

Sender - Connects to the receiver over TCP and sends the data

Receiver - Waits for connections from the sender and unpacks the data it receives.

I've included the .exe files in the zip; run Sender.exe --help or Receiver.exe --help to see the options, or just look at main.

More detailed explanation:
Open two command prompts, and go to DataChunker\Debug in both of them.
Run Receiver.exe in one of them.
Run Sender.exe in the other one (possibly on a different computer, in which case add --remote-host=IP.ADD.RE.SS after the executable name; if you want to try sending more than once, add --num-sends=10 to send ten times).
Looking at the code, you can see what's going on: the receiver and sender ends of the TCP socket are created in their respective main() functions. The sender creates a new PrimitiveCollection, fills it with some example data, then serializes and sends it. The receiver deserializes the data into a new PrimitiveCollection, at which point the collection could be used by something else, but I just write to the console that it's done.
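
For readers who don't want to open the repository, here is a rough sketch of what a hand-rolled serializer for this kind of structure can look like. It is not the code from the linked project. It assumes simplified value-type versions of the question's classes (no pointers, public members), 32-bit IEEE-754 floats on both machines, and a made-up wire layout: a count of primitives, then for each primitive its type, its vertex count, and five floats per vertex, all as 4-byte big-endian values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <initializer_list>
#include <vector>

// Simplified stand-ins for the question's classes (assumption).
struct Vertex { float X, Y, Z, XNormal, ZNormal; };
struct Primitive { std::uint32_t primType; std::vector<Vertex> vertices; };
struct PrimitiveCollection { std::vector<Primitive> primitives; };

using Buffer = std::vector<std::uint8_t>;

// Append a 32-bit value in big-endian order, whatever the host byte order is.
void put_u32(Buffer& out, std::uint32_t v)
{
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<std::uint8_t>(v >> shift));
}

// A float travels as its 32-bit bit pattern.
void put_f32(Buffer& out, float f)
{
    static_assert(sizeof(float) == 4, "this sketch expects 32-bit floats");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    put_u32(out, bits);
}

std::uint32_t get_u32(const Buffer& in, std::size_t& pos)
{
    assert(pos + 4 <= in.size()); // real code should reject truncated input
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v = (v << 8) | in[pos++];
    return v;
}

float get_f32(const Buffer& in, std::size_t& pos)
{
    std::uint32_t bits = get_u32(in, pos);
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

Buffer serialize(const PrimitiveCollection& pc)
{
    Buffer out;
    put_u32(out, static_cast<std::uint32_t>(pc.primitives.size()));
    for (const Primitive& p : pc.primitives) {
        put_u32(out, p.primType);
        put_u32(out, static_cast<std::uint32_t>(p.vertices.size()));
        for (const Vertex& v : p.vertices)
            for (float f : {v.X, v.Y, v.Z, v.XNormal, v.ZNormal})
                put_f32(out, f);
    }
    return out;
}

PrimitiveCollection deserialize(const Buffer& in)
{
    PrimitiveCollection pc;
    std::size_t pos = 0;
    std::uint32_t primCount = get_u32(in, pos);
    for (std::uint32_t i = 0; i < primCount; ++i) {
        Primitive p;
        p.primType = get_u32(in, pos);
        std::uint32_t vertexCount = get_u32(in, pos);
        for (std::uint32_t j = 0; j < vertexCount; ++j) {
            Vertex v;
            v.X = get_f32(in, pos);
            v.Y = get_f32(in, pos);
            v.Z = get_f32(in, pos);
            v.XNormal = get_f32(in, pos);
            v.ZNormal = get_f32(in, pos);
            p.vertices.push_back(v);
        }
        pc.primitives.push_back(p);
    }
    return pc;
}
```

The sender would call serialize() and write the resulting buffer to the socket; the receiver would collect the bytes and pass them to deserialize(). Production code should validate the counts against the remaining input instead of asserting.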

Edit: Moved the example to github.

披肩女神 2024-08-15 14:45:42

Nothing fancy here, just what I remember from my networking class:

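Send a message to the receiver asking what size of data chunks it can handle.
Take the minimum of that and your own sending capability, then reply saying:
- what chunk size you will be sending, and how many chunks
Once that is settled, just send each chunk. You'll want to wait for an "Ok" reply, so you know you're not wasting time sending to a client that isn't there. This is also a good time for the client to send an "I'm cancelling" message instead of "Ok".
Keep sending until every packet has been answered with an "Ok".
The data is transferred.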

This works because TCP guarantees in-order delivery. UDP would require packet numbers (for ordering).
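
The steps above can be sketched in plain C++ roughly as follows. This is only an illustration of the negotiation and the chunk loop, not a networking implementation: ChunkPlan, plan_chunks, send_in_chunks, and the send_chunk callback are names invented for this example, and send_chunk stands in for "transmit one chunk and wait for the peer's reply".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct ChunkPlan {
    std::size_t chunkSize;   // agreed size of every chunk except possibly the last
    std::size_t chunkCount;  // number of chunks the receiver should expect
};

// Agree on the smaller of the two limits and work out how many chunks that
// makes, so the sender can announce size and count before sending.
ChunkPlan plan_chunks(std::size_t totalBytes, std::size_t senderMax, std::size_t receiverMax)
{
    assert(senderMax > 0 && receiverMax > 0);
    std::size_t chunkSize = std::min(senderMax, receiverMax);
    std::size_t chunkCount = (totalBytes + chunkSize - 1) / chunkSize; // round up
    return ChunkPlan{chunkSize, chunkCount};
}

// Send the chunks in order. send_chunk is a hypothetical callable that
// transmits one chunk and returns true once the peer answered "Ok", or
// false if it cancelled. Returns the number of acknowledged chunks.
template <typename SendChunk>
std::size_t send_in_chunks(const std::vector<char>& data, const ChunkPlan& plan,
                           SendChunk&& send_chunk)
{
    std::size_t acked = 0;
    for (std::size_t offset = 0; offset < data.size(); offset += plan.chunkSize) {
        std::size_t len = std::min(plan.chunkSize, data.size() - offset);
        if (!send_chunk(data.data() + offset, len))
            break; // peer cancelled, so stop sending
        ++acked;
    }
    return acked;
}
```

For a 2 MB payload where the sender can handle 64 KB and the receiver 16 KB, plan_chunks settles on 16 KB chunks and 128 of them. The transfer is complete when send_in_chunks returns plan.chunkCount.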

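Compression works the same way, except that you're sending compressed data. (Data is data; it all depends on how you interpret it.) Just make sure you communicate how the data is compressed :)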

As for examples, all I could dig up was this page and this old question. I think what you're doing could work alongside Boost.Serialization.

一萌ing 2024-08-15 14:45:42

I'd like to add one more point to consider: setting the TCP socket buffer size can improve socket performance to some extent.

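There is a utility, Iperf, that can test the exchange speed over a TCP socket. I ran a few tests on Windows in a 100 Mbit/s LAN. With the default 8 KB TCP window size the speed was 89 Mbits/sec, and with a 64 KB TCP window size it was 94 Mbits/sec.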
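
As a rough sketch of how the buffer size can be set from code, the following uses the POSIX socket API directly (an assumption: the tests above were on Windows, where the Winsock call looks similar but is not shown here). Boost.Asio exposes the same setting through the boost::asio::socket_base::send_buffer_size and receive_buffer_size options passed to socket.set_option(). The helper name set_send_buffer_size is made up for this example.

```cpp
#include <cassert>
#include <sys/socket.h>

// Ask the OS for a send buffer of `bytes` on socket `fd` and return the size
// it actually granted (the kernel may round, cap, or double the request),
// or -1 on error.
int set_send_buffer_size(int fd, int bytes)
{
    assert(bytes > 0);
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof bytes) != 0)
        return -1;

    int granted = 0;
    socklen_t len = sizeof granted;
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &granted, &len) != 0)
        return -1;
    return granted;
}
```

Calling set_send_buffer_size(fd, 64 * 1024) on a TCP socket before connecting requests a 64 KB buffer; reading the value back is worthwhile because the granted size is up to the operating system. Whether it helps is something to measure on your own network, as with the Iperf numbers above.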

梦里°也失望 2024-08-15 14:45:42

In addition to how to chunk and deliver the data, another issue you should consider is platform differences. If the two computers have the same architecture, and the code running on both sides is built with the same version of the same compiler, then you should probably be able to just dump the raw memory structure across the network and have it work on the other side. If everything isn't the same, though, you can run into problems with endianness, structure padding, field alignment, etc.

In general, it's good to define a network format for the data separately from your in-memory representation. That format can be binary, in which case numeric values should be converted to standard forms (mainly, changing endianness to "network order", which is big-endian), or it can be textual. Many network protocols opt for text because it eliminates a lot of formatting issues and because it makes debugging easier. Personally, I really like JSON. It's not too verbose, there are good libraries available for every programming language, and it's really easy for humans to read and understand.
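
A small sketch of the binary option, to make the padding and byte-order points concrete. PrimitiveHeader and its 5-byte wire layout are invented for this example: in memory the compiler will typically pad the struct to 8 bytes and store the count in host byte order, whereas the wire format is written out field by field, with the count in big-endian order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-primitive header. sizeof(PrimitiveHeader) is often 8
// because of padding, so its raw bytes are not a portable format.
struct PrimitiveHeader {
    std::uint8_t  primType;
    std::uint32_t numVertices;
};

// Wire format: 1 byte type, then the count as 4 bytes, most significant first.
constexpr std::size_t kWireSize = 5;

void encode(const PrimitiveHeader& h, std::vector<std::uint8_t>& out)
{
    out.push_back(h.primType);
    out.push_back(static_cast<std::uint8_t>(h.numVertices >> 24));
    out.push_back(static_cast<std::uint8_t>(h.numVertices >> 16));
    out.push_back(static_cast<std::uint8_t>(h.numVertices >> 8));
    out.push_back(static_cast<std::uint8_t>(h.numVertices));
}

PrimitiveHeader decode(const std::vector<std::uint8_t>& in)
{
    assert(in.size() >= kWireSize); // real code should reject short input
    PrimitiveHeader h;
    h.primType = in[0];
    h.numVertices = (static_cast<std::uint32_t>(in[1]) << 24) |
                    (static_cast<std::uint32_t>(in[2]) << 16) |
                    (static_cast<std::uint32_t>(in[3]) << 8) |
                     static_cast<std::uint32_t>(in[4]);
    return h;
}
```

Because the shifts operate on values rather than on memory, the same code produces the same bytes on little-endian and big-endian hosts, and no padding bytes ever reach the network.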

One of the key issues to consider when defining your network protocol is how the receiver knows when it has received all of the data. There are two basic approaches. First, you can send an explicit size at the beginning of the message, so the receiver knows to keep reading until it has gotten that many bytes. The other is to use some sort of end-of-message delimiter. The latter has the advantage that you don't have to know in advance how many bytes you're sending, but the disadvantage that you have to figure out how to make sure the end-of-message delimiter can't appear in the message.
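
Here is a minimal sketch of the first approach, the explicit size. It assumes a 4-byte big-endian length prefix, which is a common choice but only one of many; frame and try_unframe are names made up for this example. The receiver appends whatever bytes arrive to a buffer and calls try_unframe until it reports that no complete message is available yet.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Prefix the payload with its length as a 4-byte big-endian integer.
std::string frame(const std::string& payload)
{
    assert(payload.size() <= 0xFFFFFFFFu); // must fit in the 4-byte prefix
    std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    std::string out;
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<char>((n >> shift) & 0xFF));
    out += payload;
    return out;
}

// Try to extract one complete message from the front of `stream`, which
// accumulates the bytes received so far. Returns false and leaves `stream`
// untouched while the message is still incomplete.
bool try_unframe(std::string& stream, std::string& message)
{
    if (stream.size() < 4)
        return false; // length prefix not complete yet
    std::uint32_t n = 0;
    for (int i = 0; i < 4; ++i)
        n = (n << 8) | static_cast<unsigned char>(stream[i]);
    if (stream.size() - 4 < n)
        return false; // payload not complete yet
    message = stream.substr(4, n);
    stream.erase(0, 4 + static_cast<std::size_t>(n));
    return true;
}
```

This matters with TCP because it is a byte stream: one read may deliver half a message or several messages at once, and the length prefix is what lets the receiver find the boundaries again.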

Once you decide how the data should be structured as it's flowing across the network, then you should figure out a way to convert the internal representation to that format, ideally in a "streaming" way, so you can loop through your data structure, converting each piece of it to network format and writing it to the network socket.

On the receiving side, you just reverse the process, decoding the network format to the appropriate in-memory format.

My recommendation for your case is to use JSON. 2 MB is not a lot of data, so the overhead of generating and parsing won't be large, and you can easily represent your data structure directly in JSON. The resulting text will be self-delimiting, human-readable, easy to stream, and easy to parse back into memory on the destination side.
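
To give an idea of the "streaming" conversion with JSON, here is a small hand-written sketch that writes the mesh to any std::ostream piece by piece. The JSON layout (a "primitives" array whose entries hold "primType" and a "vertices" array of five-number arrays) is just one possible choice, and the structs are simplified value-type versions of the question's classes. In practice you would more likely use an existing JSON library, especially for parsing on the receiving side.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <ostream>
#include <sstream>
#include <vector>

// Simplified stand-ins for the question's classes (assumption).
struct Vertex { float X, Y, Z, XNormal, ZNormal; };
struct Primitive { int primType; std::vector<Vertex> vertices; };
struct PrimitiveCollection { std::vector<Primitive> primitives; };

// JSON has no representation for NaN or infinity, so refuse them here.
void write_number(std::ostream& os, float f)
{
    assert(std::isfinite(f));
    os << f;
}

// Write the collection one piece at a time; the full document never has to
// exist in memory on the sending side.
void write_json(std::ostream& os, const PrimitiveCollection& pc)
{
    // Enough significant digits for a float to survive the round trip.
    os.precision(std::numeric_limits<float>::max_digits10);
    os << "{\"primitives\":[";
    for (std::size_t i = 0; i < pc.primitives.size(); ++i) {
        const Primitive& p = pc.primitives[i];
        if (i) os << ',';
        os << "{\"primType\":" << p.primType << ",\"vertices\":[";
        for (std::size_t j = 0; j < p.vertices.size(); ++j) {
            const Vertex& v = p.vertices[j];
            if (j) os << ',';
            os << '[';
            write_number(os, v.X);       os << ',';
            write_number(os, v.Y);       os << ',';
            write_number(os, v.Z);       os << ',';
            write_number(os, v.XNormal); os << ',';
            write_number(os, v.ZNormal);
            os << ']';
        }
        os << "]}";
    }
    os << "]}";
}
```

Writing into a std::ostringstream is handy for testing. The sketch relies on the stream using the default "C" locale so that the decimal separator is a dot.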
