C: Serialization techniques
I'm writing some code to serialize data so I can send it over the network. Currently, I use this primitive procedure:
- create a void* buffer
- apply any byte-ordering operations, such as the hton family, to the data I want to send over the network
- use memcpy to copy the memory into the buffer
- send the memory over the network
The problem is that with various data structures (which often contain void* data, so you don't know whether you need to care about byte ordering), the code becomes really bloated with serialization code that is very specific to each data structure and can't be reused at all.
What are some good serialization techniques for C that make this easier / less ugly?
Note: I'm bound to a specific protocol, so I cannot freely choose how to serialize my data.
Comments (6)
For each data structure, have a serialize_X function (where X is the struct name) which takes a pointer to an X and a pointer to an opaque buffer structure and calls the appropriate serializing functions. You should supply some primitives such as serialize_int which write to the buffer and update the output index.
The primitives will have to call something like reserve_space(N) before writing any data, where N is the number of bytes required. reserve_space() will realloc the void* buffer to make it at least as big as its current size plus N bytes.
To make this possible, the buffer structure will need to contain a pointer to the actual data, the index to write the next byte to (output index) and the size that is allocated for the data.
With this system, all of your serialize_X functions should be pretty straightforward, for example:
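The listing that belongs here did not survive in this copy of the answer. As a stand-in, here is a minimal sketch of what a serialize_X function could look like. The struct X fields, the put_bytes helper, the int return codes, and the length-prefixed string format are all assumptions made for illustration; the real wire format is dictated by the protocol in the question. htonl comes from the POSIX header <arpa/inet.h>.

```c
#include <arpa/inet.h>  /* htonl (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct Buffer { unsigned char *data; size_t next; size_t size; };

/* Hypothetical helper: append n raw bytes, growing the buffer as needed. */
static int put_bytes(struct Buffer *b, const void *src, size_t n) {
    assert(b != NULL);
    if (b->next + n > b->size) {
        size_t new_size = b->size ? b->size : 32;
        while (new_size < b->next + n) new_size *= 2;
        unsigned char *p = realloc(b->data, new_size);
        if (p == NULL) return -1;
        b->data = p;
        b->size = new_size;
    }
    if (n > 0) memcpy(b->data + b->next, src, n);
    b->next += n;
    return 0;
}

/* Primitive: 32-bit integer in network byte order. */
static int serialize_int(int32_t x, struct Buffer *b) {
    uint32_t wire = htonl((uint32_t)x);
    return put_bytes(b, &wire, sizeof wire);
}

/* Primitive (assumed format): 32-bit length, then the bytes without the NUL. */
static int serialize_string(const char *s, struct Buffer *b) {
    size_t len = strlen(s);
    if (serialize_int((int32_t)len, b) != 0) return -1;
    return put_bytes(b, s, len);
}

/* Example structure (hypothetical). */
struct X { int32_t n, m; char *string; };

/* The per-structure function only lists the fields in wire order. */
int serialize_X(const struct X *x, struct Buffer *output) {
    if (serialize_int(x->n, output) != 0) return -1;
    if (serialize_int(x->m, output) != 0) return -1;
    return serialize_string(x->string, output);
}
```

The point of the pattern is that serialize_X contains no byte-order or memory-management logic of its own; all of that lives in the primitives, so each new structure costs only a few lines.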
And the framework code will be something like:
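The framework listing is also missing from this copy. A sketch of it, following the description above (data pointer, output index, allocated size, and a reserve_space that grows the buffer), might look like this. INITIAL_SIZE, the doubling growth policy, the error return values, and free_buffer are assumptions added here, not necessarily part of the answer's code.

```c
#include <assert.h>
#include <stdlib.h>

#define INITIAL_SIZE 32  /* arbitrary starting capacity (assumption) */

struct Buffer {
    unsigned char *data; /* the serialized bytes */
    size_t next;         /* output index: where the next byte is written */
    size_t size;         /* number of bytes currently allocated for data */
};

/* Allocate an empty buffer; returns NULL if allocation fails. */
struct Buffer *new_buffer(void) {
    struct Buffer *b = malloc(sizeof *b);
    if (b == NULL) return NULL;
    b->data = malloc(INITIAL_SIZE);
    if (b->data == NULL) {
        free(b);
        return NULL;
    }
    b->size = INITIAL_SIZE;
    b->next = 0;
    return b;
}

/* Ensure that `bytes` more bytes fit after the output index.
   Returns 0 on success, -1 if realloc fails (buffer left unchanged). */
int reserve_space(struct Buffer *b, size_t bytes) {
    assert(b != NULL);
    size_t needed = b->next + bytes;
    if (needed <= b->size) return 0;
    size_t new_size = b->size;
    while (new_size < needed) new_size *= 2; /* doubling keeps the number of reallocs logarithmic */
    unsigned char *p = realloc(b->data, new_size);
    if (p == NULL) return -1;
    b->data = p;
    b->size = new_size;
    return 0;
}

/* Release the buffer and its data. */
void free_buffer(struct Buffer *b) {
    if (b != NULL) {
        free(b->data);
        free(b);
    }
}
```

Assigning the result of realloc to a temporary first avoids losing the original block when the reallocation fails.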
From this, it should be pretty simple to implement all of the serialize_() functions you need.
EDIT:
For example:
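The example listing is missing here as well. A sketch of one primitive, serialize_int, is given below under a few assumptions: the protocol wants a 32-bit big-endian integer, htonl is available from the POSIX header <arpa/inet.h>, and the function reports failure through its return value. The compact Buffer and reserve_space are repeated only so that the block compiles on its own.

```c
#include <arpa/inet.h>  /* htonl (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct Buffer { unsigned char *data; size_t next; size_t size; };

/* Grow the buffer if needed; returns 0 on success, -1 on failure. */
static int reserve_space(struct Buffer *b, size_t bytes) {
    size_t needed = b->next + bytes;
    if (needed <= b->size) return 0;
    size_t new_size = b->size ? b->size : 32;
    while (new_size < needed) new_size *= 2;
    unsigned char *p = realloc(b->data, new_size);
    if (p == NULL) return -1;
    b->data = p;
    b->size = new_size;
    return 0;
}

/* Append a 32-bit integer in network (big-endian) byte order. */
int serialize_int(int32_t x, struct Buffer *b) {
    assert(b != NULL);
    uint32_t wire = htonl((uint32_t)x);  /* fixed-width type, so the size on the wire is known */
    if (reserve_space(b, sizeof wire) != 0) return -1;
    memcpy(b->data + b->next, &wire, sizeof wire);
    b->next += sizeof wire;              /* advance the output index */
    return 0;
}
```

Using int32_t rather than plain int avoids depending on the platform's int width, which matters when the receiver may be a different machine.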
EDIT:
Also note that my code has some potential bugs. There is no provision for error handling and no function to free the Buffer after you're done so you'll have to do this yourself. I was just giving a demonstration of the basic architecture that I would use.
I would say: definitely don't try to implement serialization yourself. It has been done a zillion times, and you should use an existing solution, e.g. protobufs: https://github.com/protobuf-c/protobuf-c
It also has the advantage of being compatible with many other programming languages.
I suggest using a library.
As I was not happy with the existing ones, I created the Binn library to make our lives easier.
Here is an example of using it:
It would help if we knew what the protocol constraints are, but in general your options are pretty limited. If the data are such that you can make a union of each struct with a byte array of sizeof(struct) bytes, that might simplify things. From your description, though, it sounds like you have a more fundamental problem: if you're transferring pointers (you mention void * data), then those pointers are very unlikely to be valid on the receiving machine. Why would the data happen to appear at the same place in memory?
For C programs, there are not a lot of good options for "automatic" serialization. Before giving up, I suggest reviewing the SUNRPC package (rpcgen and friends). It has:
The protocol and code are Internet standards.
This library can help you.
https://github.com/souzomain/Packer
It's easy to use, and the code is clean to study.
Usage example: