C - Serialization techniques

Posted on 2024-11-07 05:42:22

I'm writing some code to serialize some data to send it over the network. Currently, I use this primitive procedure:

  1. create a void* buffer
  2. apply any byte ordering operations such as the hton family on the data I want to send over the network
  3. use memcpy to copy the memory into the buffer
  4. send the memory over the network
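
To make the steps concrete, here is a minimal sketch of steps 2 and 3 for a single message type. The struct reading layout, its field names, and pack_reading are hypothetical, made up for this illustration; they are not taken from any real protocol.

```c
#include <arpa/inet.h>  /* htonl, htons (POSIX) */
#include <stdint.h>
#include <string.h>

/* Hypothetical message: a 32-bit sensor id followed by a 16-bit value. */
struct reading {
    uint32_t sensor_id;
    uint16_t value;
};

#define READING_WIRE_SIZE 6

/* Convert each field to network byte order, then memcpy it into the
   caller's buffer. Returns the number of bytes written. */
size_t pack_reading(const struct reading *r, void *buf) {
    unsigned char *out = buf;
    uint32_t id = htonl(r->sensor_id);
    uint16_t value = htons(r->value);

    memcpy(out, &id, sizeof id);
    memcpy(out + sizeof id, &value, sizeof value);
    return sizeof id + sizeof value;
}
```

Every new struct needs another function like this one, which is where the bloat comes from.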

The problem is that with various data structures (which often contain void* data so you don't know whether you need to care about byte ordering) the code becomes really bloated with serialization code that's very specific to each data structure and can't be reused at all.

What are some good serialization techniques for C that make this easier / less ugly?

-

Note: I'm bound to a specific protocol so I cannot freely choose how to serialize my data.

Comments (6)

空袭的梦i 2024-11-14 05:42:22

For each data structure, have a serialize_X function (where X is the struct name) which takes a pointer to an X and a pointer to an opaque buffer structure and calls the appropriate serializing functions. You should supply some primitives such as serialize_int which write to the buffer and update the output index.
Before writing any data, each primitive will have to call something like reserve_space(N), where N is the number of bytes it is about to write. reserve_space() will realloc the void* buffer to make it at least as big as its current size plus N bytes.
To make this possible, the buffer structure will need to contain a pointer to the actual data, the index to write the next byte to (output index) and the size that is allocated for the data.
With this system, all of your serialize_X functions should be pretty straightforward, for example:

struct X {
    int n, m;
    char *string;
};

void serialize_X(struct X *x, struct Buffer *output) {
    serialize_int(x->n, output);
    serialize_int(x->m, output);
    serialize_string(x->string, output);
}

And the framework code will be something like:

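#include <stdlib.h>

#define INITIAL_SIZE 32

struct Buffer {
    void *data;   /* start of the allocated block */
    size_t next;  /* index of the next byte to write */
    size_t size;  /* number of bytes currently allocated */
};

struct Buffer *new_buffer(void) {
    struct Buffer *b = malloc(sizeof *b);

    b->data = malloc(INITIAL_SIZE);
    b->size = INITIAL_SIZE;
    b->next = 0;

    return b;
}

void reserve_space(struct Buffer *b, size_t bytes) {
    if ((b->next + bytes) > b->size) {
        /* keep doubling until the request fits; doubling keeps the
           number of reallocs at O(lg N) */
        size_t new_size = b->size * 2;
        while ((b->next + bytes) > new_size)
            new_size *= 2;
        b->data = realloc(b->data, new_size);
        b->size = new_size;
    }
}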

From this, it should be pretty simple to implement all of the serialize_() functions you need.

EDIT:
For example:

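#include <arpa/inet.h>  /* htonl */
#include <stdint.h>
#include <string.h>

void serialize_int(int x, struct Buffer *b) {
    /* htonl() works on a 32-bit value, so go through uint32_t
       rather than assuming int == long */
    uint32_t net = htonl((uint32_t)x);

    reserve_space(b, sizeof net);

    memcpy(((char *)b->data) + b->next, &net, sizeof net);
    b->next += sizeof net;
}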
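
serialize_string is used above but never shown. A minimal sketch follows, assuming the wire format is a 32-bit big-endian length followed by the raw bytes with no terminator. That framing is an assumption, so adjust it to whatever your protocol requires. Buffer and reserve_space are repeated here (with a failure return added) only so the block compiles on its own.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct Buffer {
    void *data;
    size_t next;
    size_t size;
};

/* Grow the buffer until `bytes` more bytes fit. Returns 0 on success,
   -1 if realloc fails (the buffer is left unchanged in that case). */
static int reserve_space(struct Buffer *b, size_t bytes) {
    size_t new_size = b->size ? b->size : 32;
    void *p;

    if (b->next + bytes <= b->size)
        return 0;
    while (b->next + bytes > new_size)
        new_size *= 2;
    p = realloc(b->data, new_size);
    if (p == NULL)
        return -1;
    b->data = p;
    b->size = new_size;
    return 0;
}

/* Assumed framing: 4-byte big-endian length, then the bytes of the
   string without its terminating NUL. Returns 0 on success, -1 on error. */
int serialize_string(const char *s, struct Buffer *b) {
    size_t len = strlen(s);
    unsigned char *out;

    if (len > UINT32_MAX || reserve_space(b, 4 + len) != 0)
        return -1;
    out = (unsigned char *)b->data + b->next;
    out[0] = (unsigned char)(len >> 24);
    out[1] = (unsigned char)(len >> 16);
    out[2] = (unsigned char)(len >> 8);
    out[3] = (unsigned char)len;
    memcpy(out + 4, s, len);
    b->next += 4 + len;
    return 0;
}
```

Writing the length byte by byte with shifts avoids any dependence on the host's byte order, so no htonl call is needed here.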

EDIT:
Also note that my code has some potential bugs. There is no provision for error handling and no function to free the Buffer after you're done so you'll have to do this yourself. I was just giving a demonstration of the basic architecture that I would use.
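
The two gaps mentioned here could be filled roughly as follows. This is a sketch rather than the answer's own code: new_buffer_checked and free_buffer are hypothetical names, and returning NULL on allocation failure is just one possible convention.

```c
#include <stdlib.h>

#define INITIAL_SIZE 32

struct Buffer {
    void *data;
    size_t next;
    size_t size;
};

/* Like new_buffer(), but returns NULL if either allocation fails. */
struct Buffer *new_buffer_checked(void) {
    struct Buffer *b = malloc(sizeof *b);

    if (b == NULL)
        return NULL;
    b->data = malloc(INITIAL_SIZE);
    if (b->data == NULL) {
        free(b);
        return NULL;
    }
    b->size = INITIAL_SIZE;
    b->next = 0;
    return b;
}

/* Release the data block and the Buffer itself; NULL is accepted. */
void free_buffer(struct Buffer *b) {
    if (b != NULL) {
        free(b->data);
        free(b);
    }
}
```

reserve_space would need the same treatment: assign the result of realloc to a temporary and report failure instead of overwriting b->data with NULL.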

涙—继续流 2024-11-14 05:42:22

I would say definitely don't try to implement serialization yourself. It's been done a zillion times and you should use an existing solution. e.g. protobufs: https://github.com/protobuf-c/protobuf-c

It also has the advantage of being compatible with many other programming languages.

暖风昔人 2024-11-14 05:42:22

I suggest using a library.

As I was not happy with the existing ones, I created the Binn library to make our lives easier.

Here is an example of using it:

  binn *obj;

  // create a new object
  obj = binn_object();

  // add values to it
  binn_object_set_int32(obj, "id", 123);
  binn_object_set_str(obj, "name", "Samsung Galaxy Charger");
  binn_object_set_double(obj, "price", 12.50);
  binn_object_set_blob(obj, "picture", picptr, piclen);

  // send over the network
  send(sock, binn_ptr(obj), binn_size(obj), 0);

  // release the buffer
  binn_free(obj);

七色彩虹 2024-11-14 05:42:22

It would help if we knew what the protocol constraints are, but in general your options are pretty limited. If the data allow it, you could put each struct in a union with a byte array of sizeof(struct) bytes, which might simplify things. From your description, though, it sounds like you have a more fundamental problem: if you're transferring pointers (you mention void * data), those pointers are very unlikely to be valid on the receiving machine. Why would the data happen to sit at the same place in memory there?
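
One common way around the pointer problem is to send what the pointer refers to, prefixed by its length, rather than the pointer value. The sketch below assumes a hypothetical struct blob and a 16-bit big-endian length prefix; neither comes from the question's protocol.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical payload: `data` points at `len` bytes owned by the sender. */
struct blob {
    void *data;
    uint16_t len;
};

/* Write len (big-endian) followed by the pointed-to bytes into out.
   Returns the number of bytes written, or 0 if out is too small.
   The pointer value itself never goes on the wire. */
size_t pack_blob(const struct blob *b, unsigned char *out, size_t cap) {
    size_t need = 2 + (size_t)b->len;

    if (need > cap)
        return 0;
    out[0] = (unsigned char)(b->len >> 8);
    out[1] = (unsigned char)(b->len & 0xff);
    if (b->len > 0)
        memcpy(out + 2, b->data, b->len);
    return need;
}
```

The receiver then allocates its own storage and copies len bytes into it. The payload is copied verbatim, so any multi-byte values inside it still need their own byte-order handling.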

横笛休吹塞上声 2024-11-14 05:42:22

For "C" programs, there are not a lot of good options for "automatic" serialization. Before "giving up", I suggest reviewing the SUNRPC package (rpcgen and friends). It has:

  • A custom format, the "XDR" language (basically a subset of "C"), for describing data structures.
  • RPC generation, which makes it possible to generate the client and server sides of the serialization automatically.
  • A runtime library, shipped with (almost) all UNIX environments.

Both the protocol and the encoding are Internet standards.
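
To give a feel for what ends up on the wire: XDR (RFC 4506) encodes an int as four big-endian bytes, and a string as a four-byte length followed by the bytes, zero-padded to a multiple of four. The sketch below hand-writes that encoding for illustration only. The xdr_sketch_* names are made up; in practice rpcgen and the runtime library's xdr_* routines do this for you.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode a 32-bit signed integer the XDR way: 4 bytes, big-endian. */
size_t xdr_sketch_put_int(unsigned char *out, int32_t v) {
    uint32_t u = (uint32_t)v;

    out[0] = (unsigned char)(u >> 24);
    out[1] = (unsigned char)(u >> 16);
    out[2] = (unsigned char)(u >> 8);
    out[3] = (unsigned char)u;
    return 4;
}

/* Encode a string the XDR way: length, bytes, then zero padding up to
   a multiple of 4. The caller must provide 4 + strlen(s) rounded up to
   a multiple of 4 bytes of space. Returns the number of bytes written. */
size_t xdr_sketch_put_string(unsigned char *out, const char *s) {
    size_t len = strlen(s);
    size_t padded = (len + 3) & ~(size_t)3;

    xdr_sketch_put_int(out, (int32_t)len);
    memcpy(out + 4, s, len);
    memset(out + 4 + len, 0, padded - len);
    return 4 + padded;
}
```

This only helps if your protocol happens to be XDR-compatible; for a fixed, different wire format the generated code cannot be used as-is.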

东北女汉子 2024-11-14 05:42:22

This library can help you.
https://github.com/souzomain/Packer

It's easy to use, and the code is clean and easy to study.

Usage example:

PPACKER protocol = packer_init();
packer_add_data(protocol, yourstructure, sizeof(yourstructure));
send(fd, protocol->buffer, protocol->offset, 0);
packer_free(protocol);