Getting a substruct out of a big struct in C

Posted 2024-07-21 04:46:13

I'm having a very big struct in an existing program. This struct includes a great number of bitfields.

I wish to save a part of it (say, 10 fields out of 150).

Example code I would use to save the subset:

typedef struct {int a;int b;char c;} bigstruct;
typedef struct {int a;char c;} smallstruct;
int save_smallstruct(const smallstruct *s); /* placeholder name: whatever routine writes a smallstruct */
void substruct(smallstruct *s,bigstruct *b) {
    s->a = b->a;
    s->c = b->c;
}
int save_struct(bigstruct *bs) {
    smallstruct s;
    substruct(&s,bs);
    return save_smallstruct(&s);
}

I'd also like choosing which fields to save to be painless, since I expect to change the selection every now and then. The naive approach above is fragile and hard to maintain: once it scales up to 20 fields, every change has to be made both in smallstruct and in the substruct function.

I thought of two better approaches. Unfortunately, both require an external, CIL-like tool to parse my structs.

The first approach is to generate the substruct function automatically. I would only write the definition of smallstruct, and a program would parse it and generate substruct from the fields it finds there.

The second approach is to build (with a C parser) meta-information about bigstruct, and then write a library that lets me access a specific field in the struct. It would be like an ad hoc implementation of Java's class reflection.

For example, assuming no alignment padding, for the struct

struct st {
    int a;
    char c1:5;
    char c2:3;
    long d;
};

I would generate the following meta-information:

int field2distance[] = {0,sizeof(int),sizeof(int),sizeof(int)+sizeof(char)};
int field2size[] = {sizeof(int),1,1,sizeof(long)};
int field2bitmask[] =  {0,0x1F,0xE0,0};
char *fieldNames[] = {"a","c1","c2","d"};

I'll get the i-th field with this function:

long getFieldData(void *strct,int i) {
    int distance = field2distance[i];
    int size = field2size[i];
    int bitmask = field2bitmask[i];
    void *ptr = ((char *)strct + distance);
    long result;
    switch (size) {
        case 1: //char
             result = *(char*)ptr;
             break;
        case 2: //short
             result = *(short*)ptr;
        ...
    }
    if (bitmask == 0) return result;
    return (result & bitmask) >> num_of_trailing_zeros(bitmask);
}

Both methods require extra work up front, but once the parser is in your makefile, changing the substruct is a breeze.

However, I'd rather do this without any external dependencies.

Does anyone have a better idea? Are my ideas any good, and is there an available implementation of them on the internet?

满地尘埃落定 2024-07-28 04:46:14

From your description, it looks like you have access to and can modify your original structure. I suggest you refactor your substructure into a complete type (as you did in your example), and then make that structure a field on your big structure, encapsulating all of those fields in the original structure into the smaller structure.

Expanding on your small example:

typedef struct 
{
  int a;
  char c;
} smallstruct;

typedef struct 
{
  int b;
  smallstruct mysub;
} bigstruct;

Accessing the smallstruct info would be done like so:

/* stack-based allocation */
bigstruct mybig;
mybig.mysub.a = 1;
mybig.mysub.c = '1';
mybig.b = 2;

/* heap-based allocation */
bigstruct * mybig = (bigstruct *)malloc(sizeof(bigstruct));
mybig->mysub.a = 1;
mybig->mysub.c = '1';
mybig->b = 2;

But you could also pass around pointers to the small struct:

void dosomething(smallstruct * small)
{ 
  small->a = 3;
  small->c = '3';
}

/* stack based */    
dosomething(&(mybig.mysub));

/* heap based */    
dosomething(&((*mybig).mysub));

Benefits:

No macros
No external dependencies
No memory-order casting hacks
Cleaner code that is easier to read and use
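
A minimal sketch of how the nested layout simplifies saving, assuming the subset is written as raw bytes with fwrite. The names save_subset and load_subset are hypothetical and not part of the answer above, and raw struct bytes are only readable by builds that use the same compiler and ABI:

```c
#include <assert.h>
#include <stdio.h>

typedef struct {
    int  a;
    char c;
} smallstruct;

typedef struct {
    int         b;
    smallstruct mysub;   /* everything worth saving lives in this one member */
} bigstruct;

/* Hypothetical helper: writes only the embedded subset to an open stream.
   Returns 0 on success, -1 on failure. */
int save_subset(const bigstruct *bs, FILE *out)
{
    assert(bs != NULL && out != NULL);
    return fwrite(&bs->mysub, sizeof bs->mysub, 1, out) == 1 ? 0 : -1;
}

/* Hypothetical helper: reads the subset back into an existing bigstruct,
   leaving the other members untouched. */
int load_subset(bigstruct *bs, FILE *in)
{
    assert(bs != NULL && in != NULL);
    return fread(&bs->mysub, sizeof bs->mysub, 1, in) == 1 ? 0 : -1;
}
```

Changing which fields get saved then means moving a member between the two struct definitions; neither helper needs to be touched.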

冬天的雪花 2024-07-28 04:46:14

If changing the order of the fields isn't out of the question, you can rearrange the bigstruct fields so that the smallstruct fields sit together, and then it's simply a matter of casting from one to the other (possibly adding an offset).
Something like:

typedef struct {int a;char c;int b;} bigstruct;
typedef struct {int a;char c;} smallstruct;
int save_smallstruct(const smallstruct *s); /* placeholder name: whatever routine writes a smallstruct */

int save_struct(bigstruct *bs) {
    return save_smallstruct((smallstruct *)bs);
}
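
The cast relies on both structs laying out their shared members identically, which the compiler will not check for you. A hedged variation, not the answer's own code: verify the offsets at compile time and copy the shared prefix with memcpy instead of reinterpreting the pointer. The substruct function below is illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { int a; char c; int b; } bigstruct;
typedef struct { int a; char c; } smallstruct;

/* Compile-time checks: the build fails if someone reorders the fields. */
_Static_assert(offsetof(bigstruct, a) == offsetof(smallstruct, a),
               "field a is no longer at the same offset");
_Static_assert(offsetof(bigstruct, c) == offsetof(smallstruct, c),
               "field c is no longer at the same offset");

/* Copies the leading bytes that both structs share, up to and including
   the last shared member (c). */
void substruct(smallstruct *s, const bigstruct *b)
{
    assert(s != NULL && b != NULL);
    memcpy(s, b, offsetof(smallstruct, c) + sizeof s->c);
}
```

This keeps the "fields sit together" idea but avoids accessing a bigstruct object through a smallstruct pointer.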

漫漫岁月 2024-07-28 04:46:14

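Macros are your friend.

One solution would be to move the big struct out into its own include file and then have a macro party.

Instead of defining the struct normally, come up with a selection of macros, such as BEGIN_STRUCTURE, END_STRUCTURE, NORMAL_FIELD, SUBSET_FIELD.

You can then include the file several times, redefining those macros for each pass. The first pass turns the definitions into a normal struct, with both kinds of field emitted as usual. The second defines NORMAL_FIELD as nothing and so creates your subset. The third generates the code that copies the subset fields over.

You end up with a single definition of the struct that lets you control which fields are in the subset and automatically creates suitable code for you.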
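
A minimal sketch of the idea, with one deliberate simplification: instead of including a separate file several times, the field list is a single macro (BIG_FIELDS, a made-up name) that takes the two field macros as arguments. DECLARE, SKIP and COPY are likewise hypothetical helper names, and the three fields come from the question's example.

```c
#include <assert.h>
#include <stddef.h>

/* The single definition of the fields. Each field is tagged as either
   belonging to the subset or not. */
#define BIG_FIELDS(NORMAL_FIELD, SUBSET_FIELD) \
    SUBSET_FIELD(int,  a)                      \
    NORMAL_FIELD(int,  b)                      \
    SUBSET_FIELD(char, c)

#define DECLARE(type, name) type name;
#define SKIP(type, name)
#define COPY(type, name)    dst->name = src->name;

/* Pass 1: every field becomes a member. */
typedef struct { BIG_FIELDS(DECLARE, DECLARE) } bigstruct;

/* Pass 2: normal fields expand to nothing, leaving only the subset. */
typedef struct { BIG_FIELDS(SKIP, DECLARE) } smallstruct;

/* Pass 3: one assignment is generated per subset field. */
void substruct(smallstruct *dst, const bigstruct *src)
{
    assert(dst != NULL && src != NULL);
    BIG_FIELDS(SKIP, COPY)
}
```

Moving a field in or out of the subset is then a one-word change in BIG_FIELDS. Bit-fields would need an extra macro variant that also carries the width, which this sketch leaves out.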

飘然心甜 2024-07-28 04:46:14

To help with collecting your metadata, have a look at the offsetof() macro, which also has the benefit of taking care of any padding you may have.
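
A rough sketch of an offsetof()-based field table, assuming a small example struct with ordinary members only. The names field_info, FIELD, st_fields and get_field are invented for illustration. Note that offsetof() cannot be applied to bit-field members, so the bit-fields from the question would still need separate handling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct st {
    int  a;
    long d;
    char tag;
};

/* One table row per ordinary (non-bit-field) member. */
struct field_info {
    const char *name;
    size_t      offset;   /* filled in by the compiler, padding included */
    size_t      size;
};

#define FIELD(type, member) \
    { #member, offsetof(type, member), sizeof(((type *)0)->member) }

static const struct field_info st_fields[] = {
    FIELD(struct st, a),
    FIELD(struct st, d),
    FIELD(struct st, tag),
};

/* Copies the raw bytes of field i into out.
   out must have room for st_fields[i].size bytes. */
void get_field(const struct st *s, size_t i, void *out)
{
    assert(s != NULL && out != NULL);
    assert(i < sizeof st_fields / sizeof st_fields[0]);
    memcpy(out, (const char *)s + st_fields[i].offset, st_fields[i].size);
}
```

Because the offsets and sizes come from the compiler, the table stays correct when the struct is reordered or the target platform changes.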

红颜悴 2024-07-28 04:46:14

I suggest taking this approach:

Curse the guy who wrote the big structure. Get a voodoo doll and have some fun.
Mark each field of the big structure that you need somehow (macro or comment or whatever).
Write a small tool which reads the header file and extracts the marked fields. If you use comments, you can give each field a priority or something similar to sort them by.
Write a new header file for the substructure (using a fixed header and footer).
Write a new C file which contains a function createSubStruct which takes a pointer to the big struct and returns a pointer to the substruct.
In the function, loop over the fields collected and emit ss.field = bs.field (i.e. copy the fields one by one).
Add the small tool to your makefile and add the new header and C source file to your build.

I suggest using gawk, or any scripting language you're comfortable with, as the tool; it should take about half an hour to build.

[EDIT] If you really want to try reflection (which I advise against; it's a whole lot of work to get that working in C), then the offsetof() macro is your friend. This macro returns the offset of a field in a structure (which is most often not the sum of the sizes of the fields before it). See this article.

[EDIT2] Don't write your own parser. Getting your own parser right will take months; I know, since I've written lots of parsers in my life. Instead, mark the parts of the original header file which need to be copied and then rely on the one parser which you know works: the one in your C compiler. Here are a couple of ideas for how to make this work:

struct big_struct {
    /**BEGIN_COPY*/
    int i;
    int j : 3;
    int k : 2;
    char * str;
    /**END_COPY*/
    ...
    struct x y; /**COPY_STRUCT*/
};

Just have your tool copy anything between /**BEGIN_COPY*/ and /**END_COPY*/.

Use special comments like /**COPY_STRUCT*/ to instruct your tool to generate a memcpy() instead of an assignment, etc.

This can be written and debugged in a few hours. Setting up a parser for C would take just as long before it did anything useful: at that point you would only have something that can read valid C, and you would still have to write the part that understands the C it has read and the part that does something useful with the data.
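
The answer recommends gawk or another scripting language for the tool. Purely as an illustration of how little logic the marker approach needs, here is a sketch of the extraction step in C. The function name extract_marked and the 255-character line limit are assumptions of this sketch, and it only handles the BEGIN_COPY/END_COPY pair, not COPY_STRUCT.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies every line that lies between a BEGIN_COPY and an END_COPY marker
   from the header text src into dst (capacity cap bytes).
   Returns the number of characters written, or -1 if dst is too small
   or a line is longer than 255 characters. */
int extract_marked(const char *src, char *dst, size_t cap)
{
    int copying = 0;
    size_t used = 0;

    assert(src != NULL && dst != NULL && cap > 0);
    dst[0] = '\0';

    while (*src != '\0') {
        const char *eol = strchr(src, '\n');
        size_t len = eol ? (size_t)(eol - src) + 1 : strlen(src);
        char line[256];

        if (len >= sizeof line)
            return -1;
        memcpy(line, src, len);
        line[len] = '\0';
        src += len;

        /* Marker lines switch copying on or off and are not copied. */
        if (strstr(line, "/**BEGIN_COPY*/")) { copying = 1; continue; }
        if (strstr(line, "/**END_COPY*/"))   { copying = 0; continue; }
        if (!copying)
            continue;

        if (used + len >= cap)
            return -1;
        memcpy(dst + used, line, len + 1);   /* includes the terminator */
        used += len;
    }
    return (int)used;
}
```

The extracted declarations can be pasted between a fixed header and footer to form the substruct definition; because they are copied verbatim, bit-field widths such as int j : 3 survive without the tool having to understand them.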
