Getting a substruct out of a big struct in C
I have a very big struct in an existing program. This struct includes a great number of bit-fields.
I wish to save a part of it (say, 10 fields out of 150).
Example code I would use to save the subset is:
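typedef struct { int a; int b; char c; } bigstruct;
typedef struct { int a; char c; } smallstruct;

/* hypothetical routine that writes out the small struct */
int save_smallstruct(const smallstruct *s);

/* copy the selected fields from the big struct into the small one */
void substruct(smallstruct *s, const bigstruct *b) {
    s->a = b->a;
    s->c = b->c;
}

int save_struct(bigstruct *bs) {
    smallstruct s;
    substruct(&s, bs);
    return save_smallstruct(&s);
}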
I also want selecting which part to save to be low-hassle, since I expect to change it every now and then. The naive approach shown above is fragile and hard to maintain: when scaling up to 20 different fields, you have to change the fields both in smallstruct and in the substruct function.
I thought of two better approaches. Unfortunately, both require me to use some external CIL-like tool to parse my structs.
The first approach is automatically generating the substruct function. I'll just define the smallstruct struct, and have a program that parses it and generates the substruct function according to the fields in smallstruct.
The second approach is building (with a C parser) meta-information about bigstruct, and then writing a library that allows me to access a specific field in the struct. It would be like an ad-hoc implementation of Java's class reflection.
For example, assuming no struct alignment, for the struct
struct st {
    int a;
    char c1:5;
    char c2:3;
    long d;
};
I'll generate the following meta-information:
int field2distance[] = {0, sizeof(int), sizeof(int), sizeof(int) + sizeof(char)};
int field2size[] = {sizeof(int), 1, 1, sizeof(long)};
int field2bitmask[] = {0, 0x1F, 0xE0, 0};
char *fieldNames[] = {"a", "c1", "c2", "d"};
I'll get the i-th field with this function:
long getFieldData(void *strct, int i) {
    int distance = field2distance[i];
    int size = field2size[i];
    int bitmask = field2bitmask[i];
    void *ptr = ((char *)strct + distance);
    long result;
    switch (size) {
    case 1: // char
        result = *(char *)ptr;
        break;
    case 2: // short
        result = *(short *)ptr;
    ...
    }
    if (bitmask == 0) return result;
    return (result & bitmask) >> num_of_trailing_zeros(bitmask);
}
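The function above relies on num_of_trailing_zeros, which is not shown. A minimal sketch of such a helper follows, assuming the mask fits in an unsigned int; extract_bits is a hypothetical wrapper name used only to illustrate the final mask-and-shift step. Keep in mind that the actual bit layout of bit-fields is implementation-defined, so masks like 0x1F and 0xE0 are assumptions about one particular compiler.

```c
/* Hypothetical helper: counts the zero bits below the lowest set bit.
   Returns 0 for an empty mask so callers never shift by a bogus amount. */
static int num_of_trailing_zeros(unsigned int mask) {
    int count = 0;
    if (mask == 0) return 0;
    while ((mask & 1u) == 0) {
        mask >>= 1;
        count++;
    }
    return count;
}

/* Hypothetical wrapper for the last two lines of getFieldData:
   a zero bitmask means "not a bit-field", so the raw value is returned. */
static long extract_bits(long raw, unsigned int bitmask) {
    if (bitmask == 0) return raw;
    return (long)(((unsigned long)raw & bitmask) >> num_of_trailing_zeros(bitmask));
}
```

With the masks from the example, extract_bits(byte, 0x1F) would yield the low five bits and extract_bits(byte, 0xE0) the high three bits of the shared byte.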
Both methods require extra work, but once the parser is in your makefile, changing the substruct is a breeze.
However, I'd rather do that without any external dependencies.
Does anyone have a better idea? Were my ideas any good, and is there some available implementation of them on the internet?
Comments (5)
From your description, it looks like you have access to and can modify your original structure. I suggest you refactor your substructure into a complete type (as you did in your example), and then make that structure a field on your big structure, encapsulating all of those fields in the original structure into the smaller structure.
Expanding on your small example:
Accessing the smallstruct info would be done like so:
But you could also pass around pointers to the small struct:
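The three steps above can be sketched roughly as follows. This is only an illustration under assumed names: the member name small and the functions get_a, sum_small and extract_small are hypothetical and do not come from the original answer.

```c
/* The subset that should be saved, as a complete type of its own. */
typedef struct {
    int  a;
    char c;
} smallstruct;

/* The big struct now embeds the subset as a single member
   instead of declaring a and c directly. */
typedef struct {
    smallstruct small;
    int b;
} bigstruct;

/* Accessing the smallstruct info goes through the embedded member. */
static int get_a(const bigstruct *bs) {
    return bs->small.a;
}

/* Code that only needs the subset can take a pointer to it. */
static int sum_small(const smallstruct *s) {
    return s->a + s->c;
}

/* Extracting the subset becomes a plain struct copy. */
static smallstruct extract_small(const bigstruct *bs) {
    return bs->small;
}
```

A caller would pass &bs.small to sum_small, and saving the subset needs no per-field copy code at all.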
Benefits:
If changing the order of the fields isn't out of the question, you can rearrange the bigstruct fields so that the smallstruct fields are adjacent, and then it's simply a matter of casting from one to the other (possibly adding an offset).
Something like:
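A minimal sketch of this idea, with some assumptions stated up front: the field names are made up, the members are plain (non-bit-field) members because offsetof cannot be applied to bit-fields, and memcpy is used in place of a pointer cast to stay clear of aliasing problems. The static assertion checks that the subset fields are laid out identically in both structs.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    int  a;
    char c;
} smallstruct;

typedef struct {
    int  b;
    /* subset fields, kept adjacent and in the same order as smallstruct */
    int  a;
    char c;
    long d;
} bigstruct;

/* The distance from a to c must match in both layouts,
   otherwise a block copy would put c in the wrong place. */
_Static_assert(offsetof(bigstruct, c) - offsetof(bigstruct, a)
                   == offsetof(smallstruct, c),
               "subset fields are laid out differently");

static void substruct(smallstruct *s, const bigstruct *bs) {
    /* start of the subset inside the big struct */
    const unsigned char *base =
        (const unsigned char *)bs + offsetof(bigstruct, a);
    /* copy from the first subset field through the end of the last one */
    memcpy(s, base, offsetof(smallstruct, c) + sizeof s->c);
}
```

The copy length stops at the end of the last subset field, so it never reads past the region that both structs share.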
Macros are your friend.
One solution would be to move the big struct out into its own include file and then have a macro party.
Instead of defining the structure normally, come up with a selection of macros, such as BEGIN_STRUCTURE, END_STRUCTURE, NORMAL_FIELD and SUBSET_FIELD.
You can then include the file a few times, redefining those macros for each pass. The first pass turns the definitions into a normal structure, with both types of field being output as normal. The second defines NORMAL_FIELD as nothing and creates your subset. The third creates the appropriate code to copy the subset fields over.
You'll end up with a single definition of the structure that lets you control which fields are in the subset and automatically creates suitable code for you.
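A compact sketch of the macro approach follows. To keep it in one block, it uses the X-macro list variant (one macro that takes the field macros as parameters) rather than including a separate file several times, so it is a related technique and not exactly the layout described above. BIGSTRUCT_FIELDS, DECLARE_FIELD, IGNORE_FIELD and COPY_FIELD are hypothetical names.

```c
/* Single definition of the structure. Each field is tagged as either
   a normal field or a subset field via the macro that wraps it. */
#define BIGSTRUCT_FIELDS(NORMAL_FIELD, SUBSET_FIELD) \
    SUBSET_FIELD(int,  a)                            \
    NORMAL_FIELD(int,  b)                            \
    SUBSET_FIELD(char, c)

#define DECLARE_FIELD(type, name) type name;
#define IGNORE_FIELD(type, name)
#define COPY_FIELD(type, name)    dst->name = src->name;

/* Pass 1: both kinds of field are declared, giving the full struct. */
typedef struct { BIGSTRUCT_FIELDS(DECLARE_FIELD, DECLARE_FIELD) } bigstruct;

/* Pass 2: normal fields expand to nothing, leaving only the subset. */
typedef struct { BIGSTRUCT_FIELDS(IGNORE_FIELD, DECLARE_FIELD) } smallstruct;

/* Pass 3: one assignment is generated per subset field. */
static void substruct(smallstruct *dst, const bigstruct *src) {
    BIGSTRUCT_FIELDS(IGNORE_FIELD, COPY_FIELD)
}
```

Moving a field in or out of the subset then means changing one word in the field list; both struct definitions and the copy function follow automatically. Bit-fields would need an extra width parameter in the field macros, which this sketch leaves out.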
Just to help you get your metadata, you can use the offsetof() macro, which also has the benefit of accounting for any padding you may have.
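As a rough sketch of how offsetof() could replace the hand-computed tables from the question, assuming a simplified struct without bit-fields (offsetof cannot be applied to a bit-field member): field_info, FIELD, st_fields and get_field_data are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

struct st {
    int  a;
    char c;
    long d;
};

/* One metadata record per field. */
typedef struct {
    const char *name;
    size_t      offset;
    size_t      size;
} field_info;

/* Let the compiler compute offset and size, padding included. */
#define FIELD(type, member) \
    { #member, offsetof(type, member), sizeof(((type *)0)->member) }

static const field_info st_fields[] = {
    FIELD(struct st, a),
    FIELD(struct st, c),
    FIELD(struct st, d),
};

/* Reads field i as a long; returns 0 for a size this sketch does not handle. */
static long get_field_data(const void *strct, size_t i) {
    const field_info *f = &st_fields[i];
    const unsigned char *ptr = (const unsigned char *)strct + f->offset;
    if (f->size == sizeof(char)) { char  v; memcpy(&v, ptr, sizeof v); return v; }
    if (f->size == sizeof(int))  { int   v; memcpy(&v, ptr, sizeof v); return v; }
    if (f->size == sizeof(long)) { long  v; memcpy(&v, ptr, sizeof v); return v; }
    return 0;
}
```

The reads go through memcpy so the code does not depend on the alignment of the computed address.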
I suggest taking this approach:
Generate a function createSubStruct which takes a pointer to the big struct and returns a pointer to the substruct.
In that function, emit ss.field = bs.field for each selected field (i.e. copy the fields one by one).
I suggest using gawk, or any scripting language you're comfortable with, as the tool; that should take half an hour to build.
[EDIT] If you really want to try reflection (which I advise against; it'll be a whole lot of work to get that working in C), then the offsetof() macro is your friend. This macro returns the offset of a field in a structure (which is most often not the sum of the sizes of the fields before it). See this article.
[EDIT2] Don't write your own parser. Getting your own parser right will take months; I know, since I've written lots of parsers in my life. Instead, mark the parts of the original header file which need to be copied, and then rely on the one parser which you know works: the one in your C compiler. Here are a couple of ideas for how to make this work:
Just have your tool copy anything between /**BEGIN_COPY*/ and /**END_COPY*/.
Use special comments like /**COPY_STRUCT*/ to instruct your tool to generate a memcpy() instead of an assignment, etc.
This can be written and debugged in a few hours. It would take as long to set up a parser for C without any functionality; that is, you'd just have something which can read valid C, but you'd still have to write the part of the parser which understands C and the part which does something useful with the data.
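The answer recommends a scripting language for the tool; purely to illustrate the marker idea in the same language as the rest of the thread, here is a small C sketch that copies everything between the two markers out of a header held in memory. copy_marked and its buffer-based interface are assumptions made for this example, not part of the original suggestion.

```c
#include <stddef.h>
#include <string.h>

#define BEGIN_MARK "/**BEGIN_COPY*/"
#define END_MARK   "/**END_COPY*/"

/* Appends to dst every region of src enclosed by the two markers.
   Output is truncated to fit dst_size and is always NUL-terminated.
   Returns the number of bytes written, not counting the NUL. */
static size_t copy_marked(const char *src, char *dst, size_t dst_size) {
    size_t used = 0;
    const char *begin;

    if (dst_size == 0) return 0;
    while ((begin = strstr(src, BEGIN_MARK)) != NULL) {
        const char *end;
        size_t len;

        begin += strlen(BEGIN_MARK);
        end = strstr(begin, END_MARK);
        if (end == NULL) break;              /* unterminated region: ignore it */
        len = (size_t)(end - begin);
        if (len > dst_size - 1 - used)       /* keep room for the NUL */
            len = dst_size - 1 - used;
        memcpy(dst + used, begin, len);
        used += len;
        src = end + strlen(END_MARK);
    }
    dst[used] = '\0';
    return used;
}
```

A real tool would wrap the copied text in a fixed struct header and footer and write it to a new header file; that part is omitted here.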