Towards a strictly compliant usage of container_of

Published 2025-02-01 18:00:30 · 2585 words · 3 views · 0 comments

The container_of macro and its WinAPI equivalent CONTAINING_RECORD are popular and useful. In principle, they use pointer arithmetic over char* to recover a pointer to the aggregate that contains the member a given pointer points to.

The minimalistic implementation is usually:

#define container_of(ptr, type, member) \
   (type*)((char*)(ptr) - offsetof(type, member))

However, the strict compliance of a usage pattern of this macro is debatable.
For example:

struct S {
    int a;
    int b;
};

int foo(void) {
    struct S s = { .a = 42 };
    int *p = &s.b;
    struct S *q = container_of(p, struct S, b);
    return q->a;
}

To my understanding, the program is not strictly compliant because:

  • the expression s.b is an l-value of type int
  • &s.b is a pointer. Its value may carry implementation-defined attributes,
    such as the size of the object it points to
  • (char*)&s.b does nothing special to any metadata bound to the value of the pointer
  • (char*)&s.b - offsetof(struct S, b) invokes UB, because the pointer arithmetic
    leaves the object (the int) that the pointer points to

I've noticed that the problem is not the container_of macro itself. It is rather the way
the ptr argument is constructed.
If the pointer were computed from an l-value of type struct S,
there would be no out-of-bounds arithmetic and hence no UB.
A potentially compliant version of the program would be:

int foo(void) {
    struct S s = { .a = 42 };
    int *p = (int*)((char*)&s + offsetof(struct S, b));
    struct S *q = container_of(p, struct S, b);
    return q->a;
}

The actual arithmetic taking place is:

container_of(ptr, struct S, b)

Expand container_of

(struct S*)((char*)(ptr) - offsetof(struct S, b))

Place expression for ptr

(struct S*)((char*)((int*)((char*)&s + offsetof(struct S, b))) - offsetof(struct S, b))

Drop casts (char*)(int*)

(struct S*)((char*)&s + offsetof(struct S, b) - offsetof(struct S, b))

Adding offsetof(struct S, b) does not go past the end of struct S, so there is no UB in the arithmetic.
The positive and negative terms cancel.

(struct S*)((char*)&s)

Now drop redundant casts.

&s
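The steps of this derivation can be sanity-checked on a given implementation. The sketch below is only an illustration: passing checks on one compiler do not prove strict conformance. The helper name `derivation_holds` is hypothetical, and the macro is wrapped in an extra pair of parentheses (an addition not in the original) so that it can be followed by `->` safely.

```c
#include <stddef.h>   /* offsetof */

struct S {
    int a;
    int b;
};

/* Same macro as above, with outer parentheses added for safe use in
   larger expressions (an assumption of this sketch). */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical helper: returns 1 if every step of the derivation
   yields the expected pointer on this implementation. */
int derivation_holds(void) {
    struct S s = { .a = 42 };

    /* Start from the struct, so the arithmetic stays inside s. */
    char *base = (char *)&s;
    char *member_bytes = base + offsetof(struct S, b);

    /* The int* is derived from &s, not from &s.b. */
    int *p = (int *)member_bytes;

    /* Step back by the same offset, by hand and through the macro. */
    char *back = (char *)p - offsetof(struct S, b);
    struct S *q = container_of(p, struct S, b);

    return back == base && q == &s && p == &s.b && q->a == 42;
}
```

Calling `derivation_holds()` returns 1 when the pointer built from `&s` compares equal to `&s.b` and the round trip recovers `&s`.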

The question.

Is the above derivation correct?

Is such a usage of container_of strictly compliant?

If so, then the computation of a pointer to the member could be delegated to a new macro named member_of.
The pointer can be constructed in a similar fashion as container_of.
This new macro would be a complement of container_of to be used in strictly compliant programs.

#define member_of(ptr, type, member) \
   (void*)((char*)(ptr) + offsetof(type, member))

or a slightly more convenient and type-safe, but less portable (though fine in C23), version:

#define member_of(ptr, member) \
   (typeof(&(ptr)->member))((char*)(ptr) + offsetof(typeof(*(ptr)), member))
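A minimal sketch of how the two-argument version might be used. It assumes either a C23 compiler or, for older modes, the GCC/Clang `__typeof__` extension; the fallback `#define` is an assumption of this sketch and not part of the original macro. Outer parentheses are also added around both macros.

```c
#include <stddef.h>   /* offsetof */

/* Assumption: before C23, fall back to the GCC/Clang __typeof__ extension. */
#if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 202311L
#define typeof __typeof__
#endif

/* Builds a pointer to `member` starting from a pointer to the struct. */
#define member_of(ptr, member) \
    ((typeof(&(ptr)->member))((char *)(ptr) + offsetof(typeof(*(ptr)), member)))

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct S {
    int a;
    int b;
};

int foo(void) {
    struct S s = { .a = 42 };
    int *p = member_of(&s, b);                   /* result has type int* */
    struct S *q = container_of(p, struct S, b);  /* back to &s */
    return q->a;
}
```

Because the result type is derived from the member itself, assigning it to a pointer of a different type (say `double *`) should draw a diagnostic instead of converting silently as `void*` would.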

The program would be:

int foo(void) {
    struct S s = { .a = 42 };
    int *p = member_of(&s, struct S, b);
    struct S *q = container_of(p, struct S, b);
    return q->a;
}


Comments (2)

白云不回头 2025-02-08 18:00:30


&s.b is a pointer. Its value may carry implementation defined attributes like a size of a value it is pointing to

There are two cases of pointer metadata

Type #1 - Pointers point to allocated buffers where a hidden preamble holds metadata for the allocated block.

From this slideshow (slide #9 onwards): [image]

This definitely doesn't affect pointer arithmetic and was not the case OP was referring to.

Type #2 - Provenance or other metadata embedded into the pointer

Here's the draft for "A Provenance-aware Memory Object Model for C". It describes the idea behind implementing pointer resolution provenance in C.

There's a quote discussing member offsets:

Pointer member offset Given a non-null pointer p at C type τ , which points to the start of a struct or union type object (ISO C suggests this has to exist, writing “The value is that of the named member of the object to which the first expression points”) with a member m, if p is (π, a), the result of offsetting the pointer to member m has the same provenance π and the suitably offset a.

Combined with two later statements about pointer arithmetic:

Pointer addition and subtraction Pointer arithmetic (addition or subtraction of integers) preserves provenance. The resulting pointer value is indeterminate if the result not within (or one-past) the storage instance.

Pointer difference Pointer difference is only defined for pointers with the same provenance and within the same array...

And given that there are no proposals to change the section of the ISO standard that discusses pointer arithmetic (6.5.6), the only possible conclusion is that this is OK.


A different question would be whether or not the (char*)(ptr) operation violates strict aliasing rules.

Strict aliasing definition (just in case), from a different stack overflow post:

"Strict aliasing is an assumption, made by the C (or C++) compiler, that dereferencing pointers to objects of different types will never refer to the same memory location (i.e. alias each other.)"

But because the operation is within the same struct and we only use it for compile-time calculations, this is ok.

那伤。 2025-02-08 18:00:30


If the formal term you are looking for is strictly conforming, then that means no forms of poorly-defined behavior may be present. In case your examples depend on alignment/padding considerations, then they have implementation-defined behavior and are not strictly conforming for that reason.

Otherwise, C allows all manner of object pointer conversions under C17 6.3.2.3/7:

A pointer to an object type may be converted to a pointer to a different object type. If the
resulting pointer is not correctly aligned for the referenced type, the behavior is
undefined. Otherwise, when converted back again, the result shall compare equal to the
original pointer. When a pointer to an object is converted to a pointer to a character type,
the result points to the lowest addressed byte of the object. Successive increments of the
result, up to the size of the object, yield pointers to the remaining bytes of the object.

The "converted back again" rule guarantees that no "meta data" is lost.
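A small sketch of what the quoted paragraph permits, assuming a simple `struct S` with two int members as in the question. The function names `round_trip_equal` and `copy_bytes` are hypothetical and chosen for illustration only.

```c
#include <stddef.h>   /* size_t */

struct S {
    int a;
    int b;
};

/* struct S* -> char* -> struct S*: the result compares equal to the original. */
int round_trip_equal(struct S *s) {
    char *bytes = (char *)s;            /* points to the lowest addressed byte */
    struct S *back = (struct S *)bytes; /* converted back again */
    return back == s;
}

/* Walks the object byte by byte, starting at the beginning of the struct
   and never going past sizeof *src. */
void copy_bytes(struct S *dst, const struct S *src) {
    const unsigned char *from = (const unsigned char *)src;
    unsigned char *to = (unsigned char *)dst;
    for (size_t i = 0; i < sizeof *src; ++i)
        to[i] = from[i];
}
```

Both functions start from a pointer to the whole struct, which is the case the quoted rule covers directly.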

This rule also means that we are allowed to inspect a struct by using a character pointer, but it is indeed questionable to start that inspection anywhere else than at the beginning. An int* pointer into the middle of the struct has to be regarded as just an int* and not as a pointer to an "aggregate" type, so we may not access it out-of-bounds (iterating below the address pointed-at).

Rather, to make sense of the above rule we must be free to regard the struct as a
char [sizeof(the_struct)], or we'd get in trouble with the rules for pointer arithmetic (stated in C17 6.5.6). But to do that we need to start out with a pointer to the struct.

As for the "strict aliasing rule" (6.5/7), it only applies when doing an lvalue access, so it is mostly not relevant here. Also it has special exceptions for accessing something as a character type.

So your assumptions all seem correct as per the above quoted rule 6.3.2.3/7, and they don't violate pointer arithmetic nor strict aliasing.


Regarding type safety, perhaps you could implement the macro in standard C like this:

#define member_of(ptr, type, member) \
  _Generic((ptr), type*: (void*)((char*)(ptr) + offsetof(type, member)) )
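A minimal usage sketch for a `_Generic`-checked macro of this kind. In this sketch the result is cast to `void*` so that it converts implicitly to the member's pointer type; that choice, the second struct `T`, and the helper `read_b` are assumptions made for illustration.

```c
#include <stddef.h>   /* offsetof */

/* Accepts only a pointer of type `type *`; any other argument type has no
   matching association and is rejected at compile time. */
#define member_of(ptr, type, member) \
    _Generic((ptr), type *: (void *)((char *)(ptr) + offsetof(type, member)))

struct S {
    int a;
    int b;
};

struct T {
    int x;
};

/* Hypothetical helper: reads s->b through a pointer built by member_of. */
int read_b(struct S *s) {
    int *p = member_of(s, struct S, b);
    return *p;
}

/* With `struct T t;`, the call member_of(&t, struct S, b) does not compile,
   because struct T * matches no association in the _Generic selection. */
```

Note that the check is exact: a `const struct S *` argument would also be rejected unless a second association is added for it.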