Can we use va_arg with unions?

Posted 2024-12-04 17:30:35 · 1638 words · 5 views · 0 comments

6.7.2.1 paragraph 14 of my draft of the C99 standard has this to say about unions and pointers (emphasis, as always, added):

The size of a union is sufficient to contain the largest of its members. The value of at most one of the members can be stored in a union object at any time. A pointer to a union object, suitably converted, points to each of its members (or if a member is a bit-field, then to the unit in which it resides), and vice versa.

All well and good; that means it is legal to do something like the following to copy either a signed or unsigned int into a union, assuming we only want to copy it out into data of the same type:

union ints { int i; unsigned u; };

int i = 4;
union ints is = *(union ints *)&i;
int j = is.i; // legal
unsigned k = is.u; // not so much

7.15.1.1 paragraph 2 has this to say:

The va_arg macro expands to an expression that has the specified type and the value of the next argument in the call. The parameter ap shall have been initialized by the va_start or va_copy macro (without an intervening invocation of the va_end macro for the same ap). Each invocation of the va_arg macro modifies ap so that the values of successive arguments are returned in turn. The parameter type shall be a type name specified such that the type of a pointer to an object that has the specified type can be obtained simply by postfixing a * to type. If there is no actual next argument, or if type is not compatible with the type of the actual next argument (as promoted according to the default argument promotions), the behavior is undefined, except for the following cases:

—one type is a signed integer type, the other type is the corresponding unsigned integer
type, and the value is representable in both types;

—one type is pointer to void and the other is a pointer to a character type.

I'm not going to go and cite the part about default argument promotions. My question is: is this defined behavior:

void func(int i, ...)
{
    va_list arg;
    va_start(arg, i);
    union ints is = va_arg(arg, union ints);
    va_end(arg);
}

int main(void)
{
    func(0, 1);
    return 0;
}

If so, it would appear to be a neat trick to overcome the "and the value is representable in both types" requirement of signed/unsigned integer conversion (albeit in a way that's rather difficult to do anything with legally). If not, it would appear to be safe to just use unsigned in this case, but what if there were more elements in the union with more incompatible types? If we can guarantee that we won't access the union by element (i.e. we just copy it into another union or storage space that we're treating like a union) and that all elements of the union are the same size, is this allowed with varargs? Or would it only be allowed with pointers?

In practice I expect this code will almost never fail, but I want to know if it's defined behavior. My current guess is that it appears not to be defined, but that seems incredibly dumb.
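
For reference, here is a minimal sketch of the signed/unsigned exception quoted from 7.15.1.1 above, as opposed to the union case being asked about. The function name read_as_unsigned and its count parameter are hypothetical, made up for illustration. The caller passes an int and the callee reads it as unsigned, which is covered by the exception only because the value is representable in both types.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper (not from the question): reads a single variadic
 * argument as unsigned. Passing an int is covered by the exception in
 * 7.15.1.1p2 only while its value is non-negative, i.e. representable
 * in both int and unsigned. */
unsigned read_as_unsigned(int count, ...)
{
    va_list ap;
    unsigned value;

    assert(count == 1);          /* this sketch expects exactly one extra argument */

    va_start(ap, count);
    value = va_arg(ap, unsigned);
    va_end(ap);

    return value;
}
```

Calling read_as_unsigned(1, 4) falls under the exception, while read_as_unsigned(1, -1) would not, since -1 is not representable as unsigned.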

Comments (3)

我不是你的备胎 2024-12-11 17:30:35

You have a couple of things off.

A pointer to a union object, suitably converted, points to each of its members (or if a member is a bit-field, then to the unit in which it resides), and vice versa.

This does not mean that the types are compatible. In fact, they are not compatible. So the following code is wrong:

func(0, 1); // undefined behavior

If you want to pass a union,

func(0, (union ints){ .u = BLAH });

You can check by writing the code,

union ints x;
x = 1;

GCC gives an "error: incompatible types in assignment" message when compiling.

However, most implementations will "probably" do the right thing in both cases. There are some other problems...

union ints {
    int i;
    unsigned u;
};

int i = 4;
union ints is = *(union ints *)&i; // Invalid
int j = is.i; // legal
unsigned k = is.u; // also legal (see note)

Dereferencing the address of an object through an lvalue of a type other than its actual type, as in *(union ints *)&i, is sometimes undefined behavior (I am still looking up the reference, but I'm pretty sure about this). However, in C99 it is permitted to access a union member other than the most recently stored one (or is it C1x?), but the value is implementation-defined and may be a trap representation.

About type punning through unions: As Pascal Cuoq notes, it's actually TC3 that defines the behavior of accessing a union element other than the most recently stored one. TC3 is the third update to C99. The good news is that this part of TC3 is really codifying existing practice — so think of it as a de facto part of C prior to TC3.
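
A small sketch of that kind of union access, under the TC3 reading described above. The function name read_other_member is hypothetical. It stores through the i member and reads back through u. The result is predictable here only because the sketch restricts itself to non-negative values, which have the same representation in int and unsigned (C99 6.2.5).

```c
#include <assert.h>

union ints { int i; unsigned u; };

/* Hypothetical helper: store through one member, read through the other.
 * Restricted to non-negative input, where int and unsigned share the
 * same representation, so the value read back equals the value stored. */
unsigned read_other_member(int value)
{
    union ints is;

    assert(value >= 0);          /* keep to values representable in both types */

    is.i = value;                /* most recently stored member is i */
    return is.u;                 /* read the other member */
}
```

For negative inputs the bytes would simply be reinterpreted as unsigned, and what comes out then depends on the implementation's representation of signed integers.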

请叫√我孤独 2024-12-11 17:30:35

Since the standard says:

The parameter type shall be a type name specified such that the type of a pointer to an object that has the specified type can be obtained simply by postfixing a * to type.

For union ints, that condition is satisfied: union ints * is a perfectly good representation of a pointer to a union ints, so there is nothing in that sentence to prevent it being used to collect a value pushed onto the stack as a union.

If you cheat and try to pass a plain int or unsigned int in place of a union, then you would be invoking undefined behaviour. Thus, you could use:

union ints u1 = ...;

func(0, (union ints) { .i = 0 });
func(1, (union ints) { .u = UINT_MAX });
func(2, u1);

You could not use:

func(1, 0);

The argument is not of a union type.
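
Putting that together, a minimal sketch of the well-defined version might look like the following. The function take_union and its count parameter are hypothetical, not the func from the question. The point is only that the caller passes a real union ints, so the type named in va_arg is compatible with the type of the actual argument.

```c
#include <assert.h>
#include <limits.h>
#include <stdarg.h>

union ints { int i; unsigned u; };

/* Hypothetical variant of func: fetches one union ints argument and
 * returns its u member. The caller must really pass a union ints;
 * passing a plain int or unsigned here would be undefined behaviour. */
unsigned take_union(int count, ...)
{
    va_list ap;
    union ints is;

    assert(count == 1);          /* this sketch expects exactly one extra argument */

    va_start(ap, count);
    is = va_arg(ap, union ints); /* type matches the actual argument */
    va_end(ap);

    return is.u;
}
```

A call such as take_union(1, (union ints){ .u = UINT_MAX }) is fine; take_union(1, 0) is not, for the reason given above.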

感情废物 2024-12-11 17:30:35

I don't see why you think that code should never fail in practice. It would fail on any implementation where integer types are passed by register but aggregate types (even when small) are passed on the stack, and I see nothing in the standard that forbids such implementations. A union containing an int is not a type compatible with int, even if their sizes are the same.

Back to your first code fragment, it has a problem too:

union ints is = *(union ints *)&i;

This is an aliasing violation and invokes undefined behavior. You could avoid it by using memcpy, and I suppose then it would be legal.
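
A rough sketch of the memcpy route, assuming (as I believe, though I have not checked it against every clause) that copying the bytes of the int into the union's storage and then reading the i member is fine. The helper name int_to_union is hypothetical.

```c
#include <assert.h>
#include <string.h>

union ints { int i; unsigned u; };

/* Hypothetical helper: copy the object representation of an int into a
 * union ints without ever dereferencing a (union ints *) that points at
 * an int object. */
union ints int_to_union(int value)
{
    union ints is;

    assert(sizeof value <= sizeof is);   /* the union is at least as large as its members */

    memcpy(&is, &value, sizeof value);   /* byte-wise copy instead of a pointer cast */
    return is;
}
```

After the copy, int_to_union(4).i reads back the value that was copied in, and no object is accessed through an lvalue of the wrong type along the way.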

I'm also a bit confused about your comment here:

unsigned k = is.u; // not so much

Since the value 4 is representable in both the signed and unsigned types, this should be legal, unless it's specifically forbidden as a special case.

If this doesn't answer your question, perhaps you could elaborate more on what (albeit theoretical) problem you're trying to solve.
