Is it always safe to convert an integer value to void* and back again in POSIX?

Posted 2024-12-10 22:17:31

This question is almost a duplicate of some others I've found, but it specifically concerns POSIX, and a very common example in pthreads that I've encountered several times. I'm mostly concerned with the current state of affairs (i.e., C99 and POSIX.1-2008 or later), but any historical background is of course welcome as well.

The question basically boils down to whether b will always take the same value as a in the following code:

long int a = /* some valid value */;
void *ptr = (void *)a;
long int b = (long int)ptr;

I am aware that this usually works, but the question is whether it is a proper thing to do (i.e., do the C99 and/or POSIX standards guarantee that it will work).

When it comes to C99 it seems they do not; we have 6.3.2.3:

5 An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation. 56)

6 Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.

Even using intptr_t the standard seems to only guarantee that any valid void* can be converted to intptr_t and back again, but it does not guarantee that any intptr_t can be converted to void* and back again.
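The direction that is guaranteed can be sketched as follows. This is only a minimal illustration, assuming the implementation provides the optional intptr_t type; the function name roundtrip_preserves_pointer is made up for this example and does not come from any standard.

```c
#include <stdint.h>

/* Round-trips a valid pointer through intptr_t and reports whether the
   result compares equal to the original. C99 7.18.1.4 promises this for
   any valid pointer to void, provided intptr_t exists at all. */
static int roundtrip_preserves_pointer(void *p)
{
    intptr_t as_int = (intptr_t)p;   /* pointer -> integer */
    void *back = (void *)as_int;     /* integer -> pointer */
    return back == p;
}
```

Nothing comparable is promised for the reverse order, where the starting point is an arbitrary integer value rather than a valid pointer.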

However, it is still possible that the POSIX standard allows this.

I have no great desire to use a void* as storage space for any variable (I find it pretty ugly even if POSIX should allow it), but I feel I have to ask because of the common example use of the pthread_create function, where the argument to start_routine is an integer that is passed in as void* and converted to int or long int in the start_routine function. For example, this manpage has such an example (see link for full code):

//Last argument casts int to void *
pthread_create(&tid[i], NULL, sleeping, (void *)SLEEP_TIME);
/* ... */
void * sleeping(void *arg){
    //Casting void * back to int
    int sleep_time = (int)arg;
    /* ... */
}

I've also seen a similar example in a textbook (An Introduction to Parallel Programming by Peter S. Pacheco). Considering that it seems to be a common example used by people who should know this stuff much better than me, I'm wondering if I'm wrong and this is actually a safe and portable thing to be doing.

笔芯 2024-12-17 22:17:31

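As you say, C99 doesn't guarantee that any integer type may be converted to void* and back again without loss of information. It does make a similar guarantee for intptr_t and uintptr_t, defined in <stdint.h>, but those types are optional. (The guarantee is that a void* may be converted to {u,}intptr_t and back without loss of information; there is no such guarantee for arbitrary integer values.)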

POSIX doesn't seem to make any such guarantee either.

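The POSIX description of <limits.h> requires int and unsigned int to be at least 32 bits. This exceeds the C99 requirement that they be at least 16 bits. (Actually, the requirements are stated in terms of ranges, not sizes, but the effect is that int and unsigned int must be at least 32 bits under POSIX or 16 bits under C99, since C99 requires a binary representation.)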

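The POSIX description of <stdint.h> says that intptr_t and uintptr_t must be at least 16 bits, the same requirement imposed by the C standard. Since void* can be converted to intptr_t and back again without loss of information, this implies that void* may be as small as 16 bits. Combine that with the POSIX requirement that int is at least 32 bits (and the POSIX and C requirement that long is at least 32 bits), and it's possible that a void* just isn't big enough to hold an int or long value without loss of information.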
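The width argument can be turned into a small range check. The sketch below is illustrative only: the function name long_range_fits_intptr is invented here, and it assumes the optional intptr_t type exists. Passing the check is necessary for a long to survive the trip through a pointer, but it is not sufficient, because the standards still leave the integer-to-pointer conversion implementation-defined.

```c
#include <limits.h>
#include <stdint.h>

/* Returns 1 if every long value lies within the range of intptr_t,
   0 otherwise. This compares ranges only; it says nothing about how the
   implementation actually maps integers to pointers. */
static int long_range_fits_intptr(void)
{
    return LONG_MIN >= INTPTR_MIN && LONG_MAX <= INTPTR_MAX;
}
```

On a hypothetical implementation with 16-bit pointers and a 32-bit long, this function would return 0.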

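The POSIX description of pthread_create() doesn't contradict this. It merely says that arg (the void* fourth argument to pthread_create()) is passed to start_routine(). Presumably the intent is that arg points to some data that start_routine() can use. POSIX doesn't show any examples of how arg is used.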
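Passing a pointer to real data, as the description seems to intend, avoids the integer cast entirely. The following is a minimal sketch rather than anything taken from POSIX: the struct sleep_args type, its fields, and run_sleeping_thread are hypothetical names, and the thread function reuses the name sleeping from the question only for continuity.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical argument record; names are illustrative only. */
struct sleep_args {
    int sleep_time;   /* input for the thread */
    int seen;         /* written by the thread so the caller can inspect it */
};

static void *sleeping(void *arg)
{
    struct sleep_args *args = arg;   /* void* -> object pointer: well defined */
    args->seen = args->sleep_time;   /* a real thread would sleep here */
    return NULL;
}

/* Starts one thread, waits for it, and returns the value the thread
   observed, or -1 if creating or joining the thread failed. */
static int run_sleeping_thread(int sleep_time)
{
    struct sleep_args args = { sleep_time, 0 };
    pthread_t tid;

    if (pthread_create(&tid, NULL, sleeping, &args) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return args.seen;
}
```

The args object lives on the caller's stack, which is safe here only because the caller joins the thread before returning. A detached or longer-lived thread would need storage that outlives the caller, for example memory obtained from malloc.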

The POSIX standard can be browsed online; you have to create a free account to access it.

天暗了我发光 2024-12-17 22:17:31

The focus in answers so far seems to be on the width of a pointer, and indeed as @Nico points out (and @Quantumboredom also points out in a comment), there is a possibility that intptr_t may be wider than a pointer. @Kevin's answer hints at the other important issue, but doesn't completely describe it.

Also, though I'm not sure of the exact paragraph in the standard, Harbison & Steele point out that intptr_t and uintptr_t are optional types too and may not even exist in a valid C99 implementation. OpenGroup says that XSI-conformant systems must support both types, which means plain POSIX does not require them (at least as of the 2003 edition).

The part that's really been missed here though is that pointers need not always have a simple numerical representation that matches the internal representation of an integer. This has always been so (since K&R 1978), and I'm pretty sure POSIX is careful not to overrule this possibility either.

So, C99 does require that it be possible to convert a pointer to an intptr_t, if and only if that type exists, and then back to a pointer again such that the new pointer still points at the same object in memory as the old pointer. Indeed, if pointers have a non-integer representation, this implies that an algorithm exists which can convert a specific set of integer values into valid pointers. However, this also means that not all integers between INTPTR_MIN and INTPTR_MAX are necessarily valid pointer values, even if the width of intptr_t (and/or uintptr_t) is exactly the same as the width of a pointer.

So, the standards cannot guarantee that any intptr_t or uintptr_t can be converted to a pointer and back to the same integer value, or even which set of integer values can survive such conversion, because they cannot possibly define all of the possible rules and algorithms for converting integer values into pointer values. Doing so even for all known architectures could still prevent the applicability of the standard to novel types of architectures yet to be invented.
