Unexpected sign extension of an int32 or 32-bit pointer when converted to uint64

Posted on 2024-12-08 13:00:26

I compiled this code using Visual Studio 2010 (cl.exe /W4) as a C file:

#include <stdio.h>

int main( int argc, char *argv[] )
{
    unsigned __int64 a = 0x00000000FFFFFFFF;
    void *orig = (void *)0xFFFFFFFF;
    unsigned __int64 b = (unsigned __int64)orig;
    if( a != b )
        printf( " problem\ta: %016I64X\tb: %016I64X\n", a, b );
    return 0;
}

There are no warnings and the result is:

problem a: 00000000FFFFFFFF b: FFFFFFFFFFFFFFFF

I suppose int orig = (int)0xFFFFFFFF would be less controversial as I'm not assigning a pointer to an integer. However the result would be the same.

Can someone explain to me where in the C standard it is covered that orig is sign extended from 0xFFFFFFFF to 0xFFFFFFFFFFFFFFFF?

I had assumed that (unsigned __int64)orig would become 0x00000000FFFFFFFF. It appears that the conversion is first to the signed __int64 type and then it becomes unsigned?

EDIT: This question has been answered: pointers are sign extended, which is why I see this behavior in both gcc and msvc. However, I don't understand why (unsigned __int64)(int)0xF0000000 sign extends to 0xFFFFFFFFF0000000, while (unsigned __int64)0xF0000000 does not and instead gives what I want, 0x00000000F0000000.

EDIT: An answer to the above edit. The reason (unsigned __int64)(int)0xF0000000 is sign extended is, as noted by user R:

Conversion of a signed type (or any type) to an unsigned type
always takes place via reduction modulo one plus the max value of
the destination type.

In (unsigned __int64)0xF0000000, on the other hand, the constant 0xF0000000 starts off with an unsigned integer type because its value does not fit in int. That already-unsigned value is then converted to unsigned __int64, which leaves it unchanged.
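The difference between the two casts can be sketched in portable C. This is only an illustration: it uses the fixed-width types from <stdint.h> instead of __int64, and the helper names widen_signed and widen_unsigned are made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper: widen a signed 32-bit value to 64 bits unsigned.
   A negative value is reduced modulo 2^64, which looks like sign extension. */
uint64_t widen_signed(int32_t v)
{
    return (uint64_t)v;
}

/* Hypothetical helper: widen an unsigned 32-bit value.
   The value is representable in uint64_t, so it is unchanged (zero-extended). */
uint64_t widen_unsigned(uint32_t v)
{
    return (uint64_t)v;
}
```

Here widen_signed(-268435456), which is the bit pattern 0xF0000000 read as a signed 32-bit value, yields 0xFFFFFFFFF0000000, while widen_unsigned(0xF0000000u) yields 0x00000000F0000000.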

So the takeaway for me is this: when comparing against a function that returns a 32-bit or 64-bit pointer as an unsigned __int64, I must first convert the 32-bit pointer in my 32-bit application to an unsigned type before promoting it to unsigned __int64. The resulting code looks like this (but, you know, better):

unsigned __int64 functionidontcontrol( char * );
unsigned __int64 x;
void *y = thisisa32bitaddress;
x = functionidontcontrol(str);
/* uintptr_t is unsigned, so promoting it to unsigned __int64 zero-extends */
if( x != (uintptr_t)y )
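One way to package that cast, as a sketch only: the helper name ptr_to_u64 is hypothetical, and the code assumes <stdint.h> is available and that uintptr_t is no wider than 64 bits.

```c
#include <stdint.h>

/* Hypothetical helper: convert a pointer to a 64-bit unsigned value.
   Going through uintptr_t first means the widening step is an
   unsigned-to-unsigned conversion, which preserves the value. The
   pointer-to-uintptr_t step itself is still implementation-defined. */
uint64_t ptr_to_u64(const void *p)
{
    return (uint64_t)(uintptr_t)p;
}
```

The comparison would then read if( x != ptr_to_u64(y) ), which keeps the intermediate unsigned cast in one place.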

EDIT again:
Here is what I found in the C99 standard:
6.3.1.3 Signed and unsigned integers

  • 1 When a value with integer type is converted to another integer
    type other than _Bool, if the value can be represented by the new
    type, it is unchanged.
  • 2 Otherwise, if the new type is unsigned, the value is converted by
    repeatedly adding or subtracting one more than the maximum value
    that can be represented in the new type until the value is in the
    range of the new type.49)
  • 3 Otherwise, the new type is signed and the value cannot be
    represented in it; either the result is implementation-defined or an
    implementation-defined signal is raised.
  • 49) The rules describe arithmetic on the mathematical value, not the
    value of a given type of expression.
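Paragraph 2 can be tried out on a small type. The sketch below applies the rule literally for an 8-bit unsigned target; the function name to_u8_by_rule is made up for this example, and it assumes uint8_t exists on the platform. The same arithmetic with 2^64 in place of 256 is what turns -1 into 0xFFFFFFFFFFFFFFFF.

```c
#include <stdint.h>

/* Apply C99 6.3.1.3p2 literally for an 8-bit unsigned target:
   repeatedly add or subtract 256 (one more than the maximum value, 255)
   until the value is in range. */
uint8_t to_u8_by_rule(long value)
{
    while (value < 0)
        value += 256;
    while (value > 255)
        value -= 256;
    return (uint8_t)value;   /* already in range, so the cast changes nothing */
}
```

For any input, the result should match what a plain (uint8_t) cast of the same value produces.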


Comments (4)

音盲 2024-12-15 13:00:26


Converting a pointer to or from an integer is implementation-defined.

GCC documents how it does this: it sign-extends if the integer type is larger than the pointer type. This happens regardless of whether the integer type is signed or unsigned, simply because that is how gcc chose to implement it.

Presumably msvc behaves similarly. Edit: the closest thing I can find on MSDN suggests that converting 32-bit pointers to 64-bit also sign-extends.
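To make the two possible widenings concrete, here is a small model in C. It does not convert real pointers; it treats a 32-bit address as a uint32_t, and the names sign_extend32 and zero_extend32 are invented for this sketch.

```c
#include <stdint.h>

/* Model of sign-extending a 32-bit address into 64 bits:
   copy bit 31 into the upper 32 bits. */
uint64_t sign_extend32(uint32_t addr)
{
    uint64_t wide = addr;
    if (addr & 0x80000000u)
        wide |= 0xFFFFFFFF00000000ULL;
    return wide;
}

/* Model of zero-extending: the upper 32 bits stay clear. */
uint64_t zero_extend32(uint32_t addr)
{
    return (uint64_t)addr;
}
```

For the address 0xFFFFFFFF from the question, sign_extend32 gives 0xFFFFFFFFFFFFFFFF (the observed value of b) and zero_extend32 gives 0x00000000FFFFFFFF (the value the question expected).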

烧了回忆取暖 2024-12-15 13:00:26


From the C99 standard (§6.3.2.3/6):

Any pointer type may be converted to an integer type. Except as previously specified, the
result is implementation-defined. If the result cannot be represented in the integer type,
the behavior is undefined. The result need not be in the range of values of any integer
type.

So you'll need to find your compiler's documentation that talks about that.

一场春暖 2024-12-15 13:00:26


Integer constants (e.g., 0x00000000FFFFFFFF) are signed integers by default, and hence may experience sign extension when assigned to a 64-bit variable. Try replacing the initializer of a with:

0x00000000FFFFFFFFULL

屋檐 2024-12-15 13:00:26


Use this to avoid the sign extension:

unsigned __int64 a = 0x00000000FFFFFFFFLL;

Note the L on the end. Without this it is interpreted as a 32-bit signed number (-1) and then cast.
