reinterpret_cast vs. C-style cast

Posted on 2024-12-11 02:49:05

I hear that reinterpret_cast is implementation-defined, but I don't know what this really means. Can you provide an example of how it can go wrong? And if it does go wrong, is it better to use a C-style cast?

Comments (6)

尴尬癌患者 2024-12-18 02:49:05

The C-style cast isn't better.

It simply tries the various C++-style casts in order, until it finds one that works. That means that when it acts like a reinterpret_cast, it has the exact same problems as a reinterpret_cast. But in addition, it has these problems:

  • It can do many different things, and it's not always clear from reading the code which type of cast will be invoked (it might behave like a reinterpret_cast, a const_cast or a static_cast, and those do very different things)
  • Consequently, changing the surrounding code might change the behaviour of the cast
  • It's hard to find when reading or searching the code - reinterpret_cast is easy to find, which is good, because casts are ugly and should be paid attention to when used. Conversely, a C-style cast (as in (int)42.0) is much harder to find reliably by searching
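
As a minimal sketch of the first point (the helper names are hypothetical, not from the answer), the same `(T)expr` syntax ends up behaving like three different named casts depending only on the types involved:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers: each one uses the same (T)expr syntax, yet the
// compiler selects a different named cast in each case.

// Behaves like static_cast<int>(d): a value conversion that truncates toward zero.
int as_static(double d) { return (int)d; }

// Behaves like const_cast<char*>(s): only the const qualifier is dropped.
// Writing through the result would be undefined if the pointee is really const.
char* as_const(const char* s) { return (char*)s; }

// Behaves like reinterpret_cast<std::uintptr_t>(p): the address becomes an integer.
std::uintptr_t as_reinterpret(int* p) { return (std::uintptr_t)p; }

// Small self-check that can be called from main().
void demo() {
    int x = 0;
    assert(as_static(3.9) == 3);
    assert(as_reinterpret(&x) == reinterpret_cast<std::uintptr_t>(&x));
}
```

Nothing at the call site says which of the three is happening, which is the readability problem described above.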

To answer the other part of your question: yes, reinterpret_cast is implementation-defined. This means that when you use it to convert from, say, an int* to a float*, you have no guarantee that the resulting pointer will point to the same address. That part is implementation-defined. But if you take the resulting float* and reinterpret_cast it back into an int*, you will get the original pointer. That part is guaranteed (provided the alignment requirement of float is no stricter than that of int).

But again, remember that this is true whether you use reinterpret_cast or a C-style cast:

int i;
int* p0 = &i;

float* p1 = (float*)p0; // implementation-defined result
float* p2 = reinterpret_cast<float*>(p0); // implementation-defined result

int* p3 = (int*)p1; // guaranteed that p3 == p0
int* p4 = (int*)p2; // guaranteed that p4 == p0
int* p5 = reinterpret_cast<int*>(p1); // guaranteed that p5 == p0
int* p6 = reinterpret_cast<int*>(p2); // guaranteed that p6 == p0

梦毁影碎の 2024-12-18 02:49:05

It is implementation-defined in the sense that the standard prescribes (almost) nothing about what values of different types look like at the bit level, how the address space is structured, and so on. So the result is really very platform-specific for conversions like:

double d;
int &i = reinterpret_cast<int&>(d);

However, as the standard says:

It is intended to be unsurprising to those who know the addressing structure
of the underlying machine.

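So if you know what you are doing and how it all looks at a low level, the result of the conversion itself should not surprise you. Keep in mind, though, that actually reading or writing the double through a reference like i above still runs into the strict aliasing rules, so knowing the machine is not by itself enough.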
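
One way to see the platform dependence without tripping over the aliasing rules is to look at an object through `unsigned char*`, which the language permits for any object. The following is a small sketch, with hypothetical helper names and the assumption that the platform is either little-endian or big-endian:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns byte number `index` of the object representation of `value`.
// Reading any object through unsigned char* is permitted by the aliasing rules.
unsigned char byte_at(const std::uint32_t& value, std::size_t index) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
    assert(index < sizeof value);
    return bytes[index];
}

// Hypothetical helper: true when the least significant byte is stored first.
bool is_little_endian() {
    const std::uint32_t probe = 0x01020304u;
    return byte_at(probe, 0) == 0x04;
}
```

The cast is well-formed everywhere, but which byte comes back first is decided by the machine, which is the kind of platform-specific result the answer is describing.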

The C-style cast is somewhat similar in the sense that it can perform a reinterpret_cast, but it also "tries" static_cast first, it can cast away cv-qualification (which static_cast and reinterpret_cast can't), and it can perform conversions that disregard access control (see 5.4/4 in the C++11 standard). E.g.:

#include <iostream>

using namespace std;

class A { int x; };
class B { int y; };

class C : A, B { int z; };

int main()
{
  C c;

  // just type-pun the pointer to c; the pointer value stays the same,
  // only its type is different.
  B *b1 = reinterpret_cast<B *>(&c);

  // perform the conversion with the semantics of static_cast<B*>(&c), disregarding
  // that B is an inaccessible base of C; the resulting pointer will point
  // to the B sub-object in c.
  B *b2 = (B*)(&c);

  cout << "reinterpret_cast:\t" << b1 << "\n";
  cout << "C-style cast:\t\t" << b2 << "\n";
  cout << "no cast:\t\t" << &c << "\n";
}

And here is the output from ideone:

reinterpret_cast:  0xbfd84e78
C-style cast:      0xbfd84e7c
no cast:           0xbfd84e78

Note that the value produced by reinterpret_cast is exactly the same as the address of 'c', while the C-style cast resulted in a correctly offset pointer.

秋心╮凉 2024-12-18 02:49:05

There are valid reasons to use reinterpret_cast, and for these reasons the standard actually defines what happens.

The first is to use opaque pointer types, either for a library API or just to store a variety of pointers in a single array (obviously along with their type). You are allowed to convert a pointer to a suitably sized integer and then back to a pointer and it will be the exact same pointer. For example:

T b;
intptr_t a = reinterpret_cast<intptr_t>( &b );
T * c = reinterpret_cast<T*>(a);

In this code, c is guaranteed to point to the object b, as you'd expect. Conversion back to a different pointer type is of course undefined (sort of).
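
To illustrate the "variety of pointers in a single array, along with their type" idea, here is a rough sketch. `Kind`, `Handle`, and the helper functions are invented for this example, and it assumes the platform provides `std::intptr_t` (it is an optional type):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tag that records which pointer type a handle was made from.
enum class Kind { Int, Double };

// An opaque handle: the address stored as an integer, plus its original type.
struct Handle {
    Kind kind;
    std::intptr_t bits;
};

Handle make_handle(int* p)    { return {Kind::Int,    reinterpret_cast<std::intptr_t>(p)}; }
Handle make_handle(double* p) { return {Kind::Double, reinterpret_cast<std::intptr_t>(p)}; }

// Convert back only to the pointer type the handle was created from.
int* as_int(const Handle& h) {
    assert(h.kind == Kind::Int);
    return reinterpret_cast<int*>(h.bits);
}

double* as_double(const Handle& h) {
    assert(h.kind == Kind::Double);
    return reinterpret_cast<double*>(h.bits);
}
```

Storing the tag next to the integer is what keeps the round trip honest: each handle is only ever converted back to the type it came from, which is the case the standard guarantees.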

Similar conversions are allowed for function pointers and member function pointers, but in the latter case you can cast to/from another member function pointer simply to have a variable that is big enough.

The second case is for using standard-layout types. This is something that was de facto supported prior to C++11 and has now been specified in the standard. In this case the standard treats reinterpret_cast as a static_cast to void* first and then a static_cast to the destination type. This is used a lot when doing binary protocols, where data structures often share the same header information, and it allows you to convert between types which have the same layout but differ in C++ class structure.
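
A minimal sketch of the shared-header pattern follows. The `Header` and `Ping` structs are hypothetical, and the example only covers the conservative case of reaching the first member of a standard-layout struct:

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical wire-format structs: every message starts with the same header.
struct Header {
    std::uint8_t type;
    std::uint8_t length;
};

struct Ping {
    Header header;           // must be the first member
    std::uint32_t sequence;
};

static_assert(std::is_standard_layout<Ping>::value,
              "the cast below relies on Ping being standard-layout");

// A pointer to a standard-layout object can be converted to a pointer
// to its first member.
const Header* header_of(const Ping* message) {
    return reinterpret_cast<const Header*>(message);
}
```

The `static_assert` is there so that adding a virtual function or a base class with data to `Ping` turns the cast into a compile error instead of a silent layout bug.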

In both of these cases you should use the explicit reinterpret_cast operator rather than the C-style cast. Though the C-style cast would normally do the same thing, it has the danger of being subjected to overloaded conversion operators.
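
The conversion-operator hazard can be sketched as follows. `Box` is a made-up class, and the scenario assumes someone wrote `(int*)box` where `(int*)&box` was intended:

```cpp
#include <cassert>

// Hypothetical class with a user-defined conversion to int*.
struct Box {
    int tag;
    int payload;
    operator int*() { return &payload; }
};

// The C-style cast is resolved as a static_cast, which calls Box::operator int*().
int* via_c_style(Box& box) { return (int*)box; }

// reinterpret_cast never calls conversion functions; here it reinterprets the
// address of box, which for this standard-layout struct is the address of tag.
int* via_reinterpret(Box& box) { return reinterpret_cast<int*>(&box); }
```

Writing `reinterpret_cast<int*>(box)` on the object itself would not compile at all, so the named cast forces the choice to be spelled out, while the C-style version quietly picks the conversion operator.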

勿忘初心 2024-12-18 02:49:05

C++ has types, and the only way they normally convert between each other is by well-defined conversion operators that you write. In general, that's all you need, and all you should use, to write your programs.

Sometimes, however, you want to reinterpret the bits that represent a type into something else. This is usually used for very low-level operations and is not something you should typically use. For those cases, you can use reinterpret_cast.

It is implementation defined because the C++ standard does not really say much at all about how things should actually be laid out in memory. That is controlled by your specific implementation of C++. Because of this, the behaviour of reinterpret_cast depends upon how your compiler lays structures out in memory and how it implements reinterpret_cast.

C-style casts are quite similar to reinterpret_casts, but they have much less syntax and are not recommended. The thinking goes that casting is inherently an ugly operation and it requires ugly syntax to inform the programmer that something dubious is happening.

An easy example of how it could go wrong:

std::string a;
double* b;
b = reinterpret_cast<double*>(&a);
*b = 3.4;

That program's behaviour is undefined: a compiler could do anything it likes with it. Most probably, you would get a crash when the string's destructor is called, but who knows! It might just corrupt your stack and cause a crash in an unrelated function.

林空鹿饮溪 2024-12-18 02:49:05

Both reinterpret_cast and C-style casts are implementation-defined, and they do almost the same thing. The differences are:
1. reinterpret_cast cannot remove constness. For example:

const unsigned int d = 5;
int *g=reinterpret_cast< int* >( &d );

will issue an error:

error: reinterpret_cast from type 'const unsigned int*' to type 'int*' casts away qualifiers  

2. If you use reinterpret_cast, it is easy to find the places where you did it. That is not possible with C-style casts.
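
To make the first difference concrete, here is a small sketch (function names are hypothetical) of what does compile. None of the functions write through the resulting pointer, since modifying a genuinely const object would be undefined behaviour:

```cpp
#include <cassert>

// reinterpret_cast may change the pointee type as long as const is kept.
const int* keep_const(const unsigned int* p) {
    return reinterpret_cast<const int*>(p);
}

// The C-style cast compiles without complaint: it combines a reinterpret_cast
// with a const_cast, so the const qualifier silently disappears.
int* drop_const_c_style(const unsigned int* p) {
    return (int*)p;
}

// The equivalent spelled with named casts makes both steps visible and searchable.
int* drop_const_named(const unsigned int* p) {
    return reinterpret_cast<int*>(const_cast<unsigned int*>(p));
}
```

Both `drop_const_*` functions return the same pointer; the only difference is that the named version shows up in a search for `const_cast` and `reinterpret_cast`.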

り繁华旳梦境 2024-12-18 02:49:05

C-style casts sometimes keep the bits and change the type, such as (unsigned int)-1 on a two's-complement machine; sometimes convert the same value to a different format, such as (double)42; and sometimes could do either: (void*)0xDEADBEEF reinterprets bits, but (void*)0 is guaranteed to be a null pointer, which does not necessarily have the same object representation as (intptr_t)0. Very rarely, they tell the compiler to do something like shoot_self_in_foot_with((char*)&const_object);.

That's usually all well and good, but when you want to cast a double to a uint64_t, sometimes you want the value and sometimes you want the bits. If you know C, you know which one the C-style cast does, but it's nicer in some ways to have different syntax for both.
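
The value-versus-bits distinction can be sketched like this. The function names are hypothetical, the code assumes 64-bit IEEE 754 doubles, and `std::memcpy` is used here as a well-defined way to get at the bits rather than any of the cast-based approaches:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(double) == sizeof(std::uint64_t), "assumes a 64-bit double");
static_assert(std::numeric_limits<double>::is_iec559, "assumes IEEE 754 doubles");

// "I want the value": a numeric conversion that truncates toward zero.
std::uint64_t value_of(double d) { return static_cast<std::uint64_t>(d); }

// "I want the bits": copy the object representation into an integer.
// std::memcpy is swapped in deliberately because it is well-defined.
std::uint64_t bits_of(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return bits;
}
```

With these assumptions, `value_of(2.0)` is 2 while `bits_of(2.0)` is `0x4000000000000000`, so having two visibly different spellings for the two operations makes the intent clear at the call site.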

Bjarne Stroustrup, in his guidelines, recommended reinterpret_cast in another context: if you want to type-pun in a way that the language does not define by a static_cast, he suggested that you do it with something like reinterpret_cast<double&>(uint64) rather than the other methods. They're all undefined behavior, but that makes it very explicit what you're doing and that you're doing it on purpose. Reading a different member of a union than you last wrote to does not.
