Why does casting a pointer back to its original class produce strange behavior?

Posted 2024-11-14 17:36:52

Assume that in my code I have to store a void* as a data member and cast it back to the original class pointer when needed. To test its reliability, I wrote a test program (Linux Ubuntu, g++ 4.4.1, -O4 -Wall) and I was shocked by the behavior.

#include <iostream>
using namespace std;

struct A
{
  int i;
  static int c;
  A () : i(c++) { cout<<"A() : i("<<i<<")\n"; }
};
int A::c;

int main ()
{
  void *p = new A[3];  // good behavior for A* p = new A[3];
  cout<<"p->i = "<<((A*)p)->i<<endl;
  ((A*&)p)++;
  cout<<"p->i = "<<((A*)p)->i<<endl;
  ((A*&)p)++;
  cout<<"p->i = "<<((A*)p)->i<<endl;
}

This is just a test program; in my actual case, it is mandatory to store any pointer as void* and then cast it back to the actual pointer type (with the help of a template). So let's not worry about that part. The output of the above code is:

p->i = 0
p->i = 0 // ?? why not 1
p->i = 1

However, if you change void* p to A* p, it gives the expected behavior. Why?

Another question: I cannot drop the (A*&) cast, because without it I cannot use operator ++. But it also produces the warning "dereferencing type-punned pointer will break strict-aliasing rules". Is there a decent way to get rid of the warning?

送你一个梦 2024-11-21 17:36:52

Well, as the compiler warns you, you are violating the strict aliasing rule, which formally means that the results are undefined.

You can eliminate the strict aliasing violation by using a function template for the increment:

template<typename T>
void advance_pointer_as(void*& p, int n = 1) {
    T* p_a(static_cast<T*>(p));
    p_a += n;
    p = p_a;
}

With this function template, the following definition of main() yields the expected results on the Ideone compiler (and emits no warnings):

int main()
{
    void* p = new A[3];
    std::cout << "p->i = " << static_cast<A*>(p)->i << std::endl;
    advance_pointer_as<A>(p);
    std::cout << "p->i = " << static_cast<A*>(p)->i << std::endl;
    advance_pointer_as<A>(p);
    std::cout << "p->i = " << static_cast<A*>(p)->i << std::endl;
}

亢潮 2024-11-21 17:36:52

You have already received the correct answer, and it is indeed the violation of the strict aliasing rule that leads to the unpredictable behavior of the code. I'd just note that the title of your question refers to "casting a pointer back to the original class". In reality, your code has nothing to do with casting anything "back". Your code reinterprets the raw memory occupied by a void * pointer as an A * pointer. This is not "casting back". This is reinterpretation. Not even remotely the same thing.

A good way to illustrate the difference is to use an int and float example. A float value declared and initialized as

float f = 2.0;

can be cast (explicitly or implicitly converted) to int type

int i = (int) f;

with the expected result

assert(i == 2);

This is indeed a cast (a conversion).

Alternatively, the same float value can also be reinterpreted as an int value

int i = (int &) f;

However, in this case the value of i will be totally meaningless and generally unpredictable. I hope it is easy to see the difference between a conversion and a memory reinterpretation from these examples.

Reinterpretation is exactly what you are doing in your code. The (A *&) p expression is nothing other than a reinterpretation of the raw memory occupied by the pointer void *p as a pointer of type A *. The language does not guarantee that these two pointer types have the same representation, or even the same size. So expecting predictable behavior from your code is like expecting the above (int &) f expression to evaluate to 2.

The proper way to really "cast back" your void * pointer would be to do (A *) p, not (A *&) p. The result of (A *) p is indeed the original pointer value, which can be safely manipulated by pointer arithmetic. The only proper way to obtain the original value as an lvalue is to use an additional variable

A *pa = (A *) p;
...
pa++;
...

And there's no legal way to create an lvalue "in place", as you attempted with your (A *&) p cast. The behavior of your code is an illustration of that.

Oo萌小芽oO 2024-11-21 17:36:52

As others have commented, your code looks like it should work. Only once (in 17+ years of coding in C++) have I run across something where I was looking straight at the code and the behavior, as in your case, just didn't make sense. I ended up running the code through the debugger and opening a disassembly window. I found what could only be explained as a bug in the VS2003 compiler, because it was missing exactly one instruction. Simply rearranging local variables at the top of the function (30 lines or so from the error) made the compiler put the correct instruction back in. So try a debugger with disassembly and follow the memory and registers to see what it's actually doing.

As far as advancing the pointer, you should be able to advance it by doing:

p = (char*)p + sizeof( A );

VS2003 through VS2010 never complain about this; I'm not sure about g++.