Happily linking incompatible types leads to mayhem

Posted 2024-12-04 17:10:27

I've been trying to figure out some boundaries of g++, especially linking (C++) object files. I found the following curiosity which I tried to compress as much as possible before asking.

Code

File common.h

#ifndef _COMMON_H
#define _COMMON_H

#include <iostream>

#define TMPL_Y(name,T) \
struct Y { \
  T y; \
  void f() { \
    std::cout << name << "::f " << y << std::endl; \
  } \
  virtual void vf() { \
    std::cout << name << "::vf " << y << std::endl; \
  } \
  Y() { \
    std::cout << name << " ctor" << std::endl; \
  } \
  ~Y() { \
    std::cout << name << " dtor" << std::endl; \
  } \
}

#define TMPL_Z(Z) \
struct Z { \
  Y* y; \
  Z(); \
  void g(); \
}

#define TMPL_Z_impl(name,Z) \
Z::Z() { \
  y = new Y(); \
  y->y = name; \
  std::cout << #Z << "(); sizeof(Y) = " << sizeof(Y) << std::endl; \
} \
void Z::g() { \
  y->f(); \
  y->vf(); \
}

#endif

File a.cpp compiled with g++ -Wall -c a.cpp

#include "common.h"

TMPL_Y('a',char);

TMPL_Z(Za);

TMPL_Z_impl('a',Za);

File b.cpp compiled with g++ -Wall -c b.cpp

#include "common.h"

TMPL_Y('b',unsigned long long);

TMPL_Z(Zb);

TMPL_Z_impl('b',Zb);

File main.cpp compiled and linked with g++ -Wall a.o b.o main.cpp

#include "common.h"

struct Y;
TMPL_Z(Za);
TMPL_Z(Zb);

int main() {
  Za za;
  Zb zb;
  za.g();
  zb.g();
  za.y = zb.y;
  return 0;
}

The result of ./a.out is

a ctor
Za(); sizeof(Y) = 8
a ctor  // <- mayhem
Zb(); sizeof(Y) = 12
a::f a
a::vf a
a::f b  // <- mayhem
a::vf b // <- mayhem

Question

Now, I would have expected g++ to call me some nasty names for trying to link a.o and b.o together. The assignment za.y = zb.y is especially evil. Not only does g++ not complain at all that I want it to link together incompatible types with the same name (Y), it also completely ignores the second definition in b.o (resp. b.cpp).

I mean I'm not doing something sooo far fetched. It is quite reasonable that two compilation units could use the same name for local classes, esp. in a large project.

Is this a bug? Could anybody shed some light on the issue?

难忘№最初的完美 2024-12-11 17:10:27

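Quoting Bjarne Stroustrup's The C++ Programming Language:

9.2 Linkage
Names of functions, classes, templates, variables, namespaces, enumerations, and enumerators must be used consistently across all translation units unless they are explicitly specified to be local.
It is the programmer's task to ensure that every namespace, class, function, etc. is properly declared in every translation unit in which it appears and that all declarations referring to the same entity are consistent. [...]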

不再让梦枯萎 2024-12-11 17:10:27

In your example, you could put the definition of Y in an anonymous namespace like this:

#define TMPL_Y(name,T) \
namespace { \
    struct Y { \
      T y; \
      void f() { \
        std::cout << name << "::f " << y << std::endl; \
      } \
      virtual void vf() { \
        std::cout << name << "::vf " << y << std::endl; \
      } \
      Y() { \
        std::cout << name << " ctor" << std::endl; \
      } \
      ~Y() { \
        std::cout << name << " dtor" << std::endl; \
      } \
    }; \
}

This essentially creates a unique namespace for each compilation unit, so each unit has, in effect, its own Y, and the linker no longer merges the two definitions.

As for the statement

za.y = zb.y;

This will of course still yield unpredictable results, as the two types are incompatible.

物价感观 2024-12-11 17:10:27

In many cases there are errors that the C++ compiler is not required to catch. Many of them are, for example, errors that are impossible to detect by analyzing one translation unit at a time.

For example, without constructing complex cases with templates, if you just declare in a header file

void foo(int x);

and then provide two distinct definitions for the function in different translation units, the C++ compiler is not required to give an error at link time.

Note that this can clearly happen by mistake: there could even be two distinct headers declaring a global function with the same signature, with part of the project using one header and part of the project using the other.

The same can happen if you declare a certain class Foo in two different header files with different declarations and with different implementations.

This abuse of naming is simply a kind of error that the compiler is not required to be able to catch.
