What does a compiled C++ class look like?
With some background in assembly instructions and C programs, I can visualize what a compiled function looks like, but, funnily enough, I have never thought carefully about what a compiled C++ class looks like.
bash$ cat class.cpp
#include<iostream>
class Base
{
int i;
float f;
};
bash$ g++ -c class.cpp
I ran:
bash$ objdump -d class.o
bash$ readelf -a class.o
but the output is hard for me to understand.
Could somebody please explain it to me or suggest some good starting points?
Comments (6)
Classes are (more or less) laid out as regular structs. Methods are (more or less) converted into functions whose first parameter is "this". References to member variables are compiled as offsets from "this".
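As a rough illustration of the point above, here is a minimal sketch that puts a method next to a hand-written free function doing the same work through an explicit pointer. The names Counter and Counter_add are made up for this example. A real compiler emits a mangled symbol and does not literally create such a second function.

```cpp
// A class with one data member and one ordinary (non-virtual) method.
class Counter {
public:
    explicit Counter(int start) : value(start) {}

    // 'this' is an implicit parameter here.
    int add(int n) {
        value += n;
        return value;
    }

    int value;
};

// Roughly what the method amounts to after compilation: a plain function
// whose first parameter plays the role of 'this'.
// (Hypothetical name; the real symbol would be a mangled form of Counter::add.)
int Counter_add(Counter* self, int n) {
    self->value += n;   // member access is a load/store at an offset from 'self'
    return self->value;
}
```

Calling a.add(4) on one Counter and Counter_add(&b, 4) on another that started with the same value should leave both objects in the same state.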
As for inheritance, let's quote from the C++ FAQ Lite, which is mirrored here: http://www.parashift.com/c++-faq-lite/virtual-functions.html#faq-20.4 . That section shows how virtual functions are called on real hardware (that is, what the compiler generates in machine code).
Let's work an example. Suppose class Base has 5 virtual functions: virt0() through virt4().
Step #1: the compiler builds a static table containing 5 function-pointers, burying that table into static memory somewhere. Many (not all) compilers define this table while compiling the .cpp that defines Base's first non-inline virtual function. We call that table the v-table; let's pretend its technical name is Base::__vtable. If a function pointer fits into one machine word on the target hardware platform, Base::__vtable will end up consuming 5 hidden words of memory. Not 5 per instance, not 5 per function; just 5. In pseudo-code, it is a static array of 5 function-pointers holding the addresses of Base::virt0 through Base::virt4.
Step #2: the compiler adds a hidden pointer (typically also a machine-word) to each object of class Base. This is called the v-pointer. Think of this hidden pointer as a hidden data member, as if the compiler rewrites your class to contain an extra pointer member named __vptr.
Step #3: the compiler initializes this->__vptr within each constructor. The idea is to cause each object's v-pointer to point at its class's v-table, as if each constructor's init-list contained an extra initializer that sets __vptr to the address of Base::__vtable.
Now let's work out a derived class. Suppose your C++ code defines class Der that inherits from class Base. The compiler repeats steps #1 and #3 (but not #2). In step #1, the compiler creates a hidden v-table, keeping the same function-pointers as in Base::__vtable but replacing those slots that correspond to overrides. For instance, if Der overrides virt0() through virt2() and inherits the others as-is, Der's v-table would hold the addresses of Der::virt0, Der::virt1, Der::virt2, Base::virt3 and Base::virt4 (pretend Der doesn't add any new virtuals).
In step #3, the compiler adds a similar pointer-assignment at the beginning of each of Der's constructors. The idea is to change each Der object's v-pointer so it points at its class's v-table. (This is not a second v-pointer; it's the same v-pointer that was defined in the base class, Base; remember, the compiler does not repeat step #2 in class Der.)
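Steps #1 to #3 can be emulated by hand in ordinary C++, which may make the description easier to follow. The sketch below is only a model under simplifying assumptions: every slot shares one signature, the v-pointer is a visible member called vptr, and the names Base_vtable, Der_vtable, FunctionPtr and call_slot are invented here (the FAQ's __vptr and __vtable spellings are avoided because identifiers with double underscores are reserved). Real compilers use their own layouts and names.

```cpp
#include <cstddef>

struct Base;                          // forward declaration for the slot type
using FunctionPtr = int (*)(Base*);   // simplification: all slots share one signature

// Step #2: the normally hidden v-pointer, written out as a visible member.
struct Base {
    const FunctionPtr* vptr;
    Base();
};

// Stand-ins for the virtual function bodies. Each returns a tag so a caller
// can tell which one ran (Base: 0..4, Der overrides: 10..12).
int Base_virt0(Base*) { return 0; }
int Base_virt1(Base*) { return 1; }
int Base_virt2(Base*) { return 2; }
int Base_virt3(Base*) { return 3; }
int Base_virt4(Base*) { return 4; }
int Der_virt0(Base*) { return 10; }
int Der_virt1(Base*) { return 11; }
int Der_virt2(Base*) { return 12; }

// Step #1: one static table per class, not per object.
const FunctionPtr Base_vtable[5] = {
    Base_virt0, Base_virt1, Base_virt2, Base_virt3, Base_virt4
};

// Der replaces slots 0..2 and keeps Base's entries in slots 3 and 4.
const FunctionPtr Der_vtable[5] = {
    Der_virt0, Der_virt1, Der_virt2, Base_virt3, Base_virt4
};

// Step #3: every constructor points the object at its class's table.
Base::Base() : vptr(Base_vtable) {}

struct Der : Base {
    // Base() has already run and stored Base_vtable; overwrite the same pointer.
    Der() { vptr = Der_vtable; }
};

// A "virtual call": fetch the function pointer from the slot, pass the object.
int call_slot(Base* p, std::size_t slot) {
    return p->vptr[slot](p);
}
```

With a Base object, call_slot(&b, 0) reaches Base_virt0. With a Der object the same call reaches Der_virt0, while slot 3 still reaches Base_virt3. Der adds no second pointer, so in this model sizeof(Der) equals sizeof(Base).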
Finally, let's see how the compiler implements a call to a virtual function. Your code might contain a call such as p->virt3(), where p is a pointer to Base.
The compiler has no idea whether this is going to call Base::virt3() or Der::virt3() or perhaps the virt3() method of another derived class that doesn't even exist yet. It only knows for sure that you are calling virt3(), which happens to be the function in slot #3 of the v-table. It rewrites that call into something like p->__vptr[3](p).
I strongly recommend that every C++ developer read the FAQ. It might take several weeks (it is long and hard to read), but it will teach you a lot about C++ and what can be done with it.
OK. There is nothing special about compiled classes; compiled classes do not even exist as such. What exists are objects, which are flat chunks of memory (possibly with padding between fields), and standalone member functions somewhere in the code that take a pointer to an object as their first parameter.
So an object of class Base should look something like this:
(*base_address) : i
(*base_address + sizeof(int)) : f
Padding between fields is possible, but that is hardware specific: it depends on the processor's memory model.
Also, in a debug build the class description can be found in the debug symbols, but that is compiler specific. You should look for a program that dumps debug symbols for your compiler.
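The layout described above can be checked from within C++ itself. The following is a small sketch, assuming a struct with the same two members as the question's Base. The name BaseLayout and both helper functions are invented here, and the members are public only so that offsetof may be applied.

```cpp
#include <cstddef>   // offsetof, std::size_t
#include <cstring>   // std::memcpy

// Same data members as the question's Base, in the same order.
struct BaseLayout {
    int i;
    float f;
};

// The first member of a standard-layout struct sits at offset 0.
static_assert(offsetof(BaseLayout, i) == 0, "i starts the object");
// f comes after i; anything beyond sizeof(int) would be padding.
static_assert(offsetof(BaseLayout, f) >= sizeof(int), "f follows i");
// The whole object is at least as large as its members combined.
static_assert(sizeof(BaseLayout) >= sizeof(int) + sizeof(float), "no overlap");

// Number of padding bytes the compiler placed between i and f.
std::size_t padding_between_i_and_f() {
    return offsetof(BaseLayout, f) - sizeof(int);
}

// Read f the way compiled code does: base address plus a fixed offset.
float read_f(const BaseLayout* obj) {
    const unsigned char* base = reinterpret_cast<const unsigned char*>(obj);
    float out;
    std::memcpy(&out, base + offsetof(BaseLayout, f), sizeof out);
    return out;
}
```

On many platforms int and float have the same size and alignment, so padding_between_i_and_f() is likely to be 0 there, but that depends on the target.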
"Compiled classes" mean "compiled methods".
A method is an ordinary function with an extra parameter, usually passed in a register (mostly %ecx, I believe; this is at least true for most Windows compilers, which have to produce COM objects using the __thiscall convention).
So C++ classes are not terribly different from a bunch of ordinary functions, except for name mangling and some magic in constructors/destructors for setting up vtables.
The main difference from reading C object files is that the C++ method names are mangled. You may try the -C (--demangle) option of objdump.
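Try generating assembly output instead of an object file (with g++ this is presumably the -S option, e.g. g++ -S class.cpp).
That will give you an assembly file 'class.s' (a text file), which you can read with a text editor.
However, your code doesn't do anything (declaring a class doesn't generate code on its own), so you won't find much in the assembly file.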
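To get something worth reading in the assembly, the class needs functions that are actually defined and used. Below is a possible variant of class.cpp, offered as a sketch: the constructor, sum() and use_base() are additions for illustration and are not part of the original question.

```cpp
// The class from the question, extended with a constructor and a method.
class Base {
public:
    Base(int i_init, float f_init);
    float sum() const;

private:
    int i;
    float f;
};

// Defined outside the class body so they are ordinary (non-inline) functions
// and the compiler emits code for them in this translation unit.
Base::Base(int i_init, float f_init) : i(i_init), f(f_init) {}

float Base::sum() const {
    return static_cast<float>(i) + f;   // reads both members through 'this'
}

// A non-member function that creates an object and calls the method.
float use_base(int i_init, float f_init) {
    Base b(i_init, f_init);
    return b.sum();
}
```

Compiling this and disassembling with objdump -d -C should show separate functions for the constructor, Base::sum() and use_base(), each accessing i and f at fixed offsets from the object's address.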
Like a C struct and a set of functions with an additional parameter that is a pointer to the struct.
The easiest way to follow what the compiler did is perhaps to build without optimisation, then load the code into a debugger and step through it in mixed source/assembler mode.
However, the point of the compiler is that you don't need to know this stuff (unless perhaps you are writing a compiler).