How are classes implemented in a compiler?
I'd like to implement a class type for my own little language, but what I thought wouldn't be too hard has got me stumped. I have the parser in place; it's the code generation side of things I'm having problems with. Can anyone shed any light on the best/correct way to go about this? Specifically, I'd like to do this in LLVM, so while I need to know the generalities, any specific LLVM code I should be working with would be fantastic.
Thanks T.
N.B. My experience with LLVM is basically what comes from the Kaleidoscope tutorials plus a little extra from playing around with it, but I am far from having a full understanding of the LLVM APIs.
Comments (3)
A very, very incomplete overview:
A class is a structure (you know C/C++, don't you?)
Methods are ordinary functions, except that they receive an extra implicit argument: the object itself. This argument is usually called 'this' or 'self' within the function. Depending on the language, class-scope symbols may (C++, Java) or may not (PHP, Python, JavaScript) be accessible within methods without an explicit 'this'/'self' prefix.
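To make the "extra implicit argument" concrete, here is a minimal sketch of what a compiler might emit for a small class, written as plain C++ rather than LLVM IR. The class `Counter` and the `Counter_add`/`Counter_get` naming scheme are made up for illustration; your own name mangling will differ.

```cpp
#include <cassert>

// Hypothetical source-language class:
//   class Counter { int count; void add(int n) { count += n; } int get() { return count; } }

// The class lowers to a plain struct holding the fields.
struct Counter {
    int count;
};

// Each method lowers to a free function whose first parameter is the object.
void Counter_add(Counter* self, int n) {
    assert(self != nullptr);
    self->count += n;  // a bare `count` in the method body resolves to self->count
}

int Counter_get(const Counter* self) {
    assert(self != nullptr);
    return self->count;
}

// A call such as `c.add(5)` lowers to `Counter_add(&c, 5)`.
int demo_counter() {
    Counter c{0};
    Counter_add(&c, 5);
    Counter_add(&c, 7);
    return Counter_get(&c);
}
```

In LLVM terms the struct would become a named struct type and each method a function taking a pointer to it as its first parameter.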
Inheritance is essentially gluing the structures together, and possibly merging the symbol tables as well, since symbols of the base class are normally accessible by default from within the methods of the class you are currently parsing. When you encounter a symbol (field or method) within a method, you need to do an ascending lookup, starting from the current class and going up the hierarchy. Alternatively, you can look it up in a single symbol table that is the result of merging the tables of the class and its ancestors.
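A rough sketch of both ideas on the compiler side, assuming single inheritance only. `ClassInfo`, `flattened_layout` and `lookup_field` are hypothetical names, not part of any library: the layout is the base layout followed by the class's own fields, and a field name is resolved by searching the current class first and then its ancestors.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Compile-time description of a class (hypothetical compiler data structure).
struct ClassInfo {
    std::string name;
    const ClassInfo* base;            // nullptr for a root class
    std::vector<std::string> fields;  // fields declared directly in this class
};

// "Gluing": the object layout is the base layout followed by the class's own fields.
std::vector<std::string> flattened_layout(const ClassInfo& cls) {
    std::vector<std::string> layout;
    if (cls.base != nullptr) {
        layout = flattened_layout(*cls.base);
    }
    layout.insert(layout.end(), cls.fields.begin(), cls.fields.end());
    return layout;
}

// Number of slots occupied by all ancestors of cls.
std::size_t base_slot_count(const ClassInfo& cls) {
    return cls.base != nullptr ? flattened_layout(*cls.base).size() : 0;
}

// Ascending lookup: search the current class first, then each ancestor.
// Returns the slot index in the flattened layout, or nullopt if not found.
std::optional<std::size_t> lookup_field(const ClassInfo& cls, const std::string& field) {
    for (const ClassInfo* c = &cls; c != nullptr; c = c->base) {
        for (std::size_t i = 0; i < c->fields.size(); ++i) {
            if (c->fields[i] == field) {
                return base_slot_count(*c) + i;
            }
        }
    }
    return std::nullopt;
}

// Usage: Circle { radius } derives from Shape { x, y }.
std::size_t demo_radius_slot() {
    ClassInfo shape{"Shape", nullptr, {"x", "y"}};
    ClassInfo circle{"Circle", &shape, {"radius"}};
    std::optional<std::size_t> slot = lookup_field(circle, "radius");
    assert(slot.has_value());
    return *slot;
}
```

The returned slot index is the kind of number you would feed into a struct field access when generating code for `self->field`.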
Virtual methods are called indirectly. In some languages all methods are virtual by default. The implementation depends on the kind of language. In a fully dynamic language you always look up a function name within a class at run-time, so all your methods become virtual automatically. In static languages, compilers usually build so-called virtual method tables. I'm not sure if you need this at all, so I won't go into details here.
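For the fully dynamic case, a minimal sketch of run-time lookup by name might look like the following. The `Class`/`Object` records and the `send` helper are invented for this example; a real runtime would also handle arguments and missing methods more gracefully.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

struct Object;
using Method = int (*)(Object* self);

// Run-time class record for a fully dynamic language (hypothetical layout).
struct Class {
    const Class* base;  // nullptr for a root class
    std::unordered_map<std::string, Method> methods;
};

struct Object {
    const Class* cls;
    int value;
};

// Look the method up by name at call time, walking up the hierarchy.
Method find_method(const Class* cls, const std::string& name) {
    for (; cls != nullptr; cls = cls->base) {
        auto it = cls->methods.find(name);
        if (it != cls->methods.end()) {
            return it->second;
        }
    }
    return nullptr;
}

// `obj.name()` lowers to a lookup followed by an indirect call.
int send(Object* obj, const std::string& name) {
    Method m = find_method(obj->cls, name);
    assert(m != nullptr && "method not found");
    return m(obj);
}

int base_describe(Object* self) { return self->value; }
int base_twice(Object* self) { return self->value * 2; }
int derived_describe(Object* self) { return self->value + 100; }  // overrides describe
```

Because every call goes through `find_method`, an override in a derived class wins automatically, which is why all methods behave as virtual in this model.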
Constructors are special methods that are called either when constructing a new object (usually with 'new') or as part of the constructor call chain from within descendant constructors. Many different implementations are possible here. One is that a constructor takes an implicit 'this' argument, which may be NULL if the object hasn't been allocated yet, and returns the object.
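Here is one possible shape for that convention, sketched in C++ with made-up `Animal`/`Dog` classes. The constructor allocates only when it is handed a null `self`, and the derived constructor initialises the base part in place before its own fields.

```cpp
#include <cassert>

struct Animal {
    int legs;
};

struct Dog {
    Animal base;  // base-class part comes first
    int barks;
};

// Convention from the text: `self` may be null, in which case the
// constructor allocates the object itself. It always returns the object.
Animal* Animal_ctor(Animal* self, int legs) {
    if (self == nullptr) {
        self = new Animal;
    }
    self->legs = legs;
    return self;
}

Dog* Dog_ctor(Dog* self) {
    if (self == nullptr) {
        self = new Dog;
    }
    // Constructor chain: initialise the base part in place first.
    Animal_ctor(&self->base, 4);
    self->barks = 0;
    return self;
}

int demo_ctor() {
    Dog* heap_dog = Dog_ctor(nullptr);  // e.g. `new Dog()` in the source language
    Dog stack_dog;
    Dog* same = Dog_ctor(&stack_dog);   // storage supplied by the caller
    assert(same == &stack_dog);
    int legs = heap_dog->base.legs + stack_dog.base.legs;
    delete heap_dog;
    return legs;
}
```

Passing a non-null `self` is what makes the chain work: the base constructor must never allocate when it is called from a descendant.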
Destructors are ordinary methods that are normally called implicitly when an object goes out of scope. Again, you need to take into account the possibility of an ascending call chain for destructors.
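As an illustration of the ascending chain, the sketch below logs the order in which hand-written destructor functions run. `Base`, `Derived` and the global `g_log` exist only for this example; the point is that the compiler inserts the call at scope exit and that the derived destructor calls its base's destructor last.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which destructors run (for illustration only).
std::vector<std::string> g_log;

struct Base { int id; };
struct Derived { Base base; int extra; };

void Base_dtor(Base* self) {
    assert(self != nullptr);
    g_log.push_back("~Base");
}

void Derived_dtor(Derived* self) {
    assert(self != nullptr);
    g_log.push_back("~Derived");  // release the derived part first...
    Base_dtor(&self->base);       // ...then chain upward to the base class
}

// What the compiler might emit for a block `{ Derived d; ... }`.
void scope_example() {
    Derived d{{1}, 2};
    // ... body of the block ...
    Derived_dtor(&d);  // inserted implicitly at scope exit
}
```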
Interfaces are tricky unless, again, your language is fully dynamic.
You should buy Stanley Lippman's Inside the C++ Object Model. Everything you need is in there.
There are probably several strategies to realize this. Here is one:
A vtable (Virtual Table) is a compile-time-constant struct with function pointers. (All values are known at compile-time.)
(You can call the pointer to a vtable an "interface", if you want.)
An OOP class in a language without any inheritance is a struct that contains a const pointer to its vtable as its first member variable.
This pointer is used to identify the exact type of the object and, with multiple inheritance, the aspect/view of that object (that is, which type it is currently being seen as).
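A small sketch of this layout, using invented names (`Shape`, `ShapeVTable`, `kRectVTable`, and so on): the vtable is a constant struct of function pointers, every object starts with a const pointer to it, and comparing that pointer tells you the exact type.

```cpp
#include <cassert>

struct Shape;

// The vtable: a struct of function pointers, one instance per class.
struct ShapeVTable {
    const char* class_name;
    int (*area)(const Shape* self);
};

// Every object starts with a const pointer to its class's vtable.
struct Shape {
    const ShapeVTable* vptr;
    int w, h;
};

int Rect_area(const Shape* self) { return self->w * self->h; }
int Tri_area(const Shape* self) { return self->w * self->h / 2; }

// All entries are known at compile time.
constexpr ShapeVTable kRectVTable{"Rect", Rect_area};
constexpr ShapeVTable kTriVTable{"Tri", Tri_area};

// A virtual call: load the vtable pointer, load the slot, call indirectly.
int area(const Shape* s) {
    assert(s != nullptr && s->vptr != nullptr);
    return s->vptr->area(s);
}

// The vtable pointer doubles as an exact type tag.
bool is_rect(const Shape* s) { return s->vptr == &kRectVTable; }
```

In LLVM the tables would typically be emitted as constant global structs, and the call would be two loads followed by an indirect call.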
If you want multiple inheritance, you need to be able to (static_)cast a pointer to a derived class to a pointer to one of its parent classes, correcting the byte address on the fly. This could be realized with a virtual function or (better) with a signed offset value stored in the vtable.
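One way to read the "signed offset stored in the vtable" idea is sketched below. It assumes every view of an object starts with its own vtable pointer and that each vtable records how far that view is from the `Printable` sub-object. All names (`Named`, `Printable`, `Document`, `to_printable`) are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct VTable {
    std::ptrdiff_t to_printable;  // signed byte offset from this view to the Printable part
};

// Two parent classes; each view starts with its own vtable pointer.
struct Named     { const VTable* vptr; int name_id; };
struct Printable { const VTable* vptr; int copies; };

// Derived class: both parents glued together, then its own fields.
struct Document {
    Named named;          // first parent, at offset 0
    Printable printable;  // second parent, at a non-zero offset
    int pages;
};

// One vtable per view of Document.
constexpr VTable kNamedViewOfDocument{
    static_cast<std::ptrdiff_t>(offsetof(Document, printable))};
constexpr VTable kPrintableViewOfDocument{0};

void Document_init(Document* d) {
    d->named = Named{&kNamedViewOfDocument, 1};
    d->printable = Printable{&kPrintableViewOfDocument, 2};
    d->pages = 10;
}

// Cast any view of the object to its Printable parent by reading the
// offset from the vtable and correcting the byte address.
Printable* as_printable(void* view) {
    assert(view != nullptr);
    const VTable* vt;
    std::memcpy(&vt, view, sizeof vt);  // the vtable pointer is the first member of every view
    char* adjusted = static_cast<char*>(view) + vt->to_printable;
    return reinterpret_cast<Printable*>(adjusted);
}
```

When the static type is known, the compiler can of course add the constant offset directly; the vtable entry matters when only the view pointer is available.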
A (dynamic_)cast from a pointer to a parent class to a pointer to a derived class either implies a lookup in a possibly large data structure (array, hash table, whatever) or is realized via a virtual function, too.
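The virtual-function variant can be sketched like this, assuming single inheritance with the base part as the first member. The `as_derived` slot and the class names are hypothetical: each class fills the slot with a function that either returns the object as `Derived` or returns null.

```cpp
#include <cassert>

struct Base;
struct Derived;

struct VTable {
    // Returns the object as Derived, or nullptr if it is not one.
    Derived* (*as_derived)(Base* self);
};

struct Base { const VTable* vptr; int id; };
struct Derived { Base base; int extra; };  // base part is the first member

// A plain Base object is not a Derived.
Derived* Base_as_derived(Base*) { return nullptr; }

// For a Derived object the base part sits at offset 0, so the address is unchanged.
Derived* Derived_as_derived(Base* self) { return reinterpret_cast<Derived*>(self); }

constexpr VTable kBaseVTable{Base_as_derived};
constexpr VTable kDerivedVTable{Derived_as_derived};

// The checked downcast is just a virtual call.
Derived* downcast(Base* b) {
    assert(b != nullptr);
    return b->vptr->as_derived(b);
}
```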
Each call to a function from the vtable needs the object pointer to be cast to the type that is appropriate for that function. This can be done either by the caller, reading the signed offset (corresponding to the function) from the vtable, or by the callee, which is then only a proxy for the original function.
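The callee-side variant is often called a thunk. Below is a rough sketch with invented names: the vtable slot holds `Document_print_thunk`, which shifts the incoming `Printable` pointer back to the start of the `Document` and forwards to the real method.

```cpp
#include <cassert>
#include <cstddef>

struct Printable;
struct PrintableVTable { int (*print)(Printable* self); };
struct Printable { const PrintableVTable* vptr; };

struct Header { int id; };
struct Document {
    Header header;        // unrelated leading part
    Printable printable;  // Printable sub-object at a non-zero offset
    int pages;
};

// The real method, written against the complete object.
int Document_print(Document* self) { return self->pages; }

// Thunk stored in the vtable: receives the Printable view, shifts the
// pointer back to the start of the Document, then forwards the call.
int Document_print_thunk(Printable* self) {
    char* raw = reinterpret_cast<char*>(self) - offsetof(Document, printable);
    return Document_print(reinterpret_cast<Document*>(raw));
}

constexpr PrintableVTable kDocumentPrintable{Document_print_thunk};

// The caller only knows about Printable and passes its pointer unchanged.
int print_any(Printable* p) {
    assert(p != nullptr);
    return p->vptr->print(p);
}
```

The caller-side alternative would store the offset next to the function pointer in the vtable and have `print_any` apply it before the call.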
In some languages (especially functional languages) you can define references to (untyped) objects that instantiate a list of interfaces/typeclasses valid on that object. Such a reference contains one pointer to the base object and a list of pointers to the relevant vtables.
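Such a "fat" reference might be sketched as follows. The object itself carries no vtable pointer; the reference carries the object pointer plus one vtable pointer per interface. `ShowVTable`, `EqVTable`, `Point` and `ShowEqRef` are illustrative names only.

```cpp
#include <cassert>

// Two independent interfaces, each just a table of function pointers.
struct ShowVTable { int (*show)(const void* obj); };
struct EqVTable { bool (*equals)(const void* a, const void* b); };

// A plain object with no embedded vtable pointer.
struct Point { int x, y; };

int Point_show(const void* p) {
    const Point* pt = static_cast<const Point*>(p);
    return pt->x * 10 + pt->y;
}

bool Point_equals(const void* a, const void* b) {
    const Point* pa = static_cast<const Point*>(a);
    const Point* pb = static_cast<const Point*>(b);
    return pa->x == pb->x && pa->y == pb->y;
}

constexpr ShowVTable kPointShow{Point_show};
constexpr EqVTable kPointEq{Point_equals};

// The fat reference: one object pointer plus the vtables valid for it.
struct ShowEqRef {
    const void* obj;
    const ShowVTable* show;
    const EqVTable* eq;
};

ShowEqRef make_ref(const Point* p) { return ShowEqRef{p, &kPointShow, &kPointEq}; }

int show(ShowEqRef r) {
    assert(r.obj != nullptr && r.show != nullptr);
    return r.show->show(r.obj);
}

bool equals(ShowEqRef a, ShowEqRef b) {
    assert(a.eq != nullptr);
    return a.eq->equals(a.obj, b.obj);
}
```

The trade-off compared with an embedded vtable pointer is that objects stay small and interfaces can be attached after the fact, while references become larger.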