C++ and its type system: How to deal with data with multiple types?
"Introduction"
I'm relatively new to C++. I went through all the basic stuff and managed to build 2-3 simple interpreters for my programming languages.
The first thing that gave me a headache, and still does: implementing the type system of my language in C++.
Think about it: Ruby, Python, PHP and co. have a lot of built-in types, which are obviously implemented in C.
So what I first tried was to make it possible to give a value in my language one of three possible types: Int, String and Nil.
I came up with this:
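    #include <string>

    enum ValueType
    {
        Int, String, Nil
    };

    class Value
    {
    public:
        ValueType type;        // which of the members below is meaningful
        int intVal;            // used when type == Int
        std::string stringVal; // used when type == String
    };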
Yeah, wow, I know. It was extremely slow to pass this class around, as the string allocator had to be called all the time.
Next I tried something like this:
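    #include <string>

    enum ValueType
    {
        Int, String, Nil
    };

    extern std::string stringTable[255]; // all string values live here

    class Value
    {
    public:
        ValueType type;
        int index; // position in stringTable for a String, the integer itself for an Int
    };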
I would store all strings in stringTable and write their position to index. If the type of Value was Int, I just stored the integer itself in index; it wouldn't make sense at all to use an int index to access another int, would it?
Anyway, the above gave me a headache too. After some time, accessing the string from the table here, referencing it there and copying it over there got out of hand, and I lost control. I had to put the interpreter draft down.
Now: Okay, so C and C++ are statically typed.
How do the main implementations of the languages mentioned above handle the different types in their programs (fixnums, bignums, nums, strings, arrays, resources,...)?
What should I do to get maximum speed with many different available types?
How do the solutions compare to my simplified versions above?
There are a couple of different things that you can do here. Different solutions have come up over time, and most of them require dynamic allocation of the actual datum (boost::variant can avoid using dynamically allocated memory for small objects; thanks @MSalters).
Pure C approach:
Store type information and a void pointer to memory that has to be interpreted according to the type information (usually an enum):
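A minimal sketch of that idea, written C-style but compiled as C++. All names here (Value, make_int, as_string, free_value and so on) are hypothetical and chosen only for illustration; the point is that the tag must be checked before the void pointer is cast back.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical names throughout: a sketch of the "type tag + void*" idea.
enum ValueType { TYPE_INT, TYPE_STRING, TYPE_NIL };

struct Value {
    ValueType type;  // tells us how to interpret `data`
    void*     data;  // heap-allocated payload, or a null pointer for nil
};

Value make_int(int n) {
    int* p = static_cast<int*>(std::malloc(sizeof(int)));
    assert(p != nullptr);
    *p = n;
    Value v = { TYPE_INT, p };
    return v;
}

Value make_string(const char* s) {
    char* p = static_cast<char*>(std::malloc(std::strlen(s) + 1));
    assert(p != nullptr);
    std::strcpy(p, s);
    Value v = { TYPE_STRING, p };
    return v;
}

Value make_nil() {
    Value v = { TYPE_NIL, nullptr };
    return v;
}

// Accessors check the tag before reinterpreting the pointer.
int as_int(const Value* v) {
    assert(v->type == TYPE_INT);
    return *static_cast<const int*>(v->data);
}

const char* as_string(const Value* v) {
    assert(v->type == TYPE_STRING);
    return static_cast<const char*>(v->data);
}

// The owner releases the payload exactly once; the value becomes nil afterwards.
void free_value(Value* v) {
    std::free(v->data);
    v->data = nullptr;
    v->type = TYPE_NIL;
}
```

Copying a Value here copies only the tag and the pointer, so it is cheap, but the program has to decide who owns the payload and call free_value exactly once. That bookkeeping is the part the C++ wrappers discussed next take over.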
In C++ you can improve this approach by using classes to simplify the usage, but more importantly you can go for more complex solutions and use existing libraries such as boost::any or boost::variant, which offer different solutions to the same problem.
boost::any stores the value in dynamically allocated memory, usually through a pointer to a virtual class in a hierarchy, with operators that reinterpret (downcast) to the concrete type. boost::variant, by contrast, normally keeps the value in storage inside the variant object itself and gives access to it through boost::get or a visitor.
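To show what the variant style looks like in use, here is a small sketch. It swaps in std::variant, the C++17 standard-library counterpart of boost::variant, so that it compiles without Boost; the alias Value and the helpers is_nil, to_text and add are hypothetical names, and std::monostate is assumed to stand in for Nil.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical alias: one value that is either nil, an int, or a string.
// std::monostate plays the role of Nil and is the default alternative.
using Value = std::variant<std::monostate, int, std::string>;

bool is_nil(const Value& v) {
    return std::holds_alternative<std::monostate>(v);
}

// Render a value as text by dispatching on the alternative currently held.
std::string to_text(const Value& v) {
    struct Visitor {
        std::string operator()(std::monostate) const { return "nil"; }
        std::string operator()(int n) const { return std::to_string(n); }
        std::string operator()(const std::string& s) const { return s; }
    };
    return std::visit(Visitor{}, v);
}

// Integer addition; asserts that both operands really hold an int.
Value add(const Value& a, const Value& b) {
    assert(std::holds_alternative<int>(a) && std::holds_alternative<int>(b));
    return Value(std::get<int>(a) + std::get<int>(b));
}
```

The variant object itself holds the active alternative, so an int value costs no allocation at all; a string alternative still manages its own character buffer as any std::string does.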
One obvious solution is to define a type hierarchy:
and so on. As a complete example, let us write an interpreter for a tiny language. The language allows declaring variables like this:
That will create an Int object, assign it the value 10 and store it in a variable table under the name a. Operations can be invoked on variables. For instance the addition operation on two Int values looks like:
Here is the complete code for the interpreter:
A sample interaction with the interpreter:
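To make the type-hierarchy idea concrete, here is a minimal sketch. It is not the listing from this answer: the class names (Object, Int, String, Nil), the VariableTable alias and the add function are all hypothetical, and the syntax of the tiny language is deliberately left out.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical base class: every value in the interpreted language is an Object.
class Object {
public:
    virtual ~Object() {}
    virtual std::string typeName() const = 0;
    virtual std::string toString() const = 0;
};

class Int : public Object {
public:
    explicit Int(int v) : value_(v) {}
    int value() const { return value_; }
    std::string typeName() const override { return "Int"; }
    std::string toString() const override { return std::to_string(value_); }
private:
    int value_;
};

class String : public Object {
public:
    explicit String(const std::string& v) : value_(v) {}
    std::string typeName() const override { return "String"; }
    std::string toString() const override { return value_; }
private:
    std::string value_;
};

class Nil : public Object {
public:
    std::string typeName() const override { return "Nil"; }
    std::string toString() const override { return "nil"; }
};

typedef std::shared_ptr<Object> ObjectPtr;

// Variables map a name to a shared, reference-counted object.
typedef std::map<std::string, ObjectPtr> VariableTable;

// Addition is defined only for two Int operands; a null pointer signals a type error.
ObjectPtr add(const ObjectPtr& a, const ObjectPtr& b) {
    assert(a && b);
    const Int* x = dynamic_cast<const Int*>(a.get());
    const Int* y = dynamic_cast<const Int*>(b.get());
    if (!x || !y) return ObjectPtr();
    return std::make_shared<Int>(x->value() + y->value());
}
```

With this layout, passing a value around copies only a shared_ptr, never the string data, and adding a new built-in type means adding one more subclass.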
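Regarding speed, you say: "It was extremely slow to pass this class around as the string allocator had to be called all the time."
You do know that you should be passing objects by reference the vast majority of the time? Your solution looks workable for a simple interpreter.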
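A small sketch of the difference, using the Value class from the question. The static copies counter and the two lengthBy... functions are additions made purely for illustration; they only exist to make the copies visible.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum ValueType { Int, String, Nil };

// The question's first Value class, plus a counter (added here for
// illustration) that records how often a Value is copied.
class Value {
public:
    ValueType type;
    int intVal;
    std::string stringVal;

    static int copies;

    Value() : type(Nil), intVal(0) {}
    Value(const Value& o)
        : type(o.type), intVal(o.intVal), stringVal(o.stringVal) { ++copies; }
};

int Value::copies = 0;

// Pass by value: the argument, including its string, is copied on every call.
std::size_t lengthByValue(Value v) {
    assert(v.type == String);
    return v.stringVal.size();
}

// Pass by const reference: no copy is made, so the string is not duplicated.
std::size_t lengthByRef(const Value& v) {
    assert(v.type == String);
    return v.stringVal.size();
}
```

Calling lengthByValue with a Value variable increments the counter once per call, while lengthByRef leaves it untouched, which is where the repeated string allocations in the question come from.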
C++ is a strongly typed language. I can see that you are coming from a dynamically typed language and still think in those terms.
If you really need to store several types in a variable then take a look at boost::any.
However, if you are implementing an interpreter you should be using inheritance and classes that represent a specific type.
According to Vijay's solution the implementation will be:
The bit missing from his code is HOW to extract those values... Here's my version (actually I learned this from Ogre and modified it to my liking).
Usage is something like:
Ok, now to see if a particular element is a string:
The code for the Any class is given below. Feel free to use it however you like :). Hope this helps!
In the header file... say Any.h
Now in the CPP file... Any.cpp
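For readers who want to see how a class of this kind can work, here is a minimal sketch of the usual type-erasure technique: a templated holder hidden behind a virtual base class. It is not the Any class from this answer; the member names is, get and empty are hypothetical, and for brevity everything sits in one file instead of being split into Any.h and Any.cpp.

```cpp
#include <cassert>
#include <string>
#include <typeinfo>

// Hypothetical minimal Any: holds a copy of one value of arbitrary type.
class Any {
    // Type-erased interface to the stored value.
    struct HolderBase {
        virtual ~HolderBase() {}
        virtual const std::type_info& type() const = 0;
        virtual HolderBase* clone() const = 0;
    };

    // Concrete holder that remembers the real type T.
    template <typename T>
    struct Holder : HolderBase {
        explicit Holder(const T& v) : value(v) {}
        const std::type_info& type() const { return typeid(T); }
        HolderBase* clone() const { return new Holder(value); }
        T value;
    };

    HolderBase* content_;

public:
    Any() : content_(nullptr) {}

    template <typename T>
    Any(const T& v) : content_(new Holder<T>(v)) {}

    Any(const Any& o) : content_(o.content_ ? o.content_->clone() : nullptr) {}

    Any& operator=(const Any& o) {
        if (this != &o) {
            HolderBase* copy = o.content_ ? o.content_->clone() : nullptr;
            delete content_;
            content_ = copy;
        }
        return *this;
    }

    ~Any() { delete content_; }

    bool empty() const { return content_ == nullptr; }

    // True if the stored value has exactly type T.
    template <typename T>
    bool is() const { return content_ && content_->type() == typeid(T); }

    // Extract the value; the caller is expected to check is<T>() first.
    template <typename T>
    const T& get() const {
        assert(is<T>());
        return static_cast<const Holder<T>*>(content_)->value;
    }
};
```

In this sketch, checking whether an element is a string is a call to is<std::string>() on it, followed by get<std::string>() to read the value. String literals need to be wrapped in std::string before being stored, because the templated constructor would otherwise try to hold a character array.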