What is the best way to store relations in main memory?
I am working on a mini DBMS for evaluating SPJ (select-project-join) queries. The program is being implemented in C++.
When I have to process a query for joins and group-by, I need to maintain a set of records in the main memory. Thus, I have to maintain temporary tables in main memory for executing the queries entered by the user.
My question is, what is the best way to achieve this in C++? What data structure do I need to make use of in order to achieve this?
My application stores data in binary files. Using the catalog (which contains the schema for all existing tables), I need to retrieve the data and process it.
I have only two data types in my application: int (4 bytes) and char (1 byte).
I could use std::vector. In fact, I tried a vector of vectors, with the inner vector storing the attributes. The problem is that the database can contain many relations, each with any number of attributes, and each attribute can be either an int or a char. So I cannot work out the best way to achieve this.
Edit
I cannot use a struct for the tables because I do not know how many columns exist in the newly added tables, since all tables are created at runtime as per the user query. So, a table schema cannot be stored in a struct.
Comments (2)
A Relation is a Set of Tuples (and in SQL, a Table is a Bag of Rows). Both in relational theory and in SQL, all tuples (/rows) in a relation (/table) conform to the heading.
So it makes sense for an object that stores a relation (/table) to consist of two components: an object of type "Heading" and a Set (/Bag) object containing the actual tuples (/rows).
The "Heading" object is itself a mapping of attribute (/column) names to "declared data types". I don't know C, but in Java it might be something like Map<AttributeName,TypeName> or Map<AttributeName,Type> or even Map<String,String> (provided you can use those Strings to look up the actual 'Type' objects wherever they reside).
The set of tuples (/rows) consists of members that are each a mapping of attribute (/column) names to attribute values, which in your case are either int or String. The biggest problem here is that this suggests you need something like Map<AttributeName,Object>, and you might run into trouble because your ints are not objects.
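In C++17 the heading-plus-tuples design described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the names `AttrType`, `Value`, `Heading`, `Tuple`, `Relation`, and `matches` are hypothetical and do not come from the question or the answer. `std::variant` stands in for the "int or char" value, which sidesteps the Java concern about ints not being objects.

```cpp
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// All names below are hypothetical, chosen for this sketch only.
enum class AttrType { Int, Char };

// A value is either a 4-byte int or a 1-byte char,
// the only two data types mentioned in the question.
using Value = std::variant<std::int32_t, char>;

// Heading: attribute (column) name -> declared type.
using Heading = std::map<std::string, AttrType>;

// Tuple: attribute (column) name -> value.
using Tuple = std::map<std::string, Value>;

// True when the alternative stored in v matches the declared type t.
inline bool matches(AttrType t, const Value& v) {
    return t == AttrType::Int ? std::holds_alternative<std::int32_t>(v)
                              : std::holds_alternative<char>(v);
}

class Relation {
public:
    explicit Relation(Heading h) : heading_(std::move(h)) {}

    // Append a tuple, rejecting any that does not conform to the heading.
    void insert(const Tuple& t) {
        if (t.size() != heading_.size())
            throw std::invalid_argument("wrong number of attributes");
        for (const auto& [name, type] : heading_) {
            auto it = t.find(name);
            if (it == t.end() || !matches(type, it->second))
                throw std::invalid_argument("tuple does not conform to heading: " + name);
        }
        rows_.push_back(t);
    }

    const Heading& heading() const { return heading_; }
    const std::vector<Tuple>& rows() const { return rows_; }

private:
    Heading heading_;
    std::vector<Tuple> rows_;  // a bag: duplicate rows are allowed, as in SQL
};
```

Because the heading is an ordinary runtime object, a temporary table for a join or group-by result can be created by building a new `Heading` from the catalog entries involved. Keying every tuple by attribute name keeps the sketch close to the description above but costs a map lookup per attribute; storing each row as a `std::vector<Value>` indexed by column position would be a leaner variant of the same idea.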
As a generic container for any table rows, I'd most likely use std::vector (as pointed out by Iarsmans). As for the table columns, I'd most likely define those with structs representing the table schema. For example: