How to efficiently store data with many categories and many properties?

Posted on 2024-08-24 18:50:30

We have a large amount of data in many categories, each with many properties, e.g.

category 1: Book

properties: BookID, BookName, BookType, BookAuthor, BookPrice

category 2: Fruit

properties: FruitID, FruitName, FruitShape, FruitColor, FruitPrice

We have many categories like Book and Fruit. Obviously we could create one table per category (in MySQL, for example). But that means creating too many tables, and we would have to write many "adapters" to manipulate the data in a uniform way.

The difficulties are:

1) Every category has different properties, which results in a different data structure for each.

2) The properties of any category may have to change at any time.

3) The data is hard to manipulate if each category has its own table (too many tables).

How should this kind of data be stored?

Comments (3)

慕烟庭风 2024-08-31 18:50:30

You can separate the database into two parts: Definition Tables and Data Tables. Basically, the Definition Tables are used to interpret the Data Tables, where the actual data is stored (some would say that the definition tables are more elegant if represented in XML).

The following is the basic idea.

Definition Tables:

TABLE class  
class_id (int)  
class_name (varchar)

TABLE class_property  
property_id (int)  
class_id (int)  
property_name (varchar)  
property_type (varchar)  

Data Tables:

TABLE object  
object_id (int)  
class_id (int)  

TABLE object_property  
object_id (int)  
property_id (int)  
property_value (varchar)  

It would be best to also create an additional layer that interprets the structure, so that the data layer can operate on the data more easily. And you must of course take performance, ease of querying, etc. into consideration.
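A minimal sketch of what such an interpreting layer might look like, using Python's built-in `sqlite3` module instead of MySQL so that it runs on its own. The SQLite backend, the helper function names, and the type tags in `CASTS` are assumptions of this sketch; `object_property` also needs an `object_id` column here so each value can be tied to its object.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE class (class_id INTEGER PRIMARY KEY, class_name TEXT NOT NULL);
CREATE TABLE class_property (
    property_id   INTEGER PRIMARY KEY,
    class_id      INTEGER NOT NULL REFERENCES class(class_id),
    property_name TEXT NOT NULL,
    property_type TEXT NOT NULL
);
CREATE TABLE object (
    object_id INTEGER PRIMARY KEY,
    class_id  INTEGER NOT NULL REFERENCES class(class_id)
);
CREATE TABLE object_property (
    object_id      INTEGER NOT NULL REFERENCES object(object_id),
    property_id    INTEGER NOT NULL REFERENCES class_property(property_id),
    property_value TEXT,
    PRIMARY KEY (object_id, property_id)
);
""")

# Hypothetical type tags stored in class_property.property_type.
CASTS = {"int": int, "float": float, "str": str}

def define_class(name, properties):
    """Register a category and its properties ({name: type tag})."""
    class_id = conn.execute(
        "INSERT INTO class (class_name) VALUES (?)", (name,)
    ).lastrowid
    conn.executemany(
        "INSERT INTO class_property (class_id, property_name, property_type) "
        "VALUES (?, ?, ?)",
        [(class_id, pname, ptype) for pname, ptype in properties.items()],
    )
    return class_id

def create_object(class_id, values):
    """Store one object; every value becomes one object_property row."""
    object_id = conn.execute(
        "INSERT INTO object (class_id) VALUES (?)", (class_id,)
    ).lastrowid
    for pname, value in values.items():
        row = conn.execute(
            "SELECT property_id FROM class_property "
            "WHERE class_id = ? AND property_name = ?",
            (class_id, pname),
        ).fetchone()
        if row is None:
            raise KeyError(f"unknown property: {pname}")
        conn.execute(
            "INSERT INTO object_property VALUES (?, ?, ?)",
            (object_id, row[0], str(value)),
        )
    return object_id

def load_object(object_id):
    """Rebuild a typed dict from the definition and data tables."""
    rows = conn.execute(
        "SELECT cp.property_name, cp.property_type, op.property_value "
        "FROM object_property op "
        "JOIN class_property cp ON cp.property_id = op.property_id "
        "WHERE op.object_id = ?",
        (object_id,),
    ).fetchall()
    return {name: CASTS[ptype](value) for name, ptype, value in rows}

book = define_class("Book", {"BookName": "str", "BookPrice": "float"})
oid = create_object(book, {"BookName": "Dune", "BookPrice": 9.5})
print(load_object(oid))
```

The rest of the application only calls `define_class`, `create_object`, and `load_object`, so adding a Fruit category is one more `define_class` call rather than a new table plus a new adapter.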

Just my two cents; I hope it helps.

Regards.

温暖的光 2024-08-31 18:50:30

If your data collection isn't too big, the Entity-Attribute-Value (EAV) model may fit the bill nicely.

In a nutshell, this structure lets you define categories and the list of [required or optional] attributes (aka properties) that the entities in each category include, in a set of tables known as the metadata, i.e. the logical schema of the data, if you will. The entity instances are stored in two tables, a header table and a values table, whereby each attribute is stored in a single [SQL] record of the latter table (aka "vertical" storage: what would be a single record in a traditional DBMS model is made up of several records of the values table).
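As a rough illustration of the "vertical" storage described above, the following sketch stores one Fruit entity as one header row plus three value rows, then pivots them back into a single conventional record. The table and column names (`entity`, `entity_value`, `attribute`, `value`) and the use of SQLite are assumptions made for the example, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Header table: one row per entity instance.
CREATE TABLE entity (
    entity_id INTEGER PRIMARY KEY,
    category  TEXT NOT NULL
);
-- Values table: one row per (entity, attribute) pair.
CREATE TABLE entity_value (
    entity_id INTEGER NOT NULL REFERENCES entity(entity_id),
    attribute TEXT NOT NULL,
    value     TEXT,
    PRIMARY KEY (entity_id, attribute)
);
""")

conn.execute("INSERT INTO entity VALUES (1, 'Fruit')")
conn.executemany(
    "INSERT INTO entity_value VALUES (?, ?, ?)",
    [(1, "FruitName", "Apple"), (1, "FruitColor", "Red"), (1, "FruitPrice", "0.5")],
)

# Number of value rows that make up the single logical record.
vertical_rows = conn.execute(
    "SELECT COUNT(*) FROM entity_value WHERE entity_id = 1"
).fetchone()[0]

# Pivot the vertical rows back into one "horizontal" record.
row = conn.execute("""
    SELECT e.entity_id,
           MAX(CASE WHEN v.attribute = 'FruitName'  THEN v.value END) AS FruitName,
           MAX(CASE WHEN v.attribute = 'FruitColor' THEN v.value END) AS FruitColor,
           MAX(CASE WHEN v.attribute = 'FruitPrice' THEN v.value END) AS FruitPrice
    FROM entity e
    JOIN entity_value v ON v.entity_id = e.entity_id
    WHERE e.category = 'Fruit'
    GROUP BY e.entity_id
""").fetchone()

print(vertical_rows, row)
```

Every value comes back as text because the values table has a single `value` column; converting to numbers or dates is left to the application or to the metadata tables.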

This format is very practical, in particular for its flexibility: it allows both late and ongoing changes to the logical schema (adding new categories, adding or changing the attributes of a given category, etc.), as well as implicit, data-driven handling of the underlying catalog's logical schema at the application level. The main drawbacks of this format are its [somewhat] more sophisticated and abstract implementation and, mainly, some scaling limitations once the catalog grows, say, into the range of a million or more entities.
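The flexibility argument can be sketched as follows: a new attribute is added by inserting a metadata row, without any `ALTER TABLE`, and application code that is driven by the metadata picks up the change immediately. The `attribute` table layout, the `required` flag column, the `BookISBN` attribute, and the `missing_required` helper are hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Metadata: which attributes each category has, and whether they are required.
CREATE TABLE attribute (
    category TEXT NOT NULL,
    name     TEXT NOT NULL,
    required INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (category, name)
);
""")
conn.executemany(
    "INSERT INTO attribute VALUES (?, ?, ?)",
    [("Book", "BookName", 1), ("Book", "BookPrice", 1), ("Book", "BookType", 0)],
)

def missing_required(category, values):
    """Return the required attributes of `category` absent from `values`."""
    rows = conn.execute(
        "SELECT name FROM attribute "
        "WHERE category = ? AND required = 1 ORDER BY name",
        (category,),
    )
    return [name for (name,) in rows if name not in values]

book = {"BookName": "Dune", "BookPrice": "9.5"}
before = missing_required("Book", book)

# Schema change at run time: one INSERT into the metadata, no DDL.
conn.execute("INSERT INTO attribute VALUES ('Book', 'BookISBN', 1)")
after = missing_required("Book", book)

print(before, after)
```

The validation function itself never mentions a concrete attribute name, which is the data-driven handling the answer refers to.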

See the EAV model described in more detail in this SO answer of mine.

杯别 2024-08-31 18:50:30

Prompted by this question and other similar ones, I wrote a blog post on how to handle such cases using a graph database. In short, graph databases don't have the problem of "how to force a tree/hierarchy into tables", as there's simply no need for it: you store your tree structure as it is. They're not good at everything (creating reports, for example), but this is a case where graph databases shine.
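To give a feel for "storing the tree as it is", here is a small in-memory stand-in written in plain Python. It is not the API of any graph database and is not taken from the blog post; the node ids, the `kind` property, and the helper names are all made up for illustration.

```python
# Each node carries its own free-form property dict, so Book and Fruit
# items can have different properties without any shared schema.
nodes = {}      # node id -> property dict
children = {}   # parent id -> list of child ids

def add_node(node_id, parent=None, **props):
    """Add a node and, optionally, an edge from its parent."""
    nodes[node_id] = props
    if parent is not None:
        children.setdefault(parent, []).append(node_id)

def descendants(node_id):
    """Yield every node below `node_id`, depth first."""
    for child in children.get(node_id, []):
        yield child
        yield from descendants(child)

add_node("catalog")
add_node("book", parent="catalog", kind="category")
add_node("fruit", parent="catalog", kind="category")
add_node("book-1", parent="book", BookName="Dune", BookPrice=9.5)
add_node("fruit-1", parent="fruit", FruitName="Apple", FruitColor="Red")

# All items in the catalog, regardless of category.
items = [n for n in descendants("catalog") if nodes[n].get("kind") != "category"]
print(items)
```

A real graph database would add persistence, indexing, and a query language on top of this shape, but the data model stays the same: nodes with arbitrary properties, connected by edges.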
