Schema to support dynamic properties
I'm working on an editor that enables its users to create "object" definitions in real time. A definition can contain zero or more properties. A property has a name and a type. Once a definition is created, a user can create an object of that definition and set the property values of that object.
So with the click of a mouse button, the user should, for example, be able to create a new definition called "Bicycle" and add the property "Size" of type "Numeric". Then another property called "Name" of type "Text", and then another property called "Price" of type "Numeric". Once that is done, the user should be able to create a couple of "Bicycle" objects and fill in the "Name" and "Price" property values of each bike.
Now, I've seen this feature in several software products, so it must be a well-known concept. My problem started when I sat down and tried to come up with a DB schema to support this data structure, because I want the property values to be stored using the appropriate column types. That is, a numeric property value is stored as, say, an INT in the database, and a textual property value is stored as a VARCHAR.
First, I need a table that will hold all my object definitions:
Table obj_defs
id | name |
----------------
1 | "Bicycle" |
2 | "Book" |
Then I need a table for holding what sort of properties each object definition should have:
Table prop_defs
id | obj_def_id | name | type |
------------------------------------
1 | 1 | "Size" | ? |
2 | 1 | "Name" | ? |
3 | 1 | "Price" | ? |
4 | 2 | "Title" | ? |
5 | 2 | "Author" | ? |
6 | 2 | "ISBN" | ? |
I would also need a table that holds each object:
Table objects
id | created | updated |
------------------------------
1 | 2011-05-14 | 2011-06-15 |
2 | 2011-05-14 | 2011-06-15 |
3 | 2011-05-14 | 2011-06-15 |
Finally, I need a table that will hold the actual property values of each object, and one solution is for this table to have one column for each possible value type, such as this:
Table prop_vals
id | prop_def_id | object_id | numeric | textual | boolean |
------------------------------------------------------------
1 | 1 | 1 | 27 | | |
2 | 2 | 1 | | "Trek" | |
3 | 3 | 1 | 1249 | | |
4 | 1 | 2 | 26 | | |
5 | 2 | 2 | | "GT" | |
6 | 3 | 2 | 159 | | |
7 | 4 | 3 | | "It" | |
8 | 5 | 3 | | "King" | |
9 | 6 | 3 | 9 | | |
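To make the layout concrete, here is a rough sketch of the four tables as DDL, run through Python's sqlite3 module. SQLite itself, the column names `num_val`/`text_val`/`bool_val` (standing in for numeric/textual/boolean), the UNIQUE constraints, the CHECK constraint and the `insert_value` helper are all assumptions made for illustration. The `type` column is left as plain text because what it should hold is exactly the open question.

```python
import sqlite3

# Hypothetical DDL for the four tables described above (SQLite dialect assumed).
SCHEMA = """
CREATE TABLE obj_defs (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE prop_defs (
    id         INTEGER PRIMARY KEY,
    obj_def_id INTEGER NOT NULL REFERENCES obj_defs(id),
    name       TEXT NOT NULL,
    type       TEXT NOT NULL,              -- placeholder: the open question
    UNIQUE (obj_def_id, name)
);
CREATE TABLE objects (
    id      INTEGER PRIMARY KEY,
    created TEXT NOT NULL,
    updated TEXT NOT NULL
);
CREATE TABLE prop_vals (
    id          INTEGER PRIMARY KEY,
    prop_def_id INTEGER NOT NULL REFERENCES prop_defs(id),
    object_id   INTEGER NOT NULL REFERENCES objects(id),
    num_val     INTEGER,
    text_val    TEXT,
    bool_val    INTEGER,
    UNIQUE (prop_def_id, object_id),
    -- exactly one of the typed columns must be filled in each row
    CHECK ((num_val IS NOT NULL) + (text_val IS NOT NULL)
           + (bool_val IS NOT NULL) = 1)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript(SCHEMA)
conn.executescript("""
INSERT INTO obj_defs  VALUES (1, 'Bicycle');
INSERT INTO prop_defs VALUES (1, 1, 'Size', 'numeric'), (2, 1, 'Name', 'textual');
INSERT INTO objects   VALUES (1, '2011-05-14', '2011-06-15');
INSERT INTO prop_vals (prop_def_id, object_id, num_val) VALUES (1, 1, 27);
""")


def insert_value(conn, prop_def_id, object_id, num=None, text=None, boolean=None):
    """Insert one property value; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO prop_vals "
                "(prop_def_id, object_id, num_val, text_val, bool_val) "
                "VALUES (?, ?, ?, ?, ?)",
                (prop_def_id, object_id, num, text, boolean),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Note that the CHECK only guarantees that one column is filled. It does not guarantee that it is the column matching the property's declared type, which would need a trigger or application code.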
If I implemented this schema, what would the "type" column of the prop_defs table hold? Integers that each map to a column name, or varchars that simply hold the column name? Are there any other possibilities? Would a stored procedure help me out here in some way? And what would the SQL for fetching the "name" property of object 2 look like?
Comments (2)
You are implementing what is known as the Entity-Attribute-Value (EAV) model: http://en.wikipedia.org/wiki/Entity-attribute-value_model
Lots of folks will say it's a bad idea (usually I am one of them), because the answer to your last question, "What would the SQL for fetching...", tends to be "thick, hairy, and nasty, and getting worse."
These criticisms tend to hold once you allow users to start nesting objects inside other objects. If you do not allow that, the situation will remain manageable.
For your first question, "what would the 'type' column of the prop_defs table hold", everything will be simpler if you have a table of types and descriptions that holds {"numeric", "Any Number"}, {"textual", "String"}, etc. The first value is the primary key. Then in prop_defs your "type" column is a foreign key to that table and holds the values "numeric", "textual", etc. Some will tell you, incorrectly, to always use integer keys because they JOIN faster, but if you use the values "numeric", "textual", etc., you don't have to JOIN at all, and the fastest JOIN is the one you don't do.
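A minimal sketch of that type table, assuming SQLite via Python's sqlite3 module. The table name `prop_types` and the helpers `prop_type` and `add_prop_def` are made up for illustration. Because the key is the readable value itself, reading a property's type needs no JOIN, and the foreign key still rejects unknown types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE prop_types (
    type        TEXT PRIMARY KEY,          -- natural key: 'numeric', 'textual', ...
    description TEXT NOT NULL
);
CREATE TABLE prop_defs (
    id         INTEGER PRIMARY KEY,
    obj_def_id INTEGER NOT NULL,
    name       TEXT NOT NULL,
    type       TEXT NOT NULL REFERENCES prop_types(type)
);
INSERT INTO prop_types VALUES ('numeric', 'Any Number'), ('textual', 'String');
INSERT INTO prop_defs VALUES
    (1, 1, 'Size',  'numeric'),
    (2, 1, 'Name',  'textual'),
    (3, 1, 'Price', 'numeric');
""")


def prop_type(conn, prop_def_id):
    """Read a property's type straight from prop_defs, with no JOIN."""
    row = conn.execute(
        "SELECT type FROM prop_defs WHERE id = ?", (prop_def_id,)
    ).fetchone()
    return row[0] if row else None


def add_prop_def(conn, obj_def_id, name, type_):
    """Add a property definition; return False if the type is unknown."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO prop_defs (obj_def_id, name, type) VALUES (?, ?, ?)",
                (obj_def_id, name, type_),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

You would only JOIN to `prop_types` when you actually want the description, for instance to fill a drop-down in the editor.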
The query to grab a single value will have a CASE statement:
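The query itself is not reproduced here, so what follows is only a sketch of the general shape, assuming SQLite via Python's sqlite3 module and value columns named `num_val`, `text_val` and `bool_val` (stand-ins for the numeric/textual/boolean columns in the question). The `get_prop` helper is hypothetical. Each branch is cast to text because most databases require all branches of a CASE to share one type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE prop_defs (
    id INTEGER PRIMARY KEY, obj_def_id INTEGER, name TEXT, type TEXT
);
CREATE TABLE prop_vals (
    id INTEGER PRIMARY KEY, prop_def_id INTEGER, object_id INTEGER,
    num_val INTEGER, text_val TEXT, bool_val INTEGER
);
INSERT INTO prop_defs VALUES
    (1, 1, 'Size',  'numeric'),
    (2, 1, 'Name',  'textual'),
    (3, 1, 'Price', 'numeric');
INSERT INTO prop_vals VALUES
    (1, 1, 1, 27,   NULL,   NULL),
    (2, 2, 1, NULL, 'Trek', NULL),
    (3, 3, 1, 1249, NULL,   NULL),
    (4, 1, 2, 26,   NULL,   NULL),
    (5, 2, 2, NULL, 'GT',   NULL),
    (6, 3, 2, 159,  NULL,   NULL);
""")

# The CASE picks the value column that matches the property's declared type.
FETCH_ONE = """
SELECT CASE pd.type
           WHEN 'numeric' THEN CAST(pv.num_val  AS TEXT)
           WHEN 'textual' THEN pv.text_val
           WHEN 'boolean' THEN CAST(pv.bool_val AS TEXT)
       END AS value
FROM prop_vals AS pv
JOIN prop_defs AS pd ON pd.id = pv.prop_def_id
WHERE pv.object_id = ? AND pd.name = ?
"""


def get_prop(conn, object_id, prop_name):
    """Return one property value as text, or None if it is not set."""
    row = conn.execute(FETCH_ONE, (object_id, prop_name)).fetchone()
    return row[0] if row else None


name_of_object_2 = get_prop(conn, 2, "Name")    # 'GT'
price_of_object_2 = get_prop(conn, 2, "Price")  # '159' (text, because of the CAST)
```

The caller gets everything back as text and has to convert it again based on the type, which is part of why these queries get hairy as the model grows.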
You must accept that relational databases are not good at providing this kind of functionality. They CAN provide it, but they are not good at it. (I hope I'm wrong.) Relational databases lend themselves better to defined interfaces than to changing interfaces.
-- EAV tables give you dynamic fields but suck on performance and on indexing, and they are complex to query. They get the job done in many situations, but can fall apart on big tables with lots of users hitting the system.
-- "Regular" tables with several placeholder columns are OK for performance, but you get non-descriptive column names and are limited in the number of columns you can "add". They also do not support sub-type separation.
-- Typically you create or modify tables at development time, not at run time. Should we really discriminate against modifying the database at run time? Maybe, maybe not. Creating new tables, foreign keys, and columns at run time can achieve truly dynamic objects while giving the performance benefits of "regular" tables. But you would have to query the schema of the database and then dynamically generate all of your queries. That would suck. It would totally break the concept of tables as an interface.
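To give a feel for the third option, here is a rough sketch of run-time DDL, assuming SQLite via Python's sqlite3 module. All the helper names (`quote_ident`, `create_definition`, `add_property`, `columns`, `fetch_object`) and the `SQL_TYPES` mapping are invented for illustration. It shows both halves of the trade-off: real typed columns, and queries that have to be generated from the schema.

```python
import re
import sqlite3

IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
# Hypothetical mapping from the editor's property types to SQLite column types.
SQL_TYPES = {"numeric": "INTEGER", "textual": "TEXT", "boolean": "INTEGER"}


def quote_ident(name):
    """Identifiers cannot be bound as parameters, so validate and quote them."""
    if not IDENT.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return f'"{name}"'


def create_definition(conn, def_name, props):
    """Create one real table per object definition, at run time."""
    cols = ", ".join(f"{quote_ident(n)} {SQL_TYPES[t]}" for n, t in props)
    conn.execute(
        f"CREATE TABLE {quote_ident(def_name)} (id INTEGER PRIMARY KEY, {cols})"
    )


def add_property(conn, def_name, prop_name, prop_type):
    """Adding a property later means altering the table."""
    conn.execute(
        f"ALTER TABLE {quote_ident(def_name)} "
        f"ADD COLUMN {quote_ident(prop_name)} {SQL_TYPES[prop_type]}"
    )


def columns(conn, def_name):
    """Query the schema; table_info rows are (cid, name, type, notnull, dflt, pk)."""
    rows = conn.execute(f"PRAGMA table_info({quote_ident(def_name)})")
    return [row[1] for row in rows]


def fetch_object(conn, def_name, object_id):
    """Generate the SELECT from the current schema, then run it."""
    cols = columns(conn, def_name)
    col_list = ", ".join(quote_ident(c) for c in cols)
    row = conn.execute(
        f"SELECT {col_list} FROM {quote_ident(def_name)} WHERE id = ?",
        (object_id,),
    ).fetchone()
    return dict(zip(cols, row)) if row else None


conn = sqlite3.connect(":memory:")
create_definition(
    conn, "Bicycle",
    [("Size", "numeric"), ("Name", "textual"), ("Price", "numeric")],
)
conn.execute(
    'INSERT INTO "Bicycle" ("Size", "Name", "Price") VALUES (?, ?, ?)',
    (26, "GT", 159),
)
add_property(conn, "Bicycle", "Electric", "boolean")
bike = fetch_object(conn, "Bicycle", 1)
```

Every statement here is built by string formatting, so identifier validation becomes your job, and nothing in the application can rely on a fixed set of columns. That is the "tables as an interface" problem in practice.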