I know this is a super old post, and much has changed in the last 11 years, but I thought I would add this as it might be helpful to future readers. One of the reasons my co-founders and I created HarperDB was to natively support dynamic schema in a single, unduplicated data set while providing full index capability. You can read more about it here: https://harperdb.io/blog/dynamic-schema-the-harperdb-way/
ElasticSearch. You should consider it especially if you're dealing with datasets that you can partition by date, you can use JSON for your data, and are not fixed on using SQL for retrieving the data.
ES infers the schema for any new JSON fields you send: automatically, with hints, or manually through "mappings", which you can define or change with a single HTTP command.
Although it does not support SQL, it has some great lookup capabilities and even aggregations.
In the past I've chosen option C: creating a 'long, narrow' table that stores dynamic column values as rows, which then need to be pivoted to create a 'short, wide' rowset containing all the values for a specific entity. However, I was using an ORM, and that REALLY made things painful. I can't think of how you'd do it in, say, LinqToSql. I guess I'd have to create a Hashtable to reference the fields.
@Skliwz: I'm guessing he's more interested in allowing users to create user-defined fields.
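For what it's worth, the pivot from 'long, narrow' rows to one 'short, wide' record per entity can be sketched in a few lines of Python. This is only an illustration: the `(entity_id, field_name, value)` row shape and the sample data are assumptions, not anything from the original setup.

```python
from collections import defaultdict

# Hypothetical 'long, narrow' rows: (entity_id, field_name, value).
rows = [
    (1, "name", "Alice"),
    (1, "phone", "555-0100"),
    (2, "name", "Bob"),
    (2, "favorite_color", "green"),
]


def pivot(long_rows):
    """Group (entity_id, field, value) rows into one dict per entity."""
    wide = defaultdict(dict)
    for entity_id, field, value in long_rows:
        wide[entity_id][field] = value
    return dict(wide)


wide_rows = pivot(rows)
print(wide_rows[1])
print(wide_rows[2].get("phone"))  # a field the entity never received
```

The dict per entity plays the role of the Hashtable mentioned above: fields are looked up by name, and a field that was never stored simply comes back as `None`.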
Over at the c2.com wiki, the idea of "Dynamic Relational" was explored. You DON'T need a DBA: columns and tables are Create-On-Write, unless you start adding constraints to make it act more like a traditional RDBMS: as a project matures, you can incrementally "lock it down".
Conceptually you can think of each row as an XML statement. For example, an employee record could be represented as:
This does not imply it has to be implemented as XML, it's just a handy conceptualization. If you ask for a non-existing column, such as "SELECT madeUpColumn ...", it's treated as blank or null (unless added constraints forbid such). And it's possible to use SQL, although one has to be careful about comparisons because of the implied type model. But other than type handling, users of a Dynamic Relational system would feel right at home because they can leverage most of their existing RDBMS knowledge. Now, if somebody would just build it...
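To make the behaviour a bit more concrete, here is a toy in-memory sketch in Python of the two properties described above: columns are created on write, and asking for a column that does not exist yields null. The `DynamicTable` class and its `required` set are made up for illustration; this is not how the c2.com proposal specifies an implementation.

```python
class DynamicTable:
    """Toy create-on-write table: rows are dicts, columns appear when first written."""

    def __init__(self):
        self.rows = []
        self.required = set()  # optional constraints, added as the project matures

    def insert(self, **values):
        missing = self.required - set(values)
        if missing:
            raise ValueError(f"missing required columns: {sorted(missing)}")
        self.rows.append(dict(values))

    def select(self, *columns):
        # A column that a row never received is returned as None (null).
        return [tuple(row.get(c) for c in columns) for row in self.rows]


employees = DynamicTable()
employees.insert(name="Ann", salary="50000")
employees.insert(name="Raj", title="DBA")  # 'title' is created on write
result = employees.select("name", "madeUpColumn")
print(result)

employees.required.add("name")  # incrementally "lock it down"
```

Once `name` is in `required`, an insert without it raises `ValueError`, which is the "add constraints later" idea in miniature.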
I know it's an old topic, but I guess it never stops being relevant.
I'm developing something like that right now.
Here is my approach.
I use a server setup with MySQL, Apache, PHP, and Zend Framework 2 as the application framework, but it should work just as well with any other setup.
Here is a simple implementation guide; you can evolve it further yourself from here.
You would need to implement your own query language interpreter, because the effective SQL would be too complicated.
Example:
select id, password from user where email_address = "[email protected]"
The physical database layout:
Table 'specs': (should be cached in your data access layer)
The translation of the example in our own query language:
select id, password from user where email_address = "[email protected]"
to standard SQL would look like this:
select
parent_id, -- user id
data -- password
from
items
where
spec_id = 3 -- make sure this is a 'password' item
and
parent_id in
( -- get the 'user' item to which this 'password' item belongs
select
id
from
items
where
spec_id = 1 -- make sure this is a 'user' item
and
id in
( -- fetch all item id's with the desired 'email_address' child item
select
parent_id -- id of the parent item of the 'email_address' item
from
items
where
spec_id = 2 -- make sure this is a 'email_address' item
and
data = "[email protected]" -- with the desired data value
)
)
You will need to have the specs table cached in an associative array or hashtable or something similar to get the spec_id's from the spec names. Otherwise you would need to insert some more SQL overhead to get the spec_id's from the names, like in this snippet:
Bad example, don't use this, avoid this, cache the specs table instead!
select
parent_id,
data
from
items
where
spec_id = (select id from specs where name = "password")
and
parent_id in (
select
id
from
items
where
spec_id = (select id from specs where name = "user")
and
id in (
select
parent_id
from
items
where
spec_id = (select id from specs where name = "email_address")
and
data = "[email protected]"
)
)
I hope you get the idea and can determine for yourself whether that approach is feasible for you.
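If you want to try the idea quickly, the following Python sketch runs the nested query against an in-memory SQLite database with the specs table cached in a dict. The column layout (`id`, `parent_id`, `spec_id`, `data` for items; `id`, `name` for specs) is inferred from the queries above, and the sample rows and the e-mail address are placeholders I made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE specs (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE items (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER,          -- NULL for top-level items
        spec_id   INTEGER NOT NULL REFERENCES specs(id),
        data      TEXT
    );
    INSERT INTO specs (id, name) VALUES
        (1, 'user'), (2, 'email_address'), (3, 'password');
    -- one 'user' item (id 10) with two child items
    INSERT INTO items (id, parent_id, spec_id, data) VALUES
        (10, NULL, 1, NULL),
        (11, 10,   2, 'alice@example.com'),
        (12, 10,   3, 'hashed-secret');
""")

# Cache the specs table once, so queries can use spec ids directly.
spec_ids = {name: spec_id
            for spec_id, name in conn.execute("SELECT id, name FROM specs")}

QUERY = """
    SELECT parent_id, data FROM items
    WHERE spec_id = :password
      AND parent_id IN (
          SELECT id FROM items
          WHERE spec_id = :user
            AND id IN (
                SELECT parent_id FROM items
                WHERE spec_id = :email_address AND data = :email
            )
      )
"""
params = {**spec_ids, "email": "alice@example.com"}
found = conn.execute(QUERY, params).fetchall()
print(found)
```

The cached `spec_ids` dict is bound as named parameters, so the SQL never has to join against `specs` at query time.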
Sounds to me like what you really want is some sort of "meta-schema", a database schema which is capable of describing a flexible schema for storing the actual data. Dynamic schema changes are touchy and not something you want to mess with, especially not if users are allowed to make the change.
You're not going to find a database which is more suited to this task than any other, so your best bet is just to select one based on other criteria. For example, what platform are you using to host the DB? What language is the app written in? etc
To clarify what I mean by "meta-schema":
CREATE TABLE data (
    id INTEGER NOT NULL AUTO_INCREMENT,
    `key` VARCHAR(255),  -- quoted because KEY is a reserved word in MySQL
    data TEXT,
    PRIMARY KEY (id)
);
This is a very simple example; you would likely have something more specific to your needs (and hopefully a little easier to work with), but it does serve to illustrate my point. You should consider the database schema itself to be immutable at the application level; any structural changes should be reflected in the data (that is, the instantiation of that schema).
I did it once in a real project:
The database consisted of one table with one field, which was an array of 50 elements. It had a 'word' index set on it. All the data was typeless, so the 'word index' worked as expected. Numeric fields were represented as characters, and the actual sorting was done on the client side. (It is still possible to have several array fields, one for each data type, if needed.)
The logical data schema for the logical tables was held within the same database, under a different table row 'type' (the first array element). It also supported simple versioning in copy-on-write style using the same 'type' field.
Advantages:
- You can rearrange and add/delete your columns dynamically, with no need for a dump/reload of the database. Any new column data may be set to an initial value (virtually) in zero time.
- Fragmentation is minimal, since all records and tables are the same size; sometimes this gives better performance.
- All table schema is virtual. Any logical schema structure is possible (even recursive or object-oriented).
- It is good for "write-once, read-mostly, no-delete/mark-as-deleted" data (most Web apps actually are like that).
Disadvantages:
- Indexing is only by full words, no abbreviations.
- Complex queries are possible, but with slight performance degradation.
- It depends on whether your preferred database system supports arrays and word indexes (it was implemented in PROGRESS RDBMS).
- The relational model exists only in the programmer's mind (i.e. only at run-time).
And now I'm thinking the next step could be to implement such a database at the file system level. That might be relatively easy.
The whole point of having a relational DB is keeping your data safe and consistent. The moment you allow users to alter the schema, there goes your data integrity...
If your need is to store heterogeneous data, for example like a CMS scenario, I would suggest storing XML validated by an XSD in a row. Of course you lose performance and easy search capabilities, but it's a good trade off IMHO.
Since it's 2016, forget XML! Use JSON to store the non-relational data bag, with an appropriately typed column as backend. You shouldn't normally need to query by value inside the bag, which will be slow even though many contemporary SQL databases understand JSON natively.
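As a rough sketch of the "JSON bag next to relational columns" idea, here is a small Python example using the standard `sqlite3` and `json` modules. The `page` table, its columns and the helper names are hypothetical; the bag is stored as plain text and only parsed after the row is fetched, so nothing here queries by value inside it.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE page (
        id     INTEGER PRIMARY KEY,
        title  TEXT NOT NULL,   -- relational, queryable column
        extras TEXT NOT NULL    -- JSON bag for the free-form fields
    )
""")


def save_page(title, **extras):
    """Store the fixed column normally and everything else as one JSON document."""
    cur = conn.execute(
        "INSERT INTO page (title, extras) VALUES (?, ?)",
        (title, json.dumps(extras)),
    )
    return cur.lastrowid


def load_page(page_id):
    """Fetch by primary key, then unpack the bag in application code."""
    title, extras = conn.execute(
        "SELECT title, extras FROM page WHERE id = ?", (page_id,)
    ).fetchone()
    return {"title": title, **json.loads(extras)}


page_id = save_page("Home", hero_image="banner.png", tags=["intro", "news"])
page = load_page(page_id)
print(page)
```

Lookups go through the typed columns (`id`, `title`); the bag just rides along with the row.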
Like some others have said, don't do this unless you have no other choice. One case where this is required is if you are selling an off-the-shelf product that must allow users to record custom data. My company's product falls into this category.
If you do need to allow your customers to do this, here are a few tips:
- Create a robust administrative tool to perform the schema changes, and do not allow these changes to be made any other way.
- Make it an administrative feature; don't allow normal users to access it.
- Log every detail about every schema change. This will help you debug problems, and it will also give you CYA data if a customer does something stupid.
If you can do those things successfully (especially the first one), then any of the architectures you mentioned will work. My preference is to dynamically change the database objects, because that allows you to take advantage of your DBMS's query features when you access the data stored in the custom fields. The other three options require you load large chunks of data and then do most of your data processing in code.
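A minimal sketch of the first and third tips (one sanctioned path for schema changes, with every change logged) might look like this in Python with SQLite. The table names, the `add_custom_field` helper, the identifier rule and the allowed type list are all assumptions for illustration; a real admin tool would also need authentication and whatever your DBMS requires for safe DDL.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE schema_change_log (
        id         INTEGER PRIMARY KEY,
        changed_at TEXT DEFAULT CURRENT_TIMESTAMP,
        admin      TEXT NOT NULL,
        statement  TEXT NOT NULL
    );
""")

ALLOWED_TYPES = {"TEXT", "INTEGER", "REAL"}
IDENTIFIER = re.compile(r"[a-z][a-z0-9_]{0,29}")


def add_custom_field(admin, table, column, sql_type):
    """Add a column through the one sanctioned path and log the change."""
    # Identifiers cannot be bound as parameters, so validate them strictly.
    if not (IDENTIFIER.fullmatch(table) and IDENTIFIER.fullmatch(column)):
        raise ValueError("invalid identifier")
    if sql_type not in ALLOWED_TYPES:
        raise ValueError("unsupported column type")
    statement = f'ALTER TABLE "{table}" ADD COLUMN "{column}" {sql_type}'
    conn.execute(statement)
    conn.execute(
        "INSERT INTO schema_change_log (admin, statement) VALUES (?, ?)",
        (admin, statement),
    )
    conn.commit()


add_custom_field("root", "customer", "loyalty_tier", "TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]
print(columns)
```

Because the custom field becomes a real column, it can be filtered, indexed and joined like any other, which is the benefit described above.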
I have a similar requirement and decided to use the schema-less MongoDB.
MongoDB (from "humongous") is an open source, scalable, high-performance, schema-free, document-oriented database written in the C++ programming language. (Wikipedia)
Highlights:
- has rich query functionality (maybe the closest to SQL DBs)
- production ready (foursquare, sourceforge use it)
Lowdarks (stuff you need to understand, so you can use mongo correctly):
- no transactions (actually it has transactions but only on atomic operations)
SQL already provides a way to change your schema: the ALTER command.
Simply keep a table that lists the fields users are not allowed to change, and write a nice interface around ALTER.
Create 2 databases
I believe the EAV approach is the best approach, but it comes with a heavy cost.
I know that models indicated in the question are used in production systems all over. A rather large one is in use at a large university/teaching institution that I work for. They specifically use the long narrow table approach to map data gathered by many varied data acquisition systems.
Also, Google recently released their internal data sharing protocol, protocol buffer, as open source via their code site. A database system modeled on this approach would be quite interesting.
Check the following:
Entity-attribute-value model
Google Protocol Buffer
A strongly typed xml field in MSSQL has worked for us.
What you are proposing is not new. Plenty of people have tried it... most have found that they chase "infinite" flexibility and instead end up with much, much less than that. It's the "roach motel" of database designs -- data goes in, but it's almost impossible to get it out. Try and conceptualize writing the code for ANY sort of constraint and you'll see what I mean.
The end result typically is a system that is MUCH more difficult to debug and maintain, and full of data consistency problems. This is not always the case, but more often than not, that is how it ends up, mostly because the programmer(s) don't see this train wreck coming and fail to code defensively against it. It also often turns out that the "infinite" flexibility really isn't that necessary; it's a very bad "smell" when the dev team gets a spec that says "Gosh I have no clue what sort of data they are going to put here, so let 'em put WHATEVER"... and the end users are just fine having pre-defined attribute types that they can use (code up a generic phone #, and let them create any # of them -- this is trivial in a nicely normalized system and maintains flexibility and integrity!)
If you have a very good development team and are intimately aware of the problems you'll have to overcome with this design, you can successfully code up a well designed, not terribly buggy system. Most of the time.
Why start out with the odds stacked so much against you, though?
Don't believe me? Google "One True Lookup Table" or "single table design". Some good results:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:10678084117056
http://thedailywtf.com/Comments/Tom_Kyte_on_The_Ultimate_Extensibility.aspx?pg=3
http://www.dbazine.com/ofinterest/oi-articles/celko22
http://thedailywtf.com/Comments/The_Inner-Platform_Effect.aspx?pg=2