
Dynamic database schema

Posted 2024-07-05 23:02:58, 1721 characters, 12 views, 0 comments

Closed. This question is opinion-based. It is not currently accepting answers.

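Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.

Closed 3 years ago.

The community reviewed whether to reopen this question 4 months ago and left it closed:

Original close reason(s) were not resolved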


夜清冷一曲。 2024-07-12 23:02:59

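I know this is a very old post and a lot has changed in the past 11 years, but I thought I would add this since it might help future readers. One of the reasons my co-founder and I created HarperDB was to accomplish dynamic schema natively in a single, unduplicated data set while providing full indexing capability. You can read more about it here:
https://harperdb.io/blog/dynamic-schema-the-harperdb-way/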


小红帽 2024-07-12 23:02:59

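SQL already provides a way to change your schema: the ALTER command.

Simply keep a table that lists the fields users are not allowed to change, and write a nice interface for ALTER.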
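A minimal sketch of that idea, using Python's built-in sqlite3 module. The table names `customer` and `protected_fields` and the two helper functions are hypothetical, made up for illustration; the answer itself does not prescribe any of them.

```python
import re
import sqlite3

# Hypothetical schema: `customer` is the user-extensible table and
# `protected_fields` lists the columns users may not touch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE protected_fields (table_name TEXT, column_name TEXT)")
conn.executemany(
    "INSERT INTO protected_fields VALUES (?, ?)",
    [("customer", "id"), ("customer", "name")],
)

IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
ALLOWED_TYPES = {"TEXT", "INTEGER", "REAL"}


def check_identifier(name):
    # Identifiers cannot be bound as SQL parameters, so validate them strictly.
    if not IDENTIFIER.match(name):
        raise ValueError(f"invalid identifier: {name!r}")


def is_protected(table, column):
    row = conn.execute(
        "SELECT 1 FROM protected_fields WHERE table_name = ? AND column_name = ?",
        (table, column),
    ).fetchone()
    return row is not None


def add_user_column(table, column, sql_type):
    """Add a user-defined column through ALTER TABLE."""
    check_identifier(table)
    check_identifier(column)
    if sql_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported type: {sql_type!r}")
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {sql_type}")


def rename_user_column(table, old, new):
    """Rename a column unless it is on the protected list."""
    for name in (table, old, new):
        check_identifier(name)
    if is_protected(table, old):
        raise PermissionError(f"{table}.{old} is protected and cannot be changed")
    conn.execute(f"ALTER TABLE {table} RENAME COLUMN {old} TO {new}")


add_user_column("customer", "nickname", "TEXT")
try:
    rename_user_column("customer", "name", "full_name")
except PermissionError as exc:
    print("refused:", exc)
```

The protected list is consulted before any ALTER statement that modifies an existing column, so the fixed part of the schema stays under your control while users can still add their own fields.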


一影成城 2024-07-12 23:02:59

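ElasticSearch. You should consider it especially if you are dealing with datasets that you can partition by date, you can use JSON for your data, and you are not fixed on using SQL to retrieve the data.

ES infers your schema for any new JSON fields you send, either automatically (with hints) or manually, and you can define/change it with a single HTTP command ("mappings").
Although it does not support SQL, it has some great lookup capabilities and even aggregations.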


书信已泛黄 2024-07-12 23:02:59

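In the past I've chosen option C: creating a "long, narrow" table that stores dynamic column values as rows, which then need to be pivoted to create a "short, wide" rowset containing all the values for a specific entity. However, I was using an ORM, and that REALLY made things painful. I can't think of how you'd do it in, say, LinqToSql. I guess I'd have to create a Hashtable to reference the fields.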
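A rough sketch of the "long, narrow" layout and the pivot step, using Python's sqlite3 module. The table name `entity_values`, its columns, and the sample data are all assumptions made for illustration.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
# Long, narrow table: one row per (entity, field) pair.
conn.execute(
    "CREATE TABLE entity_values (entity_id INTEGER, field TEXT, value TEXT)"
)
conn.executemany(
    "INSERT INTO entity_values VALUES (?, ?, ?)",
    [
        (1, "name", "Alice"),
        (1, "phone", "555-0100"),
        (2, "name", "Bob"),
        (2, "favorite_color", "green"),
    ],
)


def pivot(rows):
    """Turn (entity_id, field, value) rows into one dict per entity."""
    entities = defaultdict(dict)
    for entity_id, field, value in rows:
        entities[entity_id][field] = value
    return dict(entities)


wide = pivot(
    conn.execute(
        "SELECT entity_id, field, value FROM entity_values ORDER BY entity_id"
    )
)
print(wide[1])  # {'name': 'Alice', 'phone': '555-0100'}
```

Each entity ends up as a dictionary keyed by field name, which is essentially the hashtable mentioned above; a field an entity never stored is simply absent from its dictionary.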

@Skliwz: I'm guessing he's more interested in allowing users to create user-defined fields.


感情废物 2024-07-12 23:02:59

Over at the c2.com wiki, the idea of "Dynamic Relational" was explored. You DON'T need a DBA: columns and tables are Create-On-Write, unless you start adding constraints to make it act more like a traditional RDBMS: as a project matures, you can incrementally "lock it down".

Conceptually you can think of each row as an XML statement. For example, an employee record could be represented as:

<employee lastname="Li" firstname="Joe" salary="120000" id="318"/>

This does not imply it has to be implemented as XML; it's just a handy conceptualization. If you ask for a non-existing column, such as "SELECT madeUpColumn ...", it's treated as blank or null (unless added constraints forbid it). And it's possible to use SQL, although one has to be careful about comparisons because of the implied type model. But other than type handling, users of a Dynamic Relational system would feel right at home because they can leverage most of their existing RDBMS knowledge. Now, if somebody would just build it...
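To make the idea concrete, here is a toy in-memory sketch in Python. It is not an existing product or the c2.com design itself; the class `DynamicTable` and its methods are invented purely to illustrate create-on-write columns, null for unknown columns, and incremental lock-down.

```python
class DynamicTable:
    """Toy create-on-write table: rows are dicts, columns appear when written."""

    def __init__(self):
        self.rows = []
        self.required = set()  # constraints added later, as the project matures

    def require(self, column):
        """Incrementally 'lock down' the table: new rows must set `column`."""
        self.required.add(column)

    def insert(self, **values):
        missing = self.required - set(values)
        if missing:
            raise ValueError(f"missing required columns: {sorted(missing)}")
        self.rows.append(dict(values))

    def select(self, *columns):
        # A column that a row never set reads back as None, like a SQL NULL.
        return [tuple(row.get(c) for c in columns) for row in self.rows]


employees = DynamicTable()
# Values are kept as strings, mirroring the XML-attribute view of a row.
employees.insert(lastname="Li", firstname="Joe", salary="120000", id="318")
print(employees.select("lastname", "madeUpColumn"))  # [('Li', None)]

# Later on, tighten the rules so every new row needs an id.
employees.require("id")
```

Because every value here is a string, comparing salaries would be a text comparison unless you convert first, which is the "implied type model" caveat mentioned above.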


梦里寻她 2024-07-12 23:02:59

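Create 2 databases

DB1 contains the static tables and represents the "real" state of the data.
DB2 is free for users to do with as they wish; they (or you) will have to write code to populate their odd-shaped tables from DB1.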


债姬 2024-07-12 23:02:59

I think the EAV approach is the best approach, but it comes at a heavy cost.


梦萦几度 2024-07-12 23:02:59

I know it's an old topic, but I guess it never loses its relevance.
I'm developing something like that right now.
Here is my approach.
My server setup uses MySQL, Apache, PHP, and Zend Framework 2 as the application framework, but it should work just as well with any other setup.

Here is a simple implementation guide; you can evolve it further yourself from there.

You would need to implement your own query language interpreter, because the effective SQL would be too complicated.

Example:

select id, password from user where email_address = "[email protected]"

The physical database layout:

Table 'specs': (should be cached in your data access layer)

id: int
parent_id: int
name: varchar(255)

Table 'items':

id: int
parent_id: int
spec_id: int
data: varchar(20000)

Contents of table 'specs':

1, 0, 'user'
2, 1, 'email_address'
3, 1, 'password'

Contents of table 'items':

1, 0, 1, ''
2, 1, 2, '[email protected]'
3, 1, 3, 'my password'

The translation of the example in our own query language:

select id, password from user where email_address = "[email protected]"

to standard SQL would look like this:

select 
    parent_id, -- user id
    data -- password
from 
    items 
where 
    spec_id = 3 -- make sure this is a 'password' item
    and 
    parent_id in 
    ( -- get the 'user' item to which this 'password' item belongs
        select 
            id 
        from 
            items 
        where 
            spec_id = 1 -- make sure this is a 'user' item
            and 
            id in 
            ( -- fetch all item id's with the desired 'email_address' child item
                select 
                    parent_id -- id of the parent item of the 'email_address' item
                from 
                    items 
                where 
                    spec_id = 2 -- make sure this is an 'email_address' item
                    and
                    data = '[email protected]' -- with the desired data value
            )
    )
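To check that the translated query behaves as described, here is a small runnable sketch using Python's sqlite3 module instead of the MySQL/PHP setup from this answer. The address `joe@example.com` is a placeholder I made up, because the real address is redacted in the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE specs (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
    CREATE TABLE items (
        id INTEGER PRIMARY KEY, parent_id INTEGER, spec_id INTEGER, data TEXT
    );
    INSERT INTO specs VALUES
        (1, 0, 'user'), (2, 1, 'email_address'), (3, 1, 'password');
    -- 'joe@example.com' is a placeholder address, not the one from the post.
    INSERT INTO items VALUES
        (1, 0, 1, ''), (2, 1, 2, 'joe@example.com'), (3, 1, 3, 'my password');
    """
)

QUERY = """
SELECT parent_id, data              -- user id, password
FROM items
WHERE spec_id = 3                   -- 'password' items only
  AND parent_id IN (
      SELECT id FROM items
      WHERE spec_id = 1             -- 'user' items only
        AND id IN (
            SELECT parent_id FROM items
            WHERE spec_id = 2       -- 'email_address' items only
              AND data = ?          -- the address to look up
        )
  )
"""

rows = conn.execute(QUERY, ("joe@example.com",)).fetchall()
print(rows)  # [(1, 'my password')]
```

The address is passed as a bound parameter rather than pasted into the SQL text, which a real interpreter for the custom query language would also want to do.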

You will need to have the specs table cached in an associative array or hashtable or something similar to get the spec_id's from the spec names. Otherwise you would need to insert some more SQL overhead to get the spec_id's from the names, like in this snippet:

Bad example, don't use this, avoid this, cache the specs table instead!

select 
    parent_id, 
    data 
from 
    items 
where 
    spec_id = (select id from specs where name = 'password') 
    and 
    parent_id in (
        select 
            id 
        from 
            items 
        where 
            spec_id = (select id from specs where name = 'user') 
            and 
            id in (
                select 
                    parent_id 
                from 
                    items 
                where 
                    spec_id = (select id from specs where name = 'email_address') 
                    and 
                    data = '[email protected]'
            )
    )
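For contrast with the bad example, the cached lookup recommended above could look roughly like this. It is a sketch in Python with sqlite3 rather than the PHP data access layer from this answer; `SPEC_IDS`, `find_field`, and the placeholder address `joe@example.com` are my own assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE specs (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
    CREATE TABLE items (
        id INTEGER PRIMARY KEY, parent_id INTEGER, spec_id INTEGER, data TEXT
    );
    INSERT INTO specs VALUES
        (1, 0, 'user'), (2, 1, 'email_address'), (3, 1, 'password');
    INSERT INTO items VALUES
        (1, 0, 1, ''), (2, 1, 2, 'joe@example.com'), (3, 1, 3, 'my password');
    """
)

# Load the specs table once and keep it in a dict keyed by name.
# Simplification: real specs with the same name under different parents
# would need a (parent_id, name) key instead.
SPEC_IDS = {
    name: spec_id for spec_id, name in conn.execute("SELECT id, name FROM specs")
}


def find_field(entity, where_field, where_value, select_field):
    """Return (entity_id, value) pairs of `select_field` for every `entity`
    whose `where_field` equals `where_value`."""
    sql = """
        SELECT parent_id, data FROM items
        WHERE spec_id = ?
          AND parent_id IN (
              SELECT id FROM items
              WHERE spec_id = ?
                AND id IN (
                    SELECT parent_id FROM items
                    WHERE spec_id = ? AND data = ?
                )
          )
    """
    params = (
        SPEC_IDS[select_field],
        SPEC_IDS[entity],
        SPEC_IDS[where_field],
        where_value,
    )
    return conn.execute(sql, params).fetchall()


print(find_field("user", "email_address", "joe@example.com", "password"))
```

The spec names are resolved in memory, so the SQL sent to the database contains only integer ids and no extra subqueries against `specs`.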

I hope you get the idea and can determine for yourself whether that approach is feasible for you.

Enjoy! :-)


你与昨日 2024-07-12 23:02:59

Sounds to me like what you really want is some sort of "meta-schema", a database schema which is capable of describing a flexible schema for storing the actual data. Dynamic schema changes are touchy and not something you want to mess with, especially not if users are allowed to make the change.

You're not going to find a database which is more suited to this task than any other, so your best bet is just to select one based on other criteria. For example, what platform are you using to host the DB? What language is the app written in? And so on.

To clarify what I mean by "meta-schema":

CREATE TABLE data (
    id INTEGER NOT NULL AUTO_INCREMENT,
    `key` VARCHAR(255), -- KEY is a reserved word in MySQL, so it has to be quoted
    data TEXT,

    PRIMARY KEY (id)
);

This is a very simple example; you would likely have something more specific to your needs (and hopefully a little easier to work with), but it does serve to illustrate my point. You should consider the database schema itself to be immutable at the application level; any structural changes should be reflected in the data (that is, the instantiation of that schema).


衣神在巴黎 2024-07-12 23:02:59

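I know that the models indicated in the question are widely used in production systems. A rather large one is in use at a large university/teaching institution that I work for. They specifically use the long narrow table approach to map data gathered by many varied data acquisition systems.

Also, Google recently released their internal data sharing protocol, protocol buffers, as open source via their code site. A database system modeled on this approach would be quite interesting.

Check the following:

Entity-attribute-value model

Google Protocol Buffers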


爱给你人给你 2024-07-12 23:02:58

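I did this once in a real project:

The database consisted of one table with one field, which was an array of 50 elements. It had a "word" index set on it. All the data was typeless, so the "word index" worked as expected. Numeric fields were represented as characters and the actual sorting was done on the client side. (It is still possible to have several array fields, one per data type, if needed.)

The logical data schema for the logical tables was held in the same database, using a different table row "type" (the first array element). It also supported simple versioning in copy-on-write style using the same "type" field.

Advantages:

- You can rearrange and add/delete columns on the fly, with no need to dump/reload the database. Any new column's data can be set to an initial value in (virtually) zero time.
- Fragmentation is minimal, since all records and tables are the same size, which sometimes gives better performance.
- All table schemas are virtual. Any logical schema structure is possible (even recursive or object-oriented).
- It is good for "write-once, read-mostly, no-delete/mark-as-deleted" data (most Web apps actually are like that).

Disadvantages:

- Indexing is by full words only, no abbreviations.
- Complex queries are possible, but with a slight performance degradation.
- It depends on whether your preferred database system supports arrays and word indexes (it was implemented in PROGRESS RDBMS).
- The relational model exists only in the programmer's mind (i.e. only at run time).

Now I think the next step could be to implement such a database at the file system level. That might be relatively easy.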


很酷不放纵 2024-07-12 23:02:58

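The whole point of having a relational DB is keeping your data safe and consistent. The moment you allow users to alter the schema, there goes your data integrity...

If you need to store heterogeneous data, for example in a CMS scenario, I would suggest storing XML validated by an XSD in a row. Of course you lose performance and easy search capabilities, but IMHO it's a good trade-off.

Since it's now 2016, forget XML! Use JSON to store the non-relational data bag, with an appropriately typed column as the backend. You shouldn't normally need to query by a value inside the bag, which will be slow even though many contemporary SQL databases understand JSON natively.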
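A small sketch of that split between typed columns and a JSON bag, using Python's sqlite3 and json modules. The `article` table, its columns, and the helper names are hypothetical; a database with a native JSON column type could replace the TEXT column used here.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Relational, typed columns for what you query on; one JSON text column
# for the free-form "bag" of everything else.
conn.execute(
    """
    CREATE TABLE article (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        published_on TEXT NOT NULL,
        extras TEXT NOT NULL DEFAULT '{}'
    )
    """
)


def save_article(title, published_on, **extras):
    """Store the fixed fields in columns and everything else as JSON."""
    cur = conn.execute(
        "INSERT INTO article (title, published_on, extras) VALUES (?, ?, ?)",
        (title, published_on, json.dumps(extras)),
    )
    return cur.lastrowid


def load_article(article_id):
    """Read one article back as a single flat dict."""
    title, published_on, extras = conn.execute(
        "SELECT title, published_on, extras FROM article WHERE id = ?",
        (article_id,),
    ).fetchone()
    return {"title": title, "published_on": published_on, **json.loads(extras)}


article_id = save_article(
    "Dynamic schemas", "2016-03-01", reading_time=4, tags=["sql", "json"]
)
print(load_article(article_id))
```

Searching and sorting go through `title` and `published_on`, which the database can index normally; the bag is only read back as a whole.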


梦回旧景 2024-07-12 23:02:58

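As others have said, don't do this unless you have no other choice. One case where it is required is if you are selling an off-the-shelf product that must allow users to record custom data. My company's product falls into this category.

If you do need to allow your customers to do this, here are a few tips:
- Create a robust administrative tool to perform the schema changes, and do not allow these changes to be made any other way.
- Make it an administrative feature; don't allow normal users to access it.
- Log every detail about every schema change. This will help you debug problems, and it will also give you CYA data if a customer does something stupid.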
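The three tips above could be sketched as one gatekeeping function. This is only an illustration in Python with sqlite3: the `orders` table, the `schema_change_log` table, the `ADMINS` set, and `apply_schema_change` are all made-up names, and a real tool would need proper authentication and far more validation.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE schema_change_log (changed_at TEXT, changed_by TEXT, statement TEXT)"
)

ADMINS = {"alice"}  # hypothetical: stands in for a real permission check
ALLOWED_TYPES = {"TEXT", "INTEGER", "REAL"}


def apply_schema_change(user, table, column, sql_type):
    """The single entry point for schema changes: admin-only and always logged."""
    if user not in ADMINS:
        raise PermissionError(f"{user} may not change the schema")
    # Identifiers cannot be bound as parameters, so validate before formatting.
    if not (table.isidentifier() and column.isidentifier()):
        raise ValueError("invalid identifier")
    if sql_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported type: {sql_type!r}")

    statement = f"ALTER TABLE {table} ADD COLUMN {column} {sql_type}"
    conn.execute(statement)
    conn.execute(
        "INSERT INTO schema_change_log VALUES (?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), user, statement),
    )
    conn.commit()
    return statement


apply_schema_change("alice", "orders", "gift_note", "TEXT")
for entry in conn.execute("SELECT changed_by, statement FROM schema_change_log"):
    print(entry)
```

Logging the exact statement together with who ran it and when is what later lets you reconstruct how a customer's schema got into its current state.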

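If you can do those things successfully (especially the first one), then any of the architectures you mentioned will work. My preference is to dynamically change the database objects, because that allows you to take advantage of your DBMS's query features when you access the data stored in the custom fields. The other three options require you to load large chunks of data and then do most of your data processing in code.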


毁我热情 2024-07-12 23:02:58

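I had a similar requirement and decided to use the schema-less MongoDB.

MongoDB (from "humongous") is an open source, scalable, high-performance, schema-free, document-oriented database written in the C++ programming language. (Wikipedia)

Highlights:

- Has rich query functionality (possibly the closest to SQL DBs)
- Production ready (foursquare and sourceforge use it)

Lowlights (things you need to understand so you can use mongo correctly):

- No transactions (actually it has transactions, but only on atomic operations)
- The points listed here: http://ethangunderson.com/blog/two-reasons-to-not-use-mongodb/
- Durability, mainly the ACID-related stuff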


多谢你的绝情让我学会死心 2024-07-12 23:02:58

A strongly typed xml field in MSSQL has worked for us.


风蛊 2024-07-12 23:02:58

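What you are proposing is not new. Plenty of people have tried it... most have found that they chase "infinite" flexibility and instead end up with much, much less than that. It's the "roach motel" of database designs: data goes in, but it's almost impossible to get it out. Try to conceptualize writing the code for ANY sort of constraint and you'll see what I mean.

The end result is typically a system that is much more difficult to debug and maintain, and full of data consistency issues. This is not always the case, but more often than not, that is how it ends up, mostly because the programmers don't see this train wreck coming and fail to code defensively against it. Also, it often turns out that the "infinite" flexibility really isn't that necessary. It's a very bad "smell" when the dev team gets a spec that says "Gosh, I have no clue what sort of data they are going to put here, so let 'em put WHATEVER"... when the end users would be just fine with pre-defined attribute types that they can use (code up a generic phone number, and let them create any number of them; this is trivial in a nicely normalized system and maintains both flexibility and integrity!).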
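As an illustration of the phone number remark, here is a minimal normalized layout, sketched with Python's sqlite3 module. The table and column names (`person`, `phone_number`, `label`) and the sample data are my own choices, not something from this answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- One generic, pre-defined attribute type; each person may have any
    -- number of rows, and the usual constraints still apply.
    CREATE TABLE phone_number (
        id INTEGER PRIMARY KEY,
        person_id INTEGER NOT NULL REFERENCES person(id),
        label TEXT NOT NULL,
        number TEXT NOT NULL CHECK (length(number) > 0)
    );
    """
)
conn.execute("INSERT INTO person (id, name) VALUES (1, 'Joe')")
conn.executemany(
    "INSERT INTO phone_number (person_id, label, number) VALUES (?, ?, ?)",
    [
        (1, "home", "555-0100"),
        (1, "work", "555-0101"),
        (1, "boat", "555-0102"),  # users invent their own labels
    ],
)

numbers = conn.execute(
    "SELECT label, number FROM phone_number WHERE person_id = ? ORDER BY id",
    (1,),
).fetchall()
print(numbers)
```

Users get as many phone numbers as they like, each with a label of their choosing, while the schema stays fixed and an empty number is still rejected by the CHECK constraint.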

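If you have a very good development team that is intimately aware of the problems you'll have to overcome with this design, you can successfully code up a well-designed, not terribly buggy system. Most of the time.

But why start out with the odds stacked so much against you?

Don't believe me? Google "One True Lookup Table" or "single table design". Some good results:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:10678084117056

http://thedailywtf.com/Comments/Tom_Kyte_on_The_Ultimate_Extensibility.aspx?pg=3

http://www.dbazine.com/ofinterest/oi-articles/celko22

http://thedailywtf.com/Comments/The_Inner-Platform_Effect.aspx?pg=2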


About the author

(り薆情海
