Drawbacks of storing all "things" in a central table
I am not sure if there is a term to describe this, but I have observed that content management systems store all kinds of data in a single table with only their bare minimum properties, while the metadata is stored in another table in the form of key-value pairs.
For example, everything (blog posts, pages, images, events, etc.) is stored in one table and treated as a post.
I understand that this allows for abstraction and easy extensibility.
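To make the layout concrete, here is a minimal sketch of the pattern using Python's built-in `sqlite3` module. The table names (`posts`, `post_meta`), the column names, and the helper functions are all made up for illustration and are not taken from any particular CMS.

```python
import sqlite3


def build_single_table_db():
    """Create an in-memory database with one generic table plus key-value metadata."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical schema: every kind of content is a row in posts.
        CREATE TABLE posts (
            id        INTEGER PRIMARY KEY,
            post_type TEXT NOT NULL,   -- 'post', 'image', 'event', ...
            title     TEXT NOT NULL
        );
        -- Type-specific properties live here as key-value pairs.
        CREATE TABLE post_meta (
            post_id    INTEGER NOT NULL REFERENCES posts(id),
            meta_key   TEXT NOT NULL,
            meta_value TEXT,
            PRIMARY KEY (post_id, meta_key)
        );
    """)
    return conn


def add_item(conn, post_type, title, meta):
    """Insert one item of any type; no schema change is needed for new types."""
    cur = conn.execute(
        "INSERT INTO posts (post_type, title) VALUES (?, ?)", (post_type, title)
    )
    post_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO post_meta (post_id, meta_key, meta_value) VALUES (?, ?, ?)",
        [(post_id, key, str(value)) for key, value in meta.items()],
    )
    return post_id


def load_item(conn, post_id):
    """Reassemble an item from its base row and its metadata rows."""
    post_type, title = conn.execute(
        "SELECT post_type, title FROM posts WHERE id = ?", (post_id,)
    ).fetchone()
    meta = dict(
        conn.execute(
            "SELECT meta_key, meta_value FROM post_meta WHERE post_id = ?", (post_id,)
        ).fetchall()
    )
    return {"type": post_type, "title": title, "meta": meta}
```

Adding a video later would just be `add_item(conn, "video", ...)` with whatever keys it needs. Note that in this sketch every metadata value is stored as text, so typing and validation are left to the application.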
We are considering designing our new project this way. It is not exactly a CMS, but we plan to keep adding modules to it in stages. Let's say initially there will be only posts and images on which comments can be posted. Later on we might add videos, which will also have the commenting feature.
What are the drawbacks of this approach? And will it work for a requirement like ours?
Thanks
Comments (2)
The drawback is that the main table will get zillions of reads (and plenty of writes, too).
This means that there will be a lot of lock contention, heavy reindexing, etc.
To mitigate this a bit, you may consider splitting the "main table" into a series of not-so-main tables.
Say, you will have one main table for "Posts" (possibly refined through metadata or subtables for specific types of posts, like Sticky, Announcement, Shoutbox, Private...)
One main table for Images (possibly refined for gifs, jpegs etc.)
One main table for Videos...
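As a rough sketch of that split (again with `sqlite3`, and with table and column names that are purely illustrative), each content type gets its own table with properly typed columns:

```python
import sqlite3


def build_split_db():
    """Create an in-memory database with one table per content type."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical columns; adjust to whatever each type really needs.
        CREATE TABLE posts (
            id        INTEGER PRIMARY KEY,
            title     TEXT NOT NULL,
            body      TEXT NOT NULL,
            is_sticky INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE images (
            id        INTEGER PRIMARY KEY,
            title     TEXT NOT NULL,
            mime_type TEXT NOT NULL,
            width     INTEGER,
            height    INTEGER
        );
        CREATE TABLE videos (
            id               INTEGER PRIMARY KEY,
            title            TEXT NOT NULL,
            duration_seconds INTEGER
        );
    """)
    return conn
```

Reads and writes on images no longer touch the posts table, and each table can carry only the indexes its own queries need. The cost is a schema change whenever a new type is added.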
If this is a custom application (and not intended to be something that has to be "infinitely tweakable" like a CMS or a portal framework), I think this kind of split is acceptable and may provide somewhat better performance (if you expect to have large amounts of data).
Regarding your "examples" comment... first of all, if you again keep comments in a single gigantic table, you may have problems similar to those of keeping all types of items in one table.
Assuming this is not a problem, you can obviously add a sort of reference key that links comments to their original item (you can't use normal foreign keys, of course, because the key may point into any one of several tables).
This works fine when you go from an item to its comments, but a bit less well when you have to go from a comment back to the originating item. So the tradeoff depends on which kind of operation would be more frequent for your problem.
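Here is one way that reference key could look, as a sketch only. The `(item_type, item_id)` pair, the `ITEM_TABLES` mapping, and the function names are assumptions I'm making for the example; the database does not enforce that the referenced item exists.

```python
import sqlite3

# Hypothetical mapping from the type tag stored on a comment to its table.
ITEM_TABLES = {"post": "posts", "image": "images"}


def build_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE posts  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE images (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE comments (
            id        INTEGER PRIMARY KEY,
            item_type TEXT NOT NULL,     -- which table item_id points into
            item_id   INTEGER NOT NULL,  -- no FOREIGN KEY: the target table varies
            body      TEXT NOT NULL
        );
        CREATE INDEX comments_by_item ON comments (item_type, item_id);
    """)
    return conn


def comments_for(conn, item_type, item_id):
    """Item -> comments: a single indexed lookup."""
    rows = conn.execute(
        "SELECT body FROM comments WHERE item_type = ? AND item_id = ? ORDER BY id",
        (item_type, item_id),
    )
    return [row[0] for row in rows]


def item_for_comment(conn, comment_id):
    """Comment -> item: read the type tag first, then query the matching table."""
    row = conn.execute(
        "SELECT item_type, item_id FROM comments WHERE id = ?", (comment_id,)
    ).fetchone()
    if row is None:
        return None
    item_type, item_id = row
    table = ITEM_TABLES[item_type]  # table name comes from a fixed whitelist
    item = conn.execute(
        f"SELECT title FROM {table} WHERE id = ?", (item_id,)
    ).fetchone()
    return None if item is None else (item_type, item[0])
```

The second function shows why the reverse direction is clumsier: it takes two queries and application-side dispatch, and a batch of comments on mixed item types cannot be resolved with one plain join.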
Simplicity and extensibility are indeed often attractive aspects of attribute-value and (as you say) "single table of things" approaches.
There's no 100% right answer here -- depending on your performance/throughput goals and extensibility needs, this approach might work for you too.
In most cases, however, where you know what kinds of data you will store, it's usually in your interest to model distinct entities in their own tables and relate the data accordingly. RDBMSes have been architected and refined over decades to cater to this use case, and simply using tables as generic dumping grounds doesn't typically buy you any distinct advantage beyond delaying the inevitable need to model your data properly. Furthermore, when you boil everything down into one table, you force users outside your app (if you have any, for example report writers) to struggle with your "model within a model", which can frustrate people when they write queries. And you will sink to your lowest common denominator: if you want to optimize queries about type X and you have types Y and Z in that same table in droves, they will impact performance when querying X.
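To illustrate the "model within a model" point, here is a hedged sketch of what a report writer faces with a generic table plus key-value metadata. The schema, the sample rows, and the function names are invented for the example. Filtering on two attributes takes one join per attribute and a cast, because the values are stored as text:

```python
import sqlite3


def demo_db():
    """Hypothetical single-table layout with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE posts (id INTEGER PRIMARY KEY, post_type TEXT, title TEXT);
        CREATE TABLE post_meta (post_id INTEGER, meta_key TEXT, meta_value TEXT);
        INSERT INTO posts VALUES (1, 'image', 'Sunset'), (2, 'image', 'Icon'),
                                 (3, 'post', 'Hello');
        INSERT INTO post_meta VALUES
            (1, 'width', '800'), (1, 'mime', 'image/jpeg'),
            (2, 'width', '32'),  (2, 'mime', 'image/png');
    """)
    return conn


def wide_jpegs(conn, min_width):
    """Titles of JPEG images at least min_width pixels wide."""
    sql = """
        SELECT p.title
        FROM posts AS p
        JOIN post_meta AS w ON w.post_id = p.id AND w.meta_key = 'width'
        JOIN post_meta AS m ON m.post_id = p.id AND m.meta_key = 'mime'
        WHERE p.post_type = 'image'
          AND m.meta_value = 'image/jpeg'
          AND CAST(w.meta_value AS INTEGER) >= ?
        ORDER BY p.title
    """
    return [row[0] for row in conn.execute(sql, (min_width,))]
```

With a dedicated images table that has `width` and `mime_type` columns, the same question would be a single-table `WHERE` clause that an ordinary index could serve.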
Again, to be clear, there is distinct benefit to the "all things in one table" and name/value-style metadata approaches. I have used them myself and have set proper modeling aside for similar reasons. However, my advice is to limit yourself to times when you really need to do that (i.e., you need to implement something before you can correctly model the space of things you will need). Most typically, I find myself doing that when I'm prototyping complex systems and need to get something going sooner rather than later.