我应该使用平面表还是标准化数据库?
我目前正在开发一个使用 MySQL 数据库作为后端的 Web 应用程序,在继续下一步之前,我需要知道什么更适合我的情况。
简而言之,在这个应用程序中,用户将能够使用任何数字字段(他们决定)构建自己的表单,现在我将其全部存储在通过外键链接的几个表中。我的一个朋友建议,为了保持事情“简单/快速”,我应该将每个用户的表单转换为平面表,以便从他们那里查询数据保持快速(在大幅增长的情况下)。
我应该保持数据库规范化,将所有内容汇集到带有外键(索引等)的关系表中,还是应该为用户创建的每个新表单构建平面表?
显然,创建平面表的一些优点是数据分离(安全性)并且查询速度会降低。但说实话,我能从中获得多少收益呢?我真的不想要 10000 个表并且一直删除、更改和添加,但如果它比我会更好......我只需要一些输入。
谢谢
I have a web application that I am currently working on that uses a MySQL database for the back-end, and I need to know what is better for my situation before I continue any further.
Simply put, in this application users will be able to construct their own forms with any number fields (they decide) and right now I have it all stored in a couple tables linked by foreign keys. A friend of mine suggests that to keep things "easy/fast" that I should convert each user's form to a flat table so that querying data from them stays fast (in case of large growth).
Should I keep the database normalized with everything pooled into relational tables with foreign keys (indexes, etc) or should I construct flat tables for every new form that a user creates?
Obviously some positives of creating flat tables is data separation (security) and query speeds would be cut down. But seriously how much gain would I get from this? I really don't want 10000 tables and to be dropping, altering, and adding all of the time, but if it will be better than I will do it... I just need some input.
Thank you
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(7)
经验法则。从规范化到非规范化比反规范化更容易。
从合理水平的数据库规范化开始(合理是指可读、可维护和高效,但不是过早优化),然后,如果您在发展过程中遇到性能问题,您可以选择研究非规范化可以提高性能的方法。
Rule of thumb. It's easier to go from normalized to denormalized than the other way around.
Start with a reasonable level of database normalization (by reasonable I mean readable, maintainable, and efficient but not prematurely optimized), then if you hit performance issues as you grow, you have the option of looking into ways in which denormalization may increase performance.
保持数据标准化。如果索引正确,您将在很长一段时间内不会遇到性能问题。
关于安全性:扁平方法将要求您编写大量的创建/删除表、更改表等语句,即更多的代码和更多的故障点。
拥有平面文件的唯一原因是当您的用户可以直接连接到数据库时(您仍然可以选择行级安全性)。但在这种情况下,您实际上是在重新实现 phpmyadmin 的变体
Keep your data normalized. If you index properly, you will not encounter performance issues for a very long time.
Regarding security: The flat approach will require you to write lots of create/drop table, alter table etc statements, ie a lot more code and a lot more points of failure.
The only reason to have flat files would be when your users can connect to the DB directly (you could still go for row level security). But in that case, you are really reimplementing a variant of phpmyadmin
哎呀!那么当用户本质上为您做出数据库决策时,您如何可能进行任何类型的规范化。
我认为你要么需要一步一步地管理它,要么让你的怪胎旗帜飘扬,继续购买硬件,以跟上当用户真正开始使用它时你将受到的打击......举个例子,看看当用户开始了解如何在 SharePoint 中创建新表单和视图时会发生什么...CRIKY!谈论范围蔓延!
Yikes! Then how could you possibly do any sort of normalization when the users are, in essense, making the database decisions for you.
I think you either need to manage it step by step or let your freak flag fly and just keeping buying hardware to keep up with the thrashing you're going to get when the users really start to get into it....Case in point, look what happens when users start to understand how to make new forms and views in SharePoint...CRIKY!! Talk about scope creep!!
在运行时改变模式并不是一个好主意。您要考虑的是EAV(实体-属性-值)模型。
维基百科有关于优缺点以及实施细节的一些非常好的信息 。应尽可能避免 EAV,但对于像您这样每个表单的列数未知的情况,EAV 值得考虑。
Altering the schema during runtime is rarely a good idea. What you want to consider is the EAV (Entity-Attribute-Value) model.
Wikipedia has some very good info on the pros and cons, as well as implementation details. EAV is to be avoided when possible, but for situations like yours with an unknown number of columns for each form, EAV is woth considering.
保持数据标准化。如果您有适当的索引,系统将保持快速。
如果你真的想快点,那么将模式切换到键值数据库之一,如 bigDB /couchDB 等。这是完全非规范化的,并且非常非常快。
Keep your data normalized. The system will should stay fast provided you have proper indexing.
If you really want to go fast then switch the schema to one of the key value databases like bigDB /couchDB etc. That is totally denormalized and very very fast.
我处理这个问题的方法是使用规范化的、可扩展的“属性”表,如下所示:
上面只是一个示例,但我在很多情况下都使用过这种模式,而且它往往效果很好。唯一真正的“问题”是您需要将值序列化为 string/varchar,然后将其反序列化为所需的任何内容,因此客户端需要承担一些额外的责任。
The way I would handle this is to use a normalized, extensible "Property" table, such as below:
The above is just an example, but I've used this pattern in many cases, and it tends to work out pretty well. The only real "gotcha" is that you need to serialize the value as a string/varchar and then deserialize it to whatever it needs to be, so there is a little added responsibility on the client.
规范化==快速搜索,更容易维护索引,较慢的插入事务(在多行上)
非规范化==快速插入,通常在有大量插入时使用(收集和记录时间顺序数据的数据仓库)
Normalized == fast searches, easier to maintain indexes, slower insert transactions (on multiple rows)
Denormalized == fast inserts, ususally this is used when there are a lot of inserts (data warehouses that collect and record chronological data)