创建数据模型的最佳实践

发布于 2024-11-30 02:21:32 字数 1539 浏览 1 评论 0原文

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

蓝颜夕 2024-12-07 02:21:32

我个人认为在开始数据库建模之前您应该阅读一本有关性能调优的书。正确的设计可以让世界变得不同。如果你不是性能调优方面的专家,你就没有资格设计数据库。

这些书是针对数据库的,这里是一本针对 SQl Server 的书。
http://www.amazon.com/Server-Performance-Tuning-Distilled-Experts/dp/1430219025/ref=sr_1_1?s=books&ie=UTF8&qid=1313603282&sr=1- 1

在开始设计之前你应该阅读的另一本书是关于反模式的。知道自己应该避免做什么总是很高兴的。
http://www.amazon.com/SQL-Antipatterns-Programming-Pragmatic-Programmers/dp/1934356557/ref=sr_1_1?s=books&ie=UTF8&qid=1313603622&sr=1- 1

不要陷入灵活性设计的陷阱。人们用它来逃避正确设计的工作,而灵活的数据库几乎总是表现不佳。如果超过 5% 的数据库设计依赖于灵活性,那么我认为您的建模不正确。我使用过的所有最差的 COTS 产品在设计时都首先考虑了灵活性。

任何像样的数据库书籍都会讨论规范化。您还可以在网络上轻松找到该信息。确保真正建立 FK/PK 关系。

至于命名列,选择一个标准并始终坚持下去。一致性比实际标准更重要。不要命名列 ID(请参阅 SQL 反模式书籍)。如果列将位于多个不同的表中,请使用相同的名称和数据类型。您想要的是不必因为数据类型不匹配而使用函数来进行连接。

永远记住,数据库可以(并且将会)在应用程序之外进行更改。数据完整性所需的任何内容都必须位于数据库中,而不是应用程序代码中。应用程序被替换后很长时间内数据仍将存在。

数据库设计最重要的事情:

  • 所需数据的彻底定义(包括正确的数据类型)
    以及数据之间的关系(包括正确的规范化)
  • 数据完整性
  • 性能
  • 安全性
  • 一致性(数据类型、命名标准等)

Personally I think you should read a book on performance tuning before beginning to model a database. The right design can make a world of difference. If you are not expert in performance tuning, you aren't qualified to design a database.

These books are Database specific, here is one for SQl Server.
http://www.amazon.com/Server-Performance-Tuning-Distilled-Experts/dp/1430219025/ref=sr_1_1?s=books&ie=UTF8&qid=1313603282&sr=1-1

Another book that you should read before starting to design is about antipatterns. Always good to know what you should avoid doing.
http://www.amazon.com/SQL-Antipatterns-Programming-Pragmatic-Programmers/dp/1934356557/ref=sr_1_1?s=books&ie=UTF8&qid=1313603622&sr=1-1

Do not get stuck in the trap of designing for flexibility. People use that as a way to get out of doing the work to design correctly and flexible databases almost always perform badly. If more than 5% of your database design depends on flexibility, you haven't modeled correctly in my opinion. All the worst COTS products I've had to work with were designed for flexibility first.

Any decent database book will discuss normalization. You can also find that information easily on the web. Be sure to actually create FK/PK relationships.

As far as naming columns, pick a standard and stick with it consistently. Consistency is more important than the actual standard. Don't name columns ID (see SQL antipatterns book). Use the same name and datatypes if columns are going to be in several different tables. What you are going for is to not have to use functions to do joins because of datatype mismatches.

Always remember that databases can (and will) be changed outside the application. Anything that is needed for data integrity must be in the database not the application code. The data will be there long after the application has been replaced.

The most important things for database design:

  • Thorough definition of the data needed (including correct datatypes)
    and the relationships between pieces of data (including correct normalization)
  • data integrity
  • performance
  • security
  • consistency (of datatypes, naming standards etc.)
红衣飘飘貌似仙 2024-12-07 02:21:32

尽管这不是一本书,但我建议阅读 查询评估技术大型数据库。它提供了查询处理的背景知识,这在很大程度上影响您的模式设计,特别是对于数据密集型(例如分析)工作负载。它不太需要实践,但我相信每个数据库设计人员都应该至少阅读一次:-)。

Even though it is not a book I recommend to read Query evaluation techniques for large databases. It gives a background on query processing which largely influences your schema design, especially for data intensive (e.g., analytical) workloads. It is less hands-on but I believe every database designer should read it at least once :-).

轻拂→两袖风尘 2024-12-07 02:21:32

我读过的关于数据库系统设计的最好的书是《数据库系统简介》。 Joe Celko 的 SQL for Smarties 书籍也值得一读。
假设您正在构建一个应用程序而不仅仅是一个数据库,并且假设您正在使用面向对象的语言,Craig Larman 的《应用 UML 和模式》对将数据库映射到对象进行了很好的讨论。

就“好”的定义而言,根据我的经验,“可维护”可能是最重要的。数据库设计中的可维护性意味着很多事情,例如坚持约定 - 我经常推荐 http://justinsomnia.org/2003/04/essential-database-naming-conventions-and-style/。标准化是另一个明显的可维护性策略。我经常建议对列类型保持慷慨 - 如果您发现不同国家/地区的邮政编码比美国的邮政编码长,则很难更改应用程序。我经常建议经验不足的开发人员使用视图来抽象复杂的数据关系。

可维护性的关键是测试和部署的能力。值得阅读有关持续数据库集成(http://www.codeproject.com/KB/architecture/Database_CI.aspx)的内容 - 虽然与数据库模式的设计没有严格关联,但它是重要的上下文。

至于性能 - 我认为您应该首先设计可维护性,并且只有在您知道有问题时才设计性能。有时,您事先知道性能将是一个主要问题 - 为 Facebook(或 Stack Exchange)设计数据库,设计包含大量数据(TB 及以上)或大量用户的数据库。大多数系统不属于这个阵营 - 因此我建议定期进行性能测试,并使用代表性数据来查找是否存在问题,并且仅在可以证明必须进行调整时进行调整。许多性能优化是以牺牲可维护性为代价的——例如非规范化。

哦,一般来说,如果可以的话,避免使用触发器和存储过程。这只是我的意见,不过...

The best book I've read on the design of database systems was "An Introduction to Database Systems". Joe Celko's SQL for Smarties books are also worth reading.
Assuming you're building an application and not just a database, and assuming you're using an Object Oriented language, Applying UML and Patterns by Craig Larman has a good discussion on mapping databases to objects.

In terms of defining "good", in my experience "maintainable" is probably top of the list. Maintainability in database design means many things, such as sticking to conventions - I often recommend http://justinsomnia.org/2003/04/essential-database-naming-conventions-and-style/. Normalization is another obvious maintainability strategy. I often recommend being generous with column types - it's hard to change an application if you find out that postal codes in different countries are longer than in the US. I often recommend using views to abstract complex data relations away for less experienced developers.

A key thing with maintainability is the ability to test and deploy. It's worth reading up about Continuous Database Integration (http://www.codeproject.com/KB/architecture/Database_CI.aspx) - whilst not strictly associated with the design of the database schema, it's important context.

As for performance - I believe you should design for maintainability first, and only design for performance if you know you have a problem. Sometimes, you know in advance that performance will be a major problem - designing a database for Facebook (or Stack Exchange), designing a database with huge amounts of data (terabytes and up), or huge numbers of users. Most systems don't fall into that camp - so I recommend regular performance tests, with representative data, to find if you have a problem, and only tune when you can prove you have to. Many performance optimizations are at the expense of maintainability - denormalization, for instance.

Oh, and in general, avoid triggers and stored procedures if you can. That's just my opinion, though...

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文