您如何知道 SQL 数据库何时需要更多规范化?

发布于 2024-08-09 18:11:59 字数 74 浏览 7 评论 0原文

是否是当您尝试获取数据但没有明显的简单方法时?

当你发现某样东西应该是一个独立的桌子时?

法律是什么?

Is it when you're trying to get data and there is no apparent easy way of doing it?

When you find something should be a table on it's own?

What are the laws?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(13

微凉 2024-08-16 18:11:59

查看维基百科。本文讨论了数据库规范化和不同的形式(第一、第二、第三等)。大多数时候,您应该至少以第三范式为目标。有时您想稍微放宽规则(将多个表连接在一起可能成本太高,因此可能需要稍微反规范化),但在大多数情况下,第三范式是好的。

Check out Wikipedia. The article talks about database normalization and the different forms (first, second, third, etc.). Most times you should be aiming for at least third normal form. There are times when you want to relax the rules a bit (it may be too expensive to join multiple tables together so might want to de-normalize a bit) but for the most part third normal form is good.

向地狱狂奔 2024-08-16 18:11:59

当您发现必须重复相同的数据时,或者当您开始使用单个字段作为数组时。

When you notice you have to repeat the same data, or when you start using single fields as arrays.

年华零落成诗 2024-08-16 18:11:59

虽然这是一个有点尖锐的答案,当您发现数据没有充分标准化时。网络上有许多关于规范化级别(或更准确地说,“形式”)的资源,它们比我在这里更完整地描述了这些形式。第一范式和第二范式应该是非常需要的。如果你没有达到第三(或者实际上是第四)范式,你需要有一个强有力的理由来解释为什么。

查看有关数据库规范化的维基百科文章

While this is a somewhat snarky answer, when you discover that the data isn't sufficiently normalized. There are many resources on the web about the levels (or, more properly, "forms") of normalization, and they more completely describe the forms than I could here. First and second normal forms should be pretty much required. If you aren't at third (or, really, fourth) normal form, you need to have a strong justification as to why.

Check out the Wikipedia article on database normalization.

风情万种。 2024-08-16 18:11:59

当您开始质疑 SQL 数据库是否需要更多规范化时。

When you're starting to question whether an SQL database needs more normalization.

叫嚣ゝ 2024-08-16 18:11:59

每当你有一个关系数据库时......

不,实际上有法律,请查看这个维基百科 链接

它们被称为五范式或类似的东西。最初来自 20 世纪 50 年代/60 年代发明关系数据库的人 EF Codd。

“钥匙是整个钥匙,除了钥匙什么都没有,所以帮我科德”

这是一个概要:

  1. 第一范式(1NF)表
    忠实地代表一种关系并且
    没有重复组
  2. 第二范式 (2NF) 否
    表中的非素数属性是
    功能上依赖于零件
    候选键的(真子集)
  3. 第三范式 (3NF) Every
    非素数属性是
    非传递依赖于每一个
    表的键 表中的每个非平凡函数依赖项都是对超键的依赖项
  4. 第四范式 (4NF) 每个
    非平凡的多值依赖
    表中是对 a 的依赖
    超级键
  5. 第五范式 (5NF) 表中的每个重要连接依赖关系都由表的超级键隐含。域/键范式 (DKNF) Ronald Fagin (1981)[19] 表上的每个约束都是表的域约束和键约束的逻辑结果
  6. 第六范式 (6NF) 表特征 无
    根本不平凡的连接依赖关系
    (参考广义连接
    操作员)

Whenever you have a relational database.... <grin/>

No, actually there are laws, check out this Wikipedia link.

they are called the five normal forms or something like that. Originally from the guy who invented relational databases in the 50s/60s, E. F. Codd.

"The key the whole key and nothing but the Key, so help me Codd"

This is a synopsis:

  1. First normal form (1NF) Table
    faithfully represents a relation and
    has no repeating groups
  2. Second normal form (2NF) No
    non-prime attribute in the table is
    functionally dependent on a part
    (proper subset) of a candidate key
  3. Third normal form (3NF) Every
    non-prime attribute is
    non-transitively dependent on every
    key of the table Every non-trivial functional dependency in the table is a dependency on a superkey
  4. Fourth normal form (4NF) Every
    non-trivial multivalued dependency
    in the table is a dependency on a
    superkey
  5. Fifth normal form (5NF) Every non-trivial join dependency in the table is implied by the superkeys of the table. Domain/key normal form (DKNF) Ronald Fagin (1981)[19] Every constraint on the table is a logical consequence of the table's domain constraints and key constraints
  6. Sixth normal form (6NF) Table features no
    non-trivial join dependencies at all
    (with reference to generalized join
    operator)
滿滿的愛 2024-08-16 18:11:59

其他人已经向您指出了正常化的正式规则。以下是我使用的一些非正式准则:

  1. 如果表中有列,其名称仅因数字不同(例如 Phone1 和 PHone2)。

  2. 如果表格中有任何列,只有当表格中的另一列被填写时才应填写。

    如果表格中有任何列,只有

  3. 如果更新数据库中的“事实”(例如街道地址)需要多次更新。

  4. 如果同一问题可能会得到两个不同的答案,具体取决于您从哪个表获取信息。

  5. 如果可以从数据库中获得任何重要问题的答案,而无需连接至少两个表。

    如果

  6. 如果数据库中有任何基于数量的限制,而不是“只允许其中 1 个”(即“只允许一个地址”是可以的,但“只允许两个地址”则表明存在规范化问题) )。

Other people have pointed you to the formal rules for normalization. Here are some informal guidelines I use:

  1. If you have columns in a table the names of which differ only by a number (eg Phone1 and PHone2).

  2. If you have any columns in a table that should be filled in only when another column in the table is filled in.

  3. If updating a "fact" in the database (such as a street address) requires more than one UPDATE.

  4. If the same question could ever get two different answers depending on which table you get your information from.

  5. If the answer to any non-trivial question can be gotten from the database without JOINing at least two tables.

  6. If you have any quantity-based restrictions in the database other than "only 1 of something is allowed" (that is, "only one address is allowed" is okay, but "only two addresses are allowed" indicates a normalization problem).

So要识趣 2024-08-16 18:11:59

3NF 通常是您所需要的,它遵循三个规则:

表中的每一列都应该依赖于:

  • 键 (1NF)、
  • 整个键 (2NF)
  • 以及除了键 (3NF) 之外的任何内容(所以帮助我 Codd 是引用通常以这种方式结束)。

出于性能原因,您通常可以“降级”到 2NF,只要您了解其中的含义,并且仅在遇到问题时,但 3NF 应该是您所有设计的初始目标。

3NF is generally all you need and it follows three rules:

Every column in the table should be dependent on:

  • the key (1NF),
  • the whole key (2NF),
  • and nothing but the key (3NF) (so help me Codd is the way that quote usually ends).

You can often "downgrade" to 2NF for performance reasons, provided you understand the implications and only when you strike problems, but 3NF should be the initial goal for all your designs..

折戟 2024-08-16 18:11:59

正如其他人所说,您知道何时开始在多个表中出现(太多)重复列。

话虽这么说,有时跨多个表拥有冗余列很有用。这可以减少复杂查询中必须执行的 JOIN 数量。请小心保持所有表同步,否则您只是自找麻烦。

As everyone else has said, you know when you start having (too many) duplicate columns in multiple tables.

That being said, it is sometimes useful to have redundant columns across multiple tables. This can reduce the number of JOINs you have to do in complicated queries. Just be careful to keep all the tables in sync, or you're just asking for trouble.

姜生凉生 2024-08-16 18:11:59

这是一篇相当不错的文章。变得正常是一门科学,而不是一门艺术。现在知道何时去规范化......这是一门艺术。

http://www.alvechurchdata.co.uk/hints-and-提示/softnorm.html

This is a pretty good article. Getting normal is a science, not an art. Now knowing when to DEnormalize... that's an art.

http://www.alvechurchdata.co.uk/hints-and-tips/softnorm.html

东风软 2024-08-16 18:11:59

您目前处于什么正常化水平?如果你不能回答这个问题,我认为你的数据库一团糟。我总是在初始设计时达到第三常态,并在需要时进一步反规范化或规范化。

What level of normalization are you currently at? If you can't answer that I assume your database is a nasty mess. I always hit 3rd normal on initial design and de-normalize or normalize further if and when needed.

无边思念无边月 2024-08-16 18:11:59

我假设您正在谈论支持交互式应用程序的事务数据库,但就其价值而言...

专门用于报告且仅由 ETL 流程更新的 OLAP 数据库可能会受益于标准化程度较低的结构。在这些应用程序中,您接受冗余数据存储和复制的成本,以获得更少的连接的性能优势以及(有时技术性较低的)数据分析师和业务分析师的更高易用性。

事务数据库应始终在实用范围内规范化(至少 3NF),然后仅根据需要选择性地非规范化。理想情况下,非规范化的需要应该基于实际的性能测试结果。

I assume you're talking about a transactional database supporting an interactive application, but for what it's worth...

OLAP databases used exclusively for reporting and only updated by ETL processes may benefit from a less normalized structure. In these applications you accept the cost of redundant data storage and duplication for the performance benefit of fewer joins and the increased ease of use for (sometimes less technical) data analysts and business analysts.

Transactional databases should always be normalized to the extent practical (at least 3NF) and then selectively denormalized only as needed. And the need to denormalize should ideally be based on actual performance testing results.

§对你不离不弃 2024-08-16 18:11:59

当您必须搜索大量数据只是为了提取一些基本信息时 - 即有什么样的产品类别或类似的信息。

When you have to search trough huge amounts of data just to extract some basic info - i.e. what kind of Product categories are there or something like that.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文