第三范式适用于数据库吗?
在设计数据库时我应该始终以第四范式为目标吗?
我觉得第三范式更接近我的业务领域。
例如,我有一个带有 PartNumber
的表。在我的业务领域,它是唯一的密钥,任何两个部分都不能具有相同的编号。然而,这是一个 VARCHAR,将主键放入 VARCHAR 然后将其作为外键链接到其他表对我来说是一种巨大的味道。
我应该将自动增量 ID 放在各处吗?我并没有立即明白这一点,它使代码变得非常复杂。
Should I always aim for 4th normal form when designing databases?
I feel that 3rd normal form is closer to my business domain.
For example I have a table with PartNumber
. In my business domain, it is the unique key, no two parts shall ever have the same number. However this is a VARCHAR, putting a primary key to a VARCHAR and then linking it as a foreign key to other table is a huge smell to me.
Should I put Auto Increments IDs everywhere instead? I don't really see the point right away and it complicates code a lot.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
作为一般规则,您的数据库至少应采用 Boyce-Codd 范式或理想的第五范式。仅当存在必须保留且 5NF 无法满足的依赖关系时才考虑 3NF。
4NF 与您在键中使用 varchar 还是整数无关,我不确定为什么您似乎认为它与此有关。
As a general rule aim for your database to be in at least Boyce-Codd Normal Form or ideally Fifth Normal Form. Consider 3NF only where there are dependencies that it's important to preserve that wouldn't otherwise be satisfied by 5NF.
4NF has nothing to do with whether you use varchar or integers in your keys and I'm not sure why you seem to think it does.
这取决于数据库;使用 varchar 字段作为 PK 会浪费数据库存储中的空间和 MySQL 中的索引(例如)。为您的零件编号提供一个表格和唯一的整数 ID 可能看起来不那么纯粹,但在性能方面,这些东西有时是必要的。
(此外,它确实允许您在必要时在一个位置更新产品编号,尽管我怀疑这在您的业务逻辑中可能不太可能。)
It depends on the database; using varchar fields as your PK wastes space in your DB storage and indices in MySQL (for example). Having a table and unique integer id for your part numbers may seem less pure, but such things are sometimes necessary in the face of performance.
(And further, it does allow you to update a product number if necessary in one spot, though I suspect this is likely not probable in your business logic.)
虽然您的问题涉及 4NF,但您的具体情况却并非如此。使用 VARCHAR 字段作为主键并不是错误,如果它相对较小,那么我更愿意使用它而不是代理键。
回到 4NF - 从维基百科的示例来看,这似乎是在很多时候,自然而然地就位。争取 4NF 当然不会有什么坏处——同时非规范化也有其一席之地。
Whilst your question asks about 4NF, your particular situation doesn't. Using a VARCHAR field as a primary key isn't wrong, if it's relatively small then I would prefer to use that rather than a surrogate key.
Back to 4NF - from Wikipedia's example, this seems to be something that would, much of the time, just naturally fall into place. It certainly can't hurt to strive for 4NF - at the same time denormalization has its place.
您允许您的用户编辑零件号吗?如果是这样,这是反对将其用作主键的一个很好的论据。作为(一般)经验法则,PK 不应该是用户可编辑的,因为通过数据库复制更改的 PK 可能会变得相当昂贵。
另外,是否没有两个部件具有相同的编号,或者没有两个活动部件可以具有相同的编号?如果是后者,您是否需要保留足够长的历史记录,以至于最终可能会得到两个相同的零件号?
Do you allow your users to edit part numbers? If so, this is a good argument against using it as the primary key. As a (general) rule of thumb PKs shouldn't be user editable because replicating a changed PK through the DB can become quite expensive.
Also - is it that no two parts shall ever have the same number, or no two active parts can have the same number? If it's the latter, do you need to keep a long enough history that you might end up with two part numbers that are the same?
学习如何标准化数据需要几周的时间。需要几个月甚至几年的时间才能学会何时忽视正常化规则。当然,如果无视标准化规则,就会产生后果。如果你彻底学规范化,你就会知道后果是什么。
有时,与遵循其他设计模式的好处相比,不遵守给定规范形式的后果很轻。例如,在构建 OLAP 数据库或数据仓库时,星型模式或雪花模式通常比完全规范化的效率更高。
对于 OLTP 数据库,我倾向于以 Boyce-Codd 范式为目标,只处理由于偏离第四或第五范式而出现的任何修改异常。但这确实取决于具体情况。
It takes a few weeks to learn how to normalize data. It take months, maybe even years, to learn when to disregard the rules of normalization. Of course, if you disregard the rules of normalization, there will be consequences. If you learn normalization thoroughly, you will know that the consequences are.
Sometimes the consequences of not adhering to a given normal form are light, compared to the benefit of following some other design pattern. For example, when building an OLAP dataabase or a data warehouse, star schema or snowflake schema is often more productive than fully normalized.
For OLTP databases, I would tend to aim for Boyce-Codd normal form, and just deal with any modification anomalies that surface due to departure from 4th or 5th normal form. But it really depends on the case.