Data redundancy
Can referential integrity constraints help in addressing data redundancy problems?
Comments (3)
Referential integrity constraints are only a subset of "database constraints in general".
Normalization and database constraints are distinct-but-intertwined concepts.
Say you have a table CUSTOMERORDER (custID, custName, orderID), which says that "the customer identified by #custID# and who is named #custName# has placed the order identified by #orderID#".
This table is unlikely to be in 3NF because of the FD custID->custName that probably applies. But say we keep this one-table design nonetheless. What do we then have to do to enforce consistency of the data? We have to enforce the mentioned FD. We have to see to it that if the same customer places a second order, then the custName in the two rows will be identical. We have to prohibit rows such as (1, Smith, 2) and (1, Jones, 7) from both appearing in the table. That is a kind of database constraint to be enforced, in order to make our design match all the stated business rules.
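To make the FD check concrete, here is a minimal sketch in Python that scans the rows of the one-table design for violations of custID->custName. The function name `fd_violations` and the representation of rows as plain tuples are assumptions made for illustration; a real DBMS would enforce this with a declared constraint or a trigger rather than application code.

```python
def fd_violations(rows):
    """Return {custID: sorted names} for every custID that maps to more than one custName."""
    names_by_cust = {}
    for cust_id, cust_name, _order_id in rows:
        names_by_cust.setdefault(cust_id, set()).add(cust_name)
    return {
        cust_id: sorted(names)
        for cust_id, names in names_by_cust.items()
        if len(names) > 1
    }


# Rows of CUSTOMERORDER(custID, custName, orderID)
consistent = [(1, "Smith", 2), (1, "Smith", 7)]
inconsistent = [(1, "Smith", 2), (1, "Jones", 7)]

print(fd_violations(consistent))    # {}
print(fd_violations(inconsistent))  # {1: ['Jones', 'Smith']}
```

The second call flags exactly the pair of rows the text says must be prohibited.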
But note that we do not have any "referential" constraint here. Obviously, because there is no second table to reference.
Also note in passing that this one-table design "automatically" enforces some other constraints that might not be immediately obvious. For example, our one-table design makes it impossible for an orderID to exist without a corresponding custID AND custName. (If you are thinking about nulls, stop doing so. In relational theory, there is no such thing as "null".) The "rule" that if an orderID is registered, then there must also exist a corresponding custID PLUS custName, is enforced "implicitly" by our design being a one-table one.
But now we decompose our design into a two-table one, as traditional normalization theory prescribes:
CUSTOMER(custID, custName) KEY custID;
ORDER(custID, orderID) KEY custID,orderID ;
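As a rough illustration of this decomposition, the sketch below projects the one-table rows onto the two new tables and joins them back on custID. The helper names `decompose` and `rejoin` and the sample rows are hypothetical; the round trip only returns the original rows because the sample data satisfies custID->custName.

```python
def decompose(customer_orders):
    """Project CUSTOMERORDER rows onto CUSTOMER and ORDER."""
    customer = {(cid, name) for cid, name, _oid in customer_orders}
    order = {(cid, oid) for cid, _name, oid in customer_orders}
    return customer, order


def rejoin(customer, order):
    """Natural join of CUSTOMER and ORDER on custID."""
    return {
        (cid, name, oid)
        for cid, name in customer
        for order_cid, oid in order
        if cid == order_cid
    }


rows = {(1, "Smith", 2), (1, "Smith", 7), (2, "Jones", 9)}
customer, order = decompose(rows)

print(sorted(customer))                 # [(1, 'Smith'), (2, 'Jones')]
print(rejoin(customer, order) == rows)  # True
```

Each customer name is now stored once, however many orders that customer has placed.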
The business rules we have to enforce are still the same, namely: (a) there cannot be two customers with the same custID but with a different name (that's our FD), and (b) there cannot be any order without a corresponding custID PLUS custName for that order.
Let's see how our two-table design handles these business rules. (a) is obviously enforced by declaring custID as a key on CUSTOMER. As for (b), it is obvious that it will be impossible to record an orderID in ORDER without also recording a custID. But is that sufficient to guarantee that there will also be a corresponding custName for all ORDER rows? Obviously not. That's why we need to introduce the obvious referential integrity rule between ORDER and CUSTOMER.
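One way to try this out is with Python's built-in `sqlite3` module. The following is a sketch under a few assumptions: the column types are illustrative choices, `try_insert` is a hypothetical helper, the table name is quoted because ORDER is a reserved word in SQL, and the SQLite build must have foreign key support (SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`).

```python
import sqlite3


def build_db():
    conn = sqlite3.connect(":memory:")
    # SQLite does not enforce foreign keys unless this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    # Rule (a): custID is the key of CUSTOMER, so one name per custID.
    conn.execute(
        "CREATE TABLE CUSTOMER ("
        " custID INTEGER PRIMARY KEY,"
        " custName TEXT NOT NULL)"
    )
    # Rule (b): every ORDER row must reference an existing CUSTOMER row.
    conn.execute(
        'CREATE TABLE "ORDER" ('
        " custID INTEGER NOT NULL REFERENCES CUSTOMER(custID),"
        " orderID INTEGER NOT NULL,"
        " PRIMARY KEY (custID, orderID))"
    )
    return conn


def try_insert(conn, sql, params):
    """Return True if the insert succeeds, False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_db()
results = [
    try_insert(conn, "INSERT INTO CUSTOMER VALUES (?, ?)", (1, "Smith")),
    try_insert(conn, "INSERT INTO CUSTOMER VALUES (?, ?)", (1, "Jones")),  # violates (a)
    try_insert(conn, 'INSERT INTO "ORDER" VALUES (?, ?)', (1, 2)),
    try_insert(conn, 'INSERT INTO "ORDER" VALUES (?, ?)', (99, 7)),        # violates (b)
]
print(results)  # [True, False, True, False]
```

The key on CUSTOMER rejects the second name for customer 1, and the foreign key rejects the order for a customer that does not exist.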
Thus, RI constraints indeed "help in addressing data redundancy problems", in the sense that by decomposing a table and introducing an RI constraint to the overall design, they make it possible to eliminate certain kinds of redundancy while preserving certain guarantees of data integrity. Without the ability to introduce RI constraints in a design, we'd only be eliminating redundancy at the expense of data consistency.
It can help, but not if the database design isn't normalized. Referential integrity constraints can be used in your design to reduce/remove data redundancy.
For best effect, normalize to BCNF rather than stopping at 3NF. The result may still contain some redundancy, but it will be fine for most uses.
Referential integrity guarantees only referential integrity.
It's how you lay out your database that prevents redundancy (see normalization as Oded pointed out).