数据库:表是否应该始终规范化并具有主键?
我有一个数据库,存储客户对产品的询问。
查询参考(文本)、产品编号(整数)和修订号(整数)共同唯一地标识销售人员和客户之间的单一讨论。
因此,有许多表,每个表都针对单个查询的特定详细信息,最终由 enq、pdt 和 rev 值组合来标识。
CREATE TABLE 不会对任何字段使用任何自动增量唯一主键。
我的问题是,这样的数据库设计可以接受吗? 表是否应该始终标准化?
谢谢你的建议。
I have a database storing customer enquiries about products.
The enquiry reference (text), product number (int) and revision number (int) together uniquely identifies a single discussion between sales and customer.
As a result, there are many tables each for a specific detail about a single enquiry, uqniuely idenified by enq, pdt and rev values combined.
The CREATE TABLE does not use any AUTO INCREMENT UNIQUE PRIMARY KEY for any field.
My question is, is this database design acceptable?
Should tables always be normalized?
Thanks for advise.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
规范化理论中没有任何内容涉及表应该具有简单主键还是复合主键。不管你相信与否,“主键”的概念并不是数据关系模型的组成部分。
话虽如此,表几乎总是应该使用主键来定义。主键不必是单个列,也不必通过自动增量填充。就您而言,这可能是三列组合在一起唯一标识一个查询。
如果表没有声明主键,则最终可能会出现重复的行。具有重复行的表代表一包元组,而不是一组元组。一旦您处理的是包而不是集合,关系模型预测的结果就不需要应用。这就是为什么防止重复行如此重要。
There is nothing in normalization theory that deals with whether a table should have a simple or compound primary key. Believe it or not, the concept of "primary key" is not a component of the relational model of data.
Having said that, tables should nearly always be defined with a primary key. The primary key need not be a single column, and it need not be filled in by an autoincrement. In your case, it could be the three columns that taken together uniquely identify an enquiry.
If a table has no declared primary key, it could end up with duplicate rows. A table with duplicate rows represents a bag of tuples, not a set of tuples. Once you are dealing with bags instead of sets, the results predicted by the relational model need not apply. That is why preventing duplicate rows is so important.
没有必要使用自动增量,但每个表都应该有某种类型的主键。主键可以是多个字段的组合,这些字段共同唯一地标识记录。
根据您告诉我们的情况,是的,只要您明确声明查询参考(文本)、产品编号(int)和修订号(int)的组合作为主键,共同唯一标识一个产品,该设计是可以接受的。单一讨论。
人们有时会出于性能原因对数据库进行非规范化。如果选择查询比插入和更新频繁得多,并且感兴趣的选择查询由于必须连接的表数量而返回缓慢,那么请考虑非规范化。
如果您提供一个运行缓慢的特定查询,您将获得很多具体建议。
There's no need to use AUTOINCREMENT, but every table should have a PRIMARY KEY of some kind. A primary key can be a combination of several fields that together identify the record uniquely.
Based on what you've told us, yes, the design is acceptable, provided you explicitly declare the combination of the enquiry reference (text), product number (int) and revision number (int) as a primary key that together uniquely identifies a single discussion.
People sometimes denormalize a database for performance reasons. If select queries are far more frequent than inserts and updates, and the select query of interest is slow to return because of the number of tables it has to join, then consider denormalizing.
If you supply a specific query that is running slow for you, you'll get lots of specific advice.
拥有
PRIMARY KEY
(或UNIQUE
约束)首先将确保这些值确实是唯一的,其次将极大地改进对给定查询的搜索。PRIMARY KEY
意味着在(enq, pdt, rev)
上创建索引,并且此查询:将在单个索引查找中完成。
如果没有索引,此查询将需要扫描整个表,并且不能保证最终不会出现重复项。
除非有非常非常特殊的情况(比如大量插入的日志表),否则表上应该始终有一个
PRIMARY KEY
。Having a
PRIMARY KEY
(or aUNIQUE
constraint) will, first, ensure that these values are really unique, and, second, will greatly improve the searches for a given enquiry.A
PRIMARY KEY
implies creating an index over(enq, pdt, rev)
, and this query:will complete in a single index seek.
Without the index, this query will require scanning the whole table, and there is no guarantee that you won't end up with the duplicates.
Unless for very, very, very special conditions (like heavily inserted log tables), you should always have a
PRIMARY KEY
on your tables.就我个人而言,我总是在所有表上都有某种主键,即使它是一个不用于其他目的的自动递增数字
至于规范化,我认为应该努力实现规范化表,但实际上有很多充分的理由表格设计很好,但没有规范化。这就是数据库设计的“理论”与现实的结合 - 但最好知道什么是标准化,努力实现它,并在偏离规则时有充分的理由(而不是仅仅忽视规则或更糟糕的是忽略良好的设计规则)。
Personally, I ALWAYS always have some sort of primary key on all tables, even if it is an auto-incrment number used for nothing else
As to normalization, I think one should strive for normalized tables, but in reality there are many good reasons when a table design is good, but not normalized. This is where the 'theory' of DB design meets the reality - but it is good to know what normalization is, strive for it, and have good reasons when you are deviating from the rules (as opposed to just being ignorant of the rules or worse ignoring good design rules).
这是两个问题。
(1) 并不总是需要有一个自动递增密钥。不过它很实用,因为您可以使用它来轻松操作数据。没有重复项也不是必须的。
(2) 当你在学校做作业时,规范化是必须的,但如果事情变得困难,你可以打破它,以便在不危及数据完整性的情况下让你的生活更轻松。
These are two questions.
(1) It is not required to have an auto increment key always. It is practical though, since you can use it for easy manipulation of your data. Also having no duplicates is not a must.
(2) Normalization is a must when you do homework for school, but if things get tough you can break it in order to make your life easier if you do not endanger your data integrity.
我在这一点上与众人不同。请勿将您的询价参考号 (text)、产品编号 (int) 和修订号 (int) 作为主键。您表示查询参考是文本类型,您的意思是它的宽度为 25、50 或 500 个字符?如果主键是由这些字段组成的,在我看来它将太宽,因为它将附加到为该表创建的每个索引中,从而通过三个字段和任何需要使用的表的大小来增加每个索引行的大小返回该表的外键也需要这三个字段。
将这三个字段设为唯一索引。将自增值作为主键并使其成为聚集索引。将链接回该主表的表将在内存中占用较小的空间,以将数据从表一链接到表二。
就归一化而言,如果您的数据只有几千行,甚至 50,000 或 500,000 行,那么无论是否归一化都没有关系。当数据开始大于可用 RAM 缓存时,就会出现问题。
设计一个视图以将数据呈现给应用程序以满足业务规则。设计存储过程来接受要存储的数据。设计表结构以满足 SLA 中的响应时间。如果您必须规范化或非规范化、分区或索引或获取更大的服务器来满足 SLA,应用程序将永远不会知道,因为您始终通过满足业务规则的视图提供数据。
I am splitting from the herd on this one. Do NOT make your enquiry reference (text), product number (int) and revision number (int) the primary key. You indicated the enquiry reference was a text type and did you mean it would be 25 or 50 or 500 characters wide? If the primary key is made from those fields it will be too wide in my view as it will be appended to every index created for that table increasing the size of every index row by the size of the three fields and any table which needs to use a foreign key back to this table will also need the three fields.
Make the three fields a unique index. Place an auto-increment value as the primary key and make it the clustered index. The tables which will link back to this master table will have a small footprint in memory to link the data from table one to table two.
As far as normalized goes it does not matter, normalized or not, if your data is only a few thousand rows, or even 50,000 or 500,000. When the data starts getting bigger than the available RAM cache then it is an issue.
Design a view to present the data to the application to fulfill the business rule. Design stored procedures to accept data to store. Design the table stucture to meet the response time in the SLA. If you have to normalize or denormalize or patrtition or index or get a bigger server to meet the SLA the app will never know because you are always supplying the data via the view which meets the business rule.