数据库规范化问题
我刚刚开始学习数据库规范化,我对我的一个表有疑问。我的数据库现在的结构非常糟糕,原因之一是我有一个如下所示的表。
客户表
ID | Date_Entered | First_Name | Middle_Name | Last_Name | Maiden_Name
...
Address__street_dmv | Address_city_dmv | Address_state_dmv | Address_zip_dmv
...
Address__street_source2 | Address_state_source2 | Address_city_source2 | etc
.
这些地址不断出现,因为我的公司从多个来源获取地址数据。但是,当然,对于我们的某些客户来说,其中一些地址将为空。所以我认为我需要一个像这样连接到客户表的单独地址表。
。
地址
ID | Number | Street | State | Zip | Source (drop down menu)
但后来我认为来源将是冗余数据。那么,我需要一个像这样的单独的源表吗?
来源
Source_ID | Source
并像这样更改地址表?
ID | Number | Street | State | Zip | Source _ID (drop down)
这似乎不对,因为现在 Source_ID 是多余的......请帮忙。
如果您能告诉我是否应该在客户表中包含婚前姓名和中间名,那么我会加分,因为这些也可能为空(如果不是,新表将如何构建?)
抱歉,我是个菜鸟。
I just started learning about database normalization and I have a question about one of my tables. My database right now is structured horribly, and one of the reasons is because I have a table that looks like this.
Customers Table
ID | Date_Entered | First_Name | Middle_Name | Last_Name | Maiden_Name
...
Address__street_dmv | Address_city_dmv | Address_state_dmv | Address_zip_dmv
...
Address__street_source2 | Address_state_source2 | Address_city_source2 | etc
.
The addresses keeping going on and on because my company obtains address data from multiple sources. But, of course, some of these address will be Null for some of our customers. So I think I need a separate addresses table like this that connects to the Customers table.
.
Addresses
ID | Number | Street | State | Zip | Source (drop down menu)
But then I was thinking the source would be redundant data. So, do I need a separate sources table like this?
Sources
Source_ID | Source
And change the addresses table like this?
ID | Number | Street | State | Zip | Source _ID (drop down)
It doesn’t seem right because now the Source_ID is redundant… Please help.
Bonus points if you can tell me whether or not I should include Maiden and Middle names in the Customer table since these too could possibly be Null (If not, how would the new table be structured?)
Sorry for being a noob.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
我会选择类似
Customer
Addresses
Customers_Address
这样您就可以从多个来源获得相同的地址。您可能还希望有单独的街道表,可能就像这样
,然后在
Addresses
表中您将只有Street_ID
而不是Street
和State_ID
。这还允许您在用户选择状态时显示街道选择列表。我想说,在客户表中包含婚前姓名和中间姓名是可以的,即使它们很少使用。
I would go with something like
Customer
Addresses
Customers_Address
This allows you to have same address from multiple sources. You might also want to have separate table for streets, possibly like
and then in the
Addresses
table you would only haveStreet_ID
instead of bothStreet
andState_ID
. This also allows you to show a selection list of streets when user has selected state.I'd say it is OK to have Maiden and Middle names in the Customer table, even if they are rarely used.
您的问题部分与标准化有关,部分则与标准化无关。这并不意味着你的问题的一部分不重要。这只是意味着它的重要性与标准化无关。
从该术语的某种意义上来说,您的地址本质上是一个重复组。因此,将它们从客户中删除确实有意义。 (这与标准化有关;重复组违反了 1NF。)
“源”不是冗余数据,决定是否用 ID 号替换文本与标准化无关。
当您将表从较低范式移至较高范式时,原始表的列数会减少。用 ID 号替换文本不会更改列数。
并且,用无意义的 ID 号替换文本的每一列都需要进行联接才能恢复有意义的文本。按照相同的逻辑,您还可以用无意义的 ID 号替换街道、州和邮政编码,但这需要四个联接才能取回有意义的数据。
Part of your question has to do with normalization, and part of it does not. That doesn't mean part of your question isn't important. It just means it's important for reasons that have nothing to do with normalization.
Your addresses are essentially a repeating group, in one sense of that term. So it does make sense to remove them from Customers. (This has to do with normalization; repeating groups violates 1NF.)
"Source" is not redundant data, and deciding whether to substitute an ID number for text has nothing to do with normalization.
When you move a table from a lower normal form to a higher normal form, the original table ends up with fewer columns. Substituting an ID number for text doesn't change the number of columns.
And every column in which you substitute a meaningless ID number for text requires a join to get the meaningful text back. Following your same logic, you could also substitute meaningless ID numbers for street, state, and zip, but that would then require four joins to get meaningful data back.
我不是 SQL 专家,但这就是我认为您想要描述的内容。
作为唯一实体的客户有一个当前地址,并且可以有许多其他地址,如果这是正确的,那么您应该将其他地址分隔到它们自己的表中。
其次,您发现客户拥有 x 个地址的方式是,您为每个客户获取不同公司的信息,如果是这种情况,我将为公司提供一个单独的表格,并按您的计划记录下来,是的,您将拥有重复 source_id 行,但情况会是这样,因为它们提供有关许多不同客户的信息。
关于婚前姓名和中间名,这些是您的业务规则所要求的,如果需要的话请存储它们。
同样,我的 SQL 开发实际上只是学生级别,但据我了解,这就是我将如何进行的。
希望这会有所帮助,如果有人可以提供更多的专家信息,那就一起去吧。
I'm not an SQL expert but here is what I think you are trying to describe.
A customer which is an unique entity has a current address and can have many other addresses, if this is correct yes you should separate the additional addresses into their own table.
Secondly the way you discover that a customer has x amount of addresses is that you obtain this information different companies for each customer, if this is the case I would have a separate table for the companies and record this as you planned, yes you will have repeating rows of source_id but this would be the case as they supply information about many different customers.
In relation to the maiden and middle names are these required by your business rules if so store them when required.
Again my SQL development is only student level really but from what I understand this is how I would go about it.
Hope this helps and if anyone can give more expert information go with it.
您还可以尝试以下方法:
客户
地址
这假定客户和地址之间存在一对多关系。它还完全消除了 Customer_Address 表,转而使用两个表(Customer 和 Addresses),并将 Addresses 表的复合主键定义为 CustomerId 和 SourceID。在此模型中,CustomerId 和 SourceId 唯一确定 Number、Street、State 和 Zip。它还通过确保每个客户只能拥有来自每个来源的一个地址来强制数据完整性。让我知道这是否有帮助或者我是否偏离了基地。我还在学习!
you could also try the following approach:
Customer
Addresses
This assumes a one-to-many relationship between Customer and Addresses. It also eliminates the Customer_Address table altogether, in favor of using two tables (Customer and Addresses), and defining a compound primary key for the Addresses table as the CustomerId and the SourceID. In this model, the CustomerId and the SourceId uniquely determines the Number, Street, State, and Zip. It also enforces data integrity by ensuring that each customer can only have one address from each source. Let me know if this helps or if I'm way off base. I'm still learning!