Normalization in plain English

Posted 2024-08-23 14:50:27

I understand the concept of database normalization, but always have a hard time explaining it in plain English, especially for a job interview. I have read the Wikipedia article, but still find it hard to explain the concept to non-developers. "Design a database in a way that avoids duplicated data" is the first thing that comes to mind.

Does anyone have a nice way to explain the concept of database normalization in plain English? And what are some nice examples to show the differences between first, second and third normal forms?

Say you go to a job interview and the person asks: Explain the concept of normalization and how you would go about designing a normalized database.

What key points are the interviewers looking for?

Comments (11)

美男兮 2024-08-30 14:50:27

Well, if I had to explain it to my wife, it would go something like this:

The main idea is to avoid duplication of large data.

Let's take a list of people and the countries they come from. Instead of storing a country name that can be as long as "Bosnia & Herzegovina" for every person, we simply store a number that references a table of countries. So instead of holding 100 copies of "Bosnia & Herzegovina", we hold 100 copies of #45. Now if, as often happens with Balkan countries, it splits into two countries in the future, Bosnia and Herzegovina, I will have to change it in only one place. Well, sort of.

Now, to explain 2NF, I would change the example: let's assume that we hold the list of countries every person visited.
Instead of holding a table like:

Person   CountryVisited   AnotherInformation   D.O.B.
Faruz    USA              Blah Blah            1/1/2000
Faruz    Canada           Blah Blah            1/1/2000

I would create three tables: one with the list of countries, one with the list of persons, and one to connect them. That gives me the most freedom to change a person's information or a country's information, and it lets me "remove duplicate rows" as normalization expects.
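To make the three-table idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (person, country, visit) are my own assumptions for illustration, not something from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person  (id INTEGER PRIMARY KEY, name TEXT, dob TEXT);
    CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT);
    -- hypothetical linking table: one row per (person, country) pair
    CREATE TABLE visit (
        person_id  INTEGER REFERENCES person(id),
        country_id INTEGER REFERENCES country(id),
        PRIMARY KEY (person_id, country_id)
    );
    INSERT INTO person  VALUES (1, 'Faruz', '2000-01-01');
    INSERT INTO country VALUES (1, 'USA'), (2, 'Canada');
    INSERT INTO visit   VALUES (1, 1), (1, 2);
""")

# The date of birth is stored once, so a single UPDATE corrects it everywhere.
conn.execute("UPDATE person SET dob = '2000-01-02' WHERE id = 1")

# Rebuild the original "wide" listing with a join when it is needed.
visited = conn.execute("""
    SELECT p.name, c.name, p.dob
    FROM visit v
    JOIN person  p ON p.id = v.person_id
    JOIN country c ON c.id = v.country_id
    ORDER BY c.name
""").fetchall()
print(visited)
```

The wide table from the example can still be produced on demand by the join, but each fact about Faruz lives in exactly one row.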

老娘不死你永远是小三 2024-08-30 14:50:27

One-to-many relationships should be represented as two separate tables connected by a foreign key. If you try to shove a logical one-to-many relationship into a single table, then you are violating normalization which leads to dangerous problems.

Say you have a database of your friends and their cats. Since a person may have more than one cat, we have a one-to-many relationship between persons and cats. This calls for two tables:

Friends
Id | Name | Address
-------------------------
1  | John | The Road 1
2  | Bob  | The Belltower


Cats
Id | Name   | OwnerId 
---------------------
1  | Kitty  | 1
2  | Edgar  | 2
3  | Howard | 2

(Cats.OwnerId is a foreign key to Friends.Id)

The above design is fully normalized and conforms to all known normalization levels.
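As a rough sketch, the two tables above could be declared like this with Python's sqlite3 module. The lowercase table and column names and the new address 'The Lighthouse' are assumptions for illustration. Note that SQLite only enforces foreign keys when the pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
    CREATE TABLE friends (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE cats (
        id       INTEGER PRIMARY KEY,
        name     TEXT,
        owner_id INTEGER NOT NULL REFERENCES friends(id)
    );
    INSERT INTO friends VALUES (1, 'John', 'The Road 1'), (2, 'Bob', 'The Belltower');
    INSERT INTO cats VALUES (1, 'Kitty', 1), (2, 'Edgar', 2), (3, 'Howard', 2);
""")

# Bob moves: his address is stored once, so one UPDATE is enough.
conn.execute("UPDATE friends SET address = 'The Lighthouse' WHERE name = 'Bob'")
bobs_cats = conn.execute("""
    SELECT c.name, f.address
    FROM cats c JOIN friends f ON f.id = c.owner_id
    WHERE f.name = 'Bob'
    ORDER BY c.name
""").fetchall()

# A cat that points at a non-existent friend is rejected by the foreign key.
try:
    conn.execute("INSERT INTO cats VALUES (4, 'Stray', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Both of Bob's cats pick up the new address through the join, because the address itself was never copied into the cats table.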

But say I had tried to represent the above information in a single table like this:

Friends and cats
Id | Name | Address       | CatName
-----------------------------------
1  | John | The Road 1    | Kitty     
2  | Bob  | The Belltower | Edgar  
3  | Bob  | The Belltower | Howard 

(This is the kind of design I might have made if I were used to Excel sheets but not relational databases.)
A single-table approach forces me to repeat some information if I want the data to be consistent. The problem with this design is that some facts, like the information that Bob's address is "The Belltower", are stored twice. That is redundant, it makes the data harder to query and change, and (worst of all) it makes it possible to introduce logical inconsistencies.

E.g. if Bob moves, I have to make sure I change the address in both rows. If Bob gets another cat, I have to be sure to repeat the name and address exactly as typed in the other two rows. If I make a typo in Bob's address in one of the rows, suddenly the database has inconsistent information about where Bob lives. The un-normalized database cannot prevent the introduction of inconsistent and self-contradictory data, and hence the database is not reliable. This is clearly not acceptable.

Normalization cannot prevent you from entering wrong data. What normalization prevents is the possibility of inconsistent data.
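The inconsistency described above is easy to reproduce. This is a small sketch with Python's sqlite3 module; the table name friends_and_cats and the new address are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE friends_and_cats (
        id INTEGER PRIMARY KEY, name TEXT, address TEXT, cat_name TEXT
    );
    INSERT INTO friends_and_cats VALUES
        (1, 'John', 'The Road 1',    'Kitty'),
        (2, 'Bob',  'The Belltower', 'Edgar'),
        (3, 'Bob',  'The Belltower', 'Howard');
""")

# Bob moves, but the update accidentally touches only one of his two rows.
conn.execute("UPDATE friends_and_cats SET address = 'The Lighthouse' WHERE id = 2")

# The database now holds two contradictory answers to "where does Bob live?"
addresses = conn.execute("""
    SELECT DISTINCT address FROM friends_and_cats
    WHERE name = 'Bob' ORDER BY address
""").fetchall()
print(addresses)
```

Nothing in the single-table schema stops this from happening; in the two-table design the same mistake is not possible because Bob has only one address row.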

It is important to note that normalization depends on business decisions. If you have a customer database, and you decide to only record a single address per customer, then the table design (#CustomerID, CustomerName, CustomerAddress) is fine. If, however, you decide to allow each customer to register more than one address, then the same table design is not normalized, because you now have a one-to-many relationship between customer and address. Therefore you cannot just look at a database to determine whether it is normalized; you have to understand the business model behind the database.

你是年少的欢喜 2024-08-30 14:50:27

This is what I ask interviewees:

Why don't we use a single table for an application instead of using multiple tables?

The answer is, of course, normalization. As already said, it's to avoid redundancy and thereby update anomalies.

倦话 2024-08-30 14:50:27

This is not a thorough explanation, but one goal of normalization is to allow for growth without awkwardness.

For example, if you've got a user table, and every user is going to have one and only one phone number, it's fine to have a phonenumber column in that table.

However, if each user is going to have a variable number of phone numbers, it would be awkward to have columns like phonenumber1, phonenumber2, etc. This is for two reasons:

  • If your columns go up to phonenumber3 and someone needs to add a fourth number, you have to add a column to the table.
  • For all the users with fewer than 3 phone numbers, there are empty columns on their rows.

Instead, you'd want to have a phonenumber table, where each row contains a phone number and a foreign key reference to which row in the user table it belongs to. No blank columns are needed, and each user can have as few or many phone numbers as necessary.
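A minimal sketch of that layout with Python's sqlite3 module follows. The user names and the phone numbers are made up for illustration; only the user and phonenumber table names come from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE phonenumber (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES user(id),
        number  TEXT NOT NULL
    );
    INSERT INTO user VALUES (1, 'Ann'), (2, 'Ben');
    -- Ann has four numbers; a fourth number is just another row, not a new column
    INSERT INTO phonenumber (user_id, number) VALUES
        (1, '555-0100'), (1, '555-0101'), (1, '555-0102'), (1, '555-0103');
""")

# Ben has no numbers and therefore no rows at all, rather than empty columns.
counts = conn.execute("""
    SELECT u.name, COUNT(p.id)
    FROM user u LEFT JOIN phonenumber p ON p.user_id = u.id
    GROUP BY u.id
    ORDER BY u.id
""").fetchall()
print(counts)
```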

凉城已无爱 2024-08-30 14:50:27

One side point to note about normalization: A fully normalized database is space efficient, but is not necessarily the most time efficient arrangement of data depending on use patterns.

Skipping around multiple tables to look up all the pieces of info from their normalized locations takes time. In high-load situations (millions of rows per second flying around, thousands of concurrent clients, as in, say, credit card transaction processing) where time is more valuable than storage space, appropriately denormalized tables can give better response times than fully normalized tables.

For more info on this, look for SQL books written by Ken Henderson.

一梦等七年七年为一梦 2024-08-30 14:50:27

I would say that normalization is like keeping notes to do things efficiently, so to speak:

If you had a note that said you had to
go shopping for ice cream without
normalization, you would then have
another note, saying you have to go
shopping for ice cream, just one in
each pocket.

Now, in real life, you would never do
this, so why do it in a database?

For the designing and implementing part, that's when you can move back to "the lingo" and away from layman's terms, but I suppose you could simplify. You would say what you needed to at first, and then when normalization comes into it, you say you'll make sure of the following:

  1. There must be no repeating groups of information within a table
  2. No table should contain data that is not functionally dependent on that table's primary key
  3. For 3NF I like Bill Kent's take on it: Every non-key attribute must provide a fact about the key, the whole key, and nothing but the key.

I think it may be more impressive if you speak of denormalization as well, and the fact that you cannot always have the best structure AND be in normal forms.

怀里藏娇 2024-08-30 14:50:27

Normalization is a set of rules used to design tables that are connected through relationships.

It helps in avoiding repetitive entries, reducing required storage space, preventing the need to restructure existing tables to accommodate new data, and increasing the speed of queries.

First Normal Form: Data should be broken up into the smallest units. Tables should not contain repeating groups of columns. Each row is identified by a primary key made of one or more columns.
For example, if there is a column named 'Name' in a 'Customer' table, it should be broken into 'First Name' and 'Last Name'. Also, 'Customer' should have a column named 'CustomerID' to identify a particular customer.

Second Normal Form: Each non-key column should be directly related to the entire primary key.
For example, if a 'Customer' table has a column named 'City', the city should have a separate table with a primary key and city name defined. In the 'Customer' table, replace the 'City' column with 'CityID' and make 'CityID' a foreign key in the table.

Third Normal Form: Each non-key column should not depend on other non-key columns.
For example, in an order table, the column 'Total' is dependent on 'Unit price' and 'Quantity', so the 'Total' column should be removed.
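For the 'Total' example, here is a small sketch with Python's sqlite3 module of what removing the column can look like in practice: the total is computed in the query instead of being stored. The table name order_line and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_line (
        id         INTEGER PRIMARY KEY,
        product    TEXT,
        unit_price REAL,
        quantity   INTEGER
    );
    INSERT INTO order_line VALUES (1, 'Widget', 2.5, 4), (2, 'Gadget', 10.0, 3);
""")

# Changing the quantity cannot leave a stale total behind, because none is stored.
conn.execute("UPDATE order_line SET quantity = 6 WHERE id = 1")

totals = conn.execute("""
    SELECT product, unit_price * quantity AS total
    FROM order_line
    ORDER BY id
""").fetchall()
print(totals)
```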

最佳男配角 2024-08-30 14:50:27

I teach normalization in my Access courses and break it down a few ways.

After discussing the precursors to storyboarding or planning out the database, I then delve into normalization. I explain the rules like this:

Each field should contain the smallest meaningful value:

I write a name field on the board and then place a first name and last name in it like Bill Lumbergh. We then query the students and ask them what we will have problems with, when the first name and last name are all in one field. I use my name as an example, which is Jim Richards. If the students do not lead me down the road, then I yank their hand and take them with me. :) I tell them that my name is a tough name for some, because I have what some people would consider 2 first names and some people call me Richard. If you were trying to search for my last name then it is going to be harder for a normal person (without wildcards), because my last name is buried at the end of the field. I also tell them that they will have problems with easily sorting the field by last name, because again my last name is buried at the end.

I then let them know that what counts as meaningful also depends on the audience who is going to be using the database. We, at our job, will not need a separate field for apartment or suite number if we are storing people's addresses, but shipping companies like UPS or FedEx might need it separated out to easily pull up the apartment or suite they need to go to when they are on the road and running from delivery to delivery. So it is not meaningful to us, but it is definitely meaningful to them.

Avoiding Blanks:

I use an analogy to explain to them why they should avoid blanks. I tell them that Access and most databases do not store blanks like Excel does. Excel does not care if you have nothing typed in the cell and will not increase the file size, but Access will reserve that space until the point in time when you actually use the field. So even if it is blank, it will still be using up space, and I explain to them that it also slows their searches down.
The analogy I use is empty shoe boxes in the closet. If you have shoe boxes in the closet and you are looking for a pair of shoes, you will need to open up and look in each of the boxes for a pair of shoes. If there are empty shoe boxes, then you are just wasting space in the closet and also wasting time when you need to look through them for that certain pair of shoes.

Avoiding redundancy in data:

I show them a table that has lots of repeated values for customer information and then tell them that we want to avoid duplicates, because I have sausage fingers and will mistype values if I have to type in the same thing over and over again. This “fat-fingering” of data will lead to my queries not finding the correct data. Instead, we will break the data out into a separate table and create a relationship using a primary and foreign key field. This way we are saving space, because we are not typing the customer's name, address, etc. multiple times and instead are just using the customer's ID number in a field for the customer. We then discuss drop-down lists/combo boxes/lookup lists or whatever else Microsoft wants to name them later on. :) You as a user will not want to look up and type out the customer's number each time in that customer field, so we will set up a drop-down list that gives you a list of customers, where you can select their name and it will fill in the customer’s ID for you. This will be a 1-to-many relationship, where 1 customer will have many different orders.

Avoiding repeated groups of fields:

I demonstrate this when talking about many-to-many relationships. First, I draw 2 tables, 1 that will hold employee information and 1 that will hold project information. The tables are laid out similar to this.

(Table1)
tblEmployees
* EmployeeID
First
Last
(Other Fields)….
Project1
Project2
Project3
Etc.
**********************************
(Table2)
tblProjects
* ProjectNum
ProjectName
StartDate
EndDate
…..

I explain to them that this would not be a good way of establishing a relationship between an employee and all of the projects that they work on. First, a new employee will not have any projects, so we would be wasting all of those fields. Second, an employee who has been here a long time might have worked on 300 projects, so we would have to include 300 project fields. Those people that are new and only have 1 project will have 299 wasted project fields. This design is also flawed because I will have to search in each of the project fields to find all of the people that have worked on a certain project, because that project number could be in any of the project fields.
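The usual way around the Project1, Project2, Project3 fields is a third table that holds one row per employee/project pairing. The sketch below uses Python's sqlite3 module rather than Access. The junction table tblAssignments, the FirstName/LastName column names, and the sample rows are my own assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblEmployees (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE tblProjects  (ProjectNum INTEGER PRIMARY KEY, ProjectName TEXT);
    -- hypothetical junction table: one row per employee/project assignment
    CREATE TABLE tblAssignments (
        EmployeeID INTEGER REFERENCES tblEmployees(EmployeeID),
        ProjectNum INTEGER REFERENCES tblProjects(ProjectNum),
        PRIMARY KEY (EmployeeID, ProjectNum)
    );
    INSERT INTO tblEmployees VALUES
        (1, 'Bill', 'Lumbergh'), (2, 'Jim', 'Richards'), (3, 'New', 'Hire');
    INSERT INTO tblProjects VALUES (100, 'Alpha'), (200, 'Beta');
    INSERT INTO tblAssignments VALUES (1, 100), (1, 200), (2, 100);
""")

# Everyone on project 100: one column to search instead of 300 project fields.
on_alpha = conn.execute("""
    SELECT e.FirstName, e.LastName
    FROM tblAssignments a
    JOIN tblEmployees e ON e.EmployeeID = a.EmployeeID
    WHERE a.ProjectNum = 100
    ORDER BY e.EmployeeID
""").fetchall()
print(on_alpha)
```

The new hire with no projects simply has no rows in the junction table, and a long-serving employee gets as many rows as they have projects.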

I covered a fair amount of the basic concepts. Let me know if you have other questions or need help with clarification or breaking it down in plain English. The wiki page did not read as plain English and might be daunting for some.

紫南 2024-08-30 14:50:27

I've read the wiki links on normalization many times, but I have found a better overview of normalization in this article. It is a simple, easy-to-understand explanation of normalization up to fourth normal form. Give it a read!

Preview:

What is Normalization?

Normalization is the process of
efficiently organizing data in a
database. There are two goals of the
normalization process: eliminating
redundant data (for example, storing
the same data in more than one table)
and ensuring data dependencies make
sense (only storing related data in a
table). Both of these are worthy goals
as they reduce the amount of space a
database consumes and ensure that data
is logically stored.

http://databases.about.com/od/specificproducts/a/normalization.htm

北音执念 2024-08-30 14:50:27

Database normalization is a formal process of designing your database to eliminate redundant data. The design consists of:

  • planning what information the database will store
  • outlining what information users will request from it
  • documenting the assumptions for review

Use a or some other metadata representation to verify the design.

The biggest problem with normalization is that you end up with multiple tables representing what is conceptually a single item, such as a user profile. Don't worry about normalizing data in tables that will have records inserted but not updated, such as history logs or financial transactions.


一片旧的回忆 2024-08-30 14:50:27

+1 for the analogy of talking to your wife. I find that anyone without a tech mind needs some easing into this type of conversation.

but...

To add to this conversation, there is the other side of the coin (which can be important when in an interview).

When normalizing, you have to watch how the databases are indexed and how the queries are written.

In a truly normalized database, I have found that in some situations it is easier to write queries that are slow because of bad join operations, bad indexing on the tables, and plain bad design of the tables themselves.

Bluntly, it's easier to write bad queries against highly normalized tables.

I think for every application there is a middle ground. At some point you want the ease of getting everything out of a few tables, without having to join a ton of tables to get one data set.
