该表违反了哪种范式?

发布于 2024-12-07 11:37:14 字数 1151 浏览 0 评论 0原文

考虑这个表:

   +-------+-------+-------+-------+  
   | name  |hobby1 |hobby2 |hobby3 |  
   +-------+-------+-------+-------+   
   | kris  | ball  | swim  | dance |  
   | james | eat   | sing  | sleep |  
   | amy   | swim  | eat   | watch |  
   +-------+-------+-------+-------+

爱好的类型没有优先级,因此所有爱好都属于同一个域。也就是说,表中的爱好可以移动到任何 hobby# 列上。无论在哪个栏目中,特定的爱好都可以在任何栏目中。

此表违反了哪条数据库规范化规则?


编辑

问:“兴趣爱好列表 [...] 的顺序是任意的”吗?

答:是的。

问:表有主键吗?

答:是的,假设键是名为 user_idAUTO_INCRMENT 列类型。

问题是hobby#列是否是重复组。


旁注:这不是作业。这是一场辩论,始于问题的评论 SQL - 根据多个列将一个表中的记录匹配到另一个表。我相信这个问题是违反 1NF 的一个明显例子。

然而,另一个人认为我“已经陷入了 1NF 的谬误之一。”该论点基于文章 关于第一范式的事实和谬误

我写这篇文章并不是为了羞辱他、我或任何人。我写这篇文章是因为我可能是错的,而且我显然遗漏了一些东西,也许这个人没有向我解释得足够好。

Consider this table:

   +-------+-------+-------+-------+  
   | name  |hobby1 |hobby2 |hobby3 |  
   +-------+-------+-------+-------+   
   | kris  | ball  | swim  | dance |  
   | james | eat   | sing  | sleep |  
   | amy   | swim  | eat   | watch |  
   +-------+-------+-------+-------+

There is no priority on the types of hobbies, thus all the hobbies belong to the same domain. That is, the hobbies in the table can be moved on any hobby# column. It doesn't matter on which column, a particular hobby can be in any column.

Which database normalization rule does this table violate?


Edit

Q. Is "the list of hobbies [...] in an arbitrary order"?

A. Yes.

Q. Does the table have a primary key?

A. Yes, suppose the key is an AUTO_INCREMENT column type named user_id.

The question is if the columns hobby# are repeating groups or not.


Sidenote: This is not a homework. It's kind of a debate, which started in the comments of the question SQL - match records from one table to another table based on several columns. I believe this question is a clear example of the 1NF violation.

However, the other guy believes that I "have fallen fowl of one of the fallacies of 1NF." That argument is based on the section "The ambiguity of Repeating Groups" of the article Facts and Fallacies about First Normal Form.

I am not writing this to humiliate him, me, or whomever. I am writing this, because I might be wrong, and there is something I am clearly missing and maybe this guy is not explaining it good enough to me.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(7

薄凉少年不暖心 2024-12-14 11:37:14

你说爱好属于同一个领域,可以在列中移动。如果你的意思是,对于任何特定的name,爱好列表是任意顺序的,并且 kriss 可以轻松地拥有舞蹈、球、游泳和球、游泳、舞蹈,那么我会说你有一个重复组并且该表违反了 1NF。

另一方面,如果特定人的第一和第二爱好之间存在一些基本的语义差异,那么可能会出现这样的争论:这些爱好不是重复的组,并且该表可能处于 3NF 中(假设爱好列是FK 到爱好表)。我认为这个论点即使存在,也是薄弱的。

另一个需要考虑的因素是为什么恰好有 3 个爱好,以及更多或更少的爱好是否是一个潜在的问题。这个因素对于规范化来说并不重要,而对于设计的灵活性来说更重要。这是我将爱好分成几行的原因之一,即使它们在语义上彼此不同。

You say that the hobbies belong to the same domain and that they can move around in the columns. If by this you mean that for any specific name the list of hobbies is in an arbitrary order and kriss could just as easily have dance, ball, swim as ball, swim, dance, then I would say you have a repeating group and the table violates 1NF.

If, on the other hand, there is some fundamental semantic difference between a particular person's first and second hobbies, then there may be an argument for saying that the hobbies are not repeating groups and the table may be in 3NF (assuming that hobby columns are FK to a hobby table). I would suggest that this argument, if it exists, is weak.

One other factor to consider is why there are precisely 3 hobbies and whether more or fewer hobbies are a potential concern. This factor is important not so much for normalization as for flexibility of design. This is one reason I would split the hobbies into rows, even if they are semantically different from one-another.

变身佩奇 2024-12-14 11:37:14

你的三爱好表设计可能违反了我通常所说的原始 1NF 的精神可能出于 dportas 和其他人给出的原因)。

然而事实证明,要找到[一套]正式且精确的“可衡量”标准来准确地表达最初的“精神”是极其困难的。这就是你的另一个人试图解释的“重复组的模糊性”。

这里强调“正式”、“精确”和“可衡量”。所有其他范式的定义都满足“形式”、“精确”和“可测量”(即客观可观察)。对于 1NF 来说,这很难(/不可能???)。如果你想知道为什么,试试这个:

你说问题是“这三个爱好栏是否构成一个重复组”。用“是”回答这个问题,然后为你的答案提供严格的正式基础。

您不能只是说“列名相同,除了编号后缀”。要使违反此类规则的行为客观地可观察/可测量,需要枚举所有可能的后缀方式。

您不能只说“游泳,网球”同样可以是“网球,游泳”,因为要了解这一点确实需要检查表的外部谓词。如果这只是“人<名字>有爱好<爱好1>并且也有<爱好2>” ,那么实际上两者都是同样有效的(此外:并且由于封闭世界假设,实际上需要所有可能的爱好排列都出现在表中!!!)。然而,如果该外部谓词是“人上花费最多的时间,在上花费最少的时间”,那么“游泳、网球”可能同样适合“网球、游泳”。但是如何对表目标的外部谓词进行这样的解释(对于所有可能的谓词)???

等等等等

Your three-hobby table design probably violates what I usually call the spirit of the original 1NF (probably for the reasons given by dportas and others).

It turns out however, that it is extremely difficult to find [a set of] formal and precise "measurable" criteria that accurately express that original "spirit". That's what your other guy was trying to explain talking about "the ambiguity of repeating groups".

Stress "formal", "precise" and "measurable" here. Definitions for all other normal forms exist that satisfy "formal", "precise" and "measurable" (i.e. objectively observable). For 1NF it's just hard (/impossible ???) to do. If you want to see why, try this :

You stated that the question was "whether those three hobby columns constitute a repeating group". Answer this question with "yes", and then provide a rigorous formal underpinning for your answer.

You cannot just say "the column names are the same, except for the numbered suffix". Making a violation of such a rule objectively observable/measurable would require to enumerate all the possible ways of suffixing.

You cannot just say "swim, tennis" could equally well be "tennis, swim", because getting to know that for sure requires inspecting the external predicate of the table. If that is just "person <name> has hobby <hobby1> and also has <hobby2>" , then indeed both are equally valid (aside : and due to the closed world assumption it would in fact require all possible permutations of the hobbies to be present in the table !!!). However, if that external predicate is "person <name> spends the most time on <hobby1> and the least on <hobby2>", then "swim, tennis" could NOT equally well be "tennis,swim". But how do you make such interpretations of the external predicate of the table objective (for ALL POSSIBLE PREDICATES) ???

etc. etc.

鸠魁 2024-12-14 11:37:14

这显然“看起来”像是一个设计错误。

当这些数据只是存储和检索时,这并不是一个设计错误。您只需要 3 个爱好,并且您不打算以检索以外的任何其他方式使用此数据。

让我们考虑一下这种关系:

  • Hobby1是一个人一生中某个时刻(例如18岁之前)的主要爱好,
  • Hobby2是另一个时刻(19-30岁)的主要爱好,
  • Hobby3是她在另一个时刻的爱好。

那么这张表看起来绝对设计得很好,虽然 1NF 约定受到尊重,但命名可以说是“糟糕”。

在不加区别地存储爱好的情况下,在我现在能想到的大多数(如果不是全部)情况下,这显然是错误的。您的表有重复的行,这违反了 1NF 原则。

当您需要对分页结果进行排序或任何其他实际原因时,我们不要考虑从该表访问数据的 SQL 请求的效率降低。

让我们考虑一下当其他开发人员或团队使用您的数据库时处理您的数据所需的工作量:

  • 这里的数据是“分散的”。您必须查看多个列来聚合相关数据。
  • 你的爱好仅限于其中 3 种。
  • 您不能使用简单的规则来建立唯一性(每个用户只能有一次相同的爱好)。

你基本上会制造挫败感、愤怒和仇恨,原力就会受到干扰。

This clearly "looks" like a design error.

It's not not a design error when this data is simply stored and retrieved. You need only 3 of the hobbies and you don't intend to use this data in any other way than retrieve.

Let's consider this relationship:

  • Hobby1 is the main hobby at some point in a person's life (before 18 years of age for example)
  • Hobby2 is the hobby at another point (19-30)
  • Hobby3 is her hobby at a another one.

Then this table seems definitely well designed and while the 1NF convention is respected the naming arguably "sucks".

In the case of an indiscriminate storage of hobbies this is clearly wrong in most if not all cases I can think of right now. Your table has duplicate rows which goes against the 1NF principles.

Let's not consider the reduced efficiency of SQL requests to access data from this table when you need to sort the results for paging or any other practical reason.

Let's take into consideration the effort required to work with your data when your database will be used by another developer or team:

  • The data here is "scattered". You have to look in multiple columns to aggregate related data.
  • You are limited to only 3 of the hobbies.
  • You can't use simple rules to establish unicity (same hobby only once per user).

You basically create frustration, anger and hatred and the Force is disturbed.

淡淡绿茶香 2024-12-14 11:37:14

好吧,

重点是,只要所有 hobby1、hobby2 和 hobby3 值都不为空,并且名称是唯一的,该表或多或少可以被视为遵守 1NF 规则(请参阅这里为例 ...)

但是每个人都有3个爱好吗?当然不是!不要忘记数据库基本上应该保存数据作为现实的表示!因此,抛开所有理论,我们不能说每个人都有 3 种爱好,除非……我们的表是用来保存与拥有三种爱好的人相关的数据,并且它们之间没有任何偏好

这就是说,假设我们处于一般情况,正确的模型可能适合

+------------+-------+
| id_person  |name   |
+------------+-------+  

个人(不要忘记唯一的密钥。我不认为“名字”

+------------+-------+
| id_hobby   |name   |
+------------+-------+ 

对于爱好来说是一个好的模型。id_hobby密钥理论上不是强制性的,因为爱好名称可以是人与爱好之间的联系的关键

+------------+-----------+
| id_person  |id_hobby   |
+------------+-----------+  

,因为人与其爱好之间存在的多对多联系的物理表示

我的建议是基本的,并且可以满足理论。是在很多方面都得到了改善...

Well,

The point is that, as long as all hobby1, hobby2 and hobby3 values are not null, AND names are unique, this table could be considered more or less as abbiding by 1NF rules (see here for example ...)

But does everybody has 3 hobbies? Of course not! Do not forget that databases are basically supposed to hold data as a representation of reality! So, away of all theories, one cannot say that everybody has 3 hobbies, except if ... our table is done to hold data related to people that have three hobbies without any preference between them!

This said, and supposing that we are in the general case, the correct model could be

+------------+-------+
| id_person  |name   |
+------------+-------+  

for the persons (do not forget to unique key. I do not think 'name' is a good one

+------------+-------+
| id_hobby   |name   |
+------------+-------+ 

for the hobbies. id_hobby key is theoretically not mandatory, as hobby name can be the key ...

+------------+-----------+
| id_person  |id_hobby   |
+------------+-----------+  

for the link between persons and hobbies, as the physical representation of the many-to-many link that exists between persons and their hobbies.

My proposal is basic, and satisfies theory. It can be improved in many ways ...

星星的軌跡 2024-12-14 11:37:14

如果不知道存在哪些键以及表应该满足哪些依赖关系,就不可能确定它满足什么范式。我们所能做的就是根据您的属性名称进行猜测。

桌子有钥匙吗?举例来说,假设 Name 是候选键。如果每个元组的其他每个属性都允许有一个值(这意味着没有属性可以为空),则该表至少处于第一范式。

Without knowing what keys exist and what dependencies the table is supposed to satisfy it's impossible to determine for sure what Normal Form it satisfies. All we can do is make guesses based on your attribute names.

Does the table have a key? Suppose for the sake of example that Name is a candidate key. If there is exactly one value permitted for each of the other attributes for each tuple (which means that no attribute can be null) then the table is in at least First Normal Form.

晨光如昨 2024-12-14 11:37:14

如果表中的任何列接受空值,则该表违反第一范式。假设没有空值,@dportas 已经提供了正确的答案。

If any of the columns in the table accept nulls then then the table violates first normal form. Assuming no nulls, @dportas has already provided the correct answer.

雪落纷纷 2024-12-14 11:37:14

该表违反第一范式。

第一范式没有禁止同一类型的多个列。只要它们具有不同的列名称就可以了。

对“重复组”的禁止涉及嵌套记录 - 这种结构在分层数据库中很常见,但通常在关系数据库中不可能。

使用重复组的表看起来像这样:

+-------+--------+  
| name  |hobbies |  
+-------+--------+
| kris  |+-----+ |  
|       ||ball | |
|       |+-----+ |
|       ||swim | |
|       |+-----+ |
|       ||dance| |
|       |+-----+ |
+-------+--------+
| james |+-----+ |  
|       ||eat  | |
|       |+-----+ |
|       ||sing | |
|       |+-----+ |
|       ||sleep| |
|       |+-----+ |
+-------+--------+
| amy   |+-----+ |  
|       ||swim | |
|       |+-----+ |
|       ||eat  | |
|       |+-----+ |
|       ||watch| |
|       |+-----+ |
+-------+--------+

在符合 1NF 的表中,可以通过表名、主键和列名找到所有值。但这对于重复的组是不可能的,需要进一步的导航。

The table does not violate first normal form.

First normal form does not have any prohibition against multiple columns of the same type. As long as they have distinct column names, it is fine.

The prohibition against "Repeating Groups" concerns nested records - a structure which is common in hierarchical databases, but typically not possible in relational databases.

The table using repeating groups would look something like this:

+-------+--------+  
| name  |hobbies |  
+-------+--------+
| kris  |+-----+ |  
|       ||ball | |
|       |+-----+ |
|       ||swim | |
|       |+-----+ |
|       ||dance| |
|       |+-----+ |
+-------+--------+
| james |+-----+ |  
|       ||eat  | |
|       |+-----+ |
|       ||sing | |
|       |+-----+ |
|       ||sleep| |
|       |+-----+ |
+-------+--------+
| amy   |+-----+ |  
|       ||swim | |
|       |+-----+ |
|       ||eat  | |
|       |+-----+ |
|       ||watch| |
|       |+-----+ |
+-------+--------+

In a table conforming to 1NF all values can be located though table name, primary key, and column name. But this is not possible with repeated groups, which require further navigation.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文