Logically speaking, yes. If table X has columns {A, B, C} and A is the primary key, then {A}, {A, B}, {A, C} and {A, B, C} are all superkeys, because if you know the values for any one of those sets, you know all the values in the row (if it exists).
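To make that concrete, here is a minimal sketch that tests every column subset of a small table for uniqueness. The sample rows and the helper name `is_superkey` are made up for illustration, and checking sample data only shows which subsets happen to be unique in that data; a real superkey is a constraint on every possible state of the table.

```python
from itertools import combinations


def is_superkey(rows, columns):
    """Return True if no two rows share the same values for `columns`."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key in seen:
            return False  # two rows collide, so these columns do not identify a row
        seen.add(key)
    return True


# Hypothetical contents of table X: A is unique, B and C repeat.
rows = [
    {"A": 1, "B": "x", "C": 10},
    {"A": 2, "B": "x", "C": 10},
    {"A": 3, "B": "y", "C": 10},
]

# Try every non-empty subset of the columns, smallest first.
superkeys = [
    set(cols)
    for size in range(1, 4)
    for cols in combinations("ABC", size)
    if is_superkey(rows, cols)
]
print(superkeys)
```

With this data, the subsets that pass are exactly the ones that contain A, which matches the four sets listed above.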
However, SQL does not treat it as a key for some purposes. For example, if table Y has columns A and B, you usually cannot define a foreign key Y(A, B) REFERENCES X(A, B), because {A, B} is not declared as a key of X (only {A} is). If you want to be able to declare that foreign key, you must add another UNIQUE constraint on X(A, B), which is inefficient because it duplicates part of the primary key.
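This can be tried out with Python's built-in sqlite3 module. The sketch below is only an illustration under some assumptions: the table and column names follow the X/Y example, the helper `can_reference_a_b` is hypothetical, and the behavior shown is SQLite's. SQLite accepts the foreign key declaration but then refuses to run statements that would need it, whereas some other engines, such as PostgreSQL, reject the declaration itself.

```python
import sqlite3


def can_reference_a_b(with_unique):
    """Check whether Y(a, b) REFERENCES X(a, b) is usable in SQLite."""
    # isolation_level=None gives autocommit, so the PRAGMA takes effect immediately.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    try:
        conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default
        extra = ", UNIQUE (a, b)" if with_unique else ""
        conn.execute(
            f"CREATE TABLE x (a TEXT PRIMARY KEY, b TEXT, c TEXT{extra})"
        )
        conn.execute("INSERT INTO x VALUES ('1', '2', '3')")
        conn.execute(
            "CREATE TABLE y (a TEXT, b TEXT, "
            "FOREIGN KEY (a, b) REFERENCES x (a, b))"
        )
        try:
            # The referenced row exists, so this only fails if the
            # foreign key itself is not acceptable to the engine.
            conn.execute("INSERT INTO y VALUES ('1', '2')")
            return True
        except sqlite3.Error:
            return False
    finally:
        conn.close()


print(can_reference_a_b(with_unique=False))
print(can_reference_a_b(with_unique=True))
```

Without the extra UNIQUE (a, b) the insert into y is refused even though the matching row exists in x; with it, the same insert goes through.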
In my opinion this is one of the many flaws of SQL.
A superkey can contain non-unique identifiers as well as the primary key. But, as with a compound primary key, the combination must be unique.
A normalized dataset has tables that each hold data about one particular entity relevant to the purpose of the database. For example, a company with an employee database has a table of employees and might have another table of employee action history (promotions, reviews, salary adjustments, etc.).
A key is a candidate key if it qualifies to uniquely select a specific row in the table, such that the data in that row depend completely on that key and nothing else is required, and if no column can be dropped from it without losing that uniqueness.
Suppose the employee table belongs to a USA-based company. It might have two candidates: one might be an employee number, the other the employee's Social Security number (SSN). This could happen if the company requires employees to have SSNs.
OK, two candidates - employee number and SSN.
You must pick one to be the primary key, in which case the other becomes just another data field dependent on the choice of key.
In the USA, two factors would probably lead to the employee number being picked as the PK, leaving SSN as a secondary field. First, use of an SSN is legally restricted, because it is personally identifying information whose exposure enables identity theft. Second, a person might have applied for an SSN but not yet received it. So the SSN can't be relied upon to be present, which invalidates it as a candidate.
Therefore, the difference between primary keys and candidate keys is that the other candidates lost the race.
A SUPERKEY is essentially an overdetermined key, one that is guaranteed to be a unique selector - but that isn't the smallest it could be. In my employee table example, the combination of (employee number, SSN) would be a superkey. Note that the superkey will often include the primary key.
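A rough way to see the difference between superkeys and candidate keys is to list every unique column combination in a sample table and then keep only the minimal ones. This is just a sketch: the employee rows, the obviously fake SSN strings and the helper names are invented for illustration, and uniqueness in sample data is not the same as a guarantee made by the schema.

```python
from itertools import combinations


def is_unique(rows, columns):
    """True if the values of `columns` never repeat across `rows`."""
    keys = [tuple(row[c] for c in columns) for row in rows]
    return len(set(keys)) == len(keys)


def classify_keys(rows, all_columns):
    """Return (superkeys, candidate_keys) as lists of frozensets."""
    superkeys = [
        frozenset(cols)
        for size in range(1, len(all_columns) + 1)
        for cols in combinations(all_columns, size)
        if is_unique(rows, cols)
    ]
    # A candidate key is a superkey with no smaller superkey inside it.
    candidates = [
        key for key in superkeys
        if not any(other < key for other in superkeys)
    ]
    return superkeys, candidates


# Hypothetical employee table; the SSN values are placeholders, not real numbers.
employees = [
    {"emp_no": 1, "ssn": "000-00-0001", "name": "Ann"},
    {"emp_no": 2, "ssn": "000-00-0002", "name": "Bob"},
    {"emp_no": 3, "ssn": "000-00-0003", "name": "Ann"},
]

superkeys, candidates = classify_keys(employees, ["emp_no", "ssn", "name"])
print(len(superkeys), sorted(sorted(key) for key in candidates))
```

Here (emp_no, ssn) shows up among the superkeys but not among the candidates, because each of its two columns is already unique on its own.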
There is more to it than that. The definition speaks of a superkey as having the same cardinality as the whole record used as a key. (The whole record is the ultimate key, but totally impractical, since you have to know the whole record beforehand to use it as a key.)
"Cardinality" simply says, when I query a table on a single key value, what is the average number of returned records. For a proper primary key, this is always and only 1 for all values that exist in the table at all. (Obviously, it is zero for values that don't exist in the table.) Cardinality will be greater than one for non-unique keys. The usual example for the other end of the spectrum: for a table of size N, using "Gender" as a key will give you a cardinality of N/2. Remember, since it is the AVERAGE return size, the table's distribution doesn't have to be perfect.
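As a quick sketch of that arithmetic, the average number of rows returned per existing key value is the row count divided by the number of distinct values. The sample rows and the function name below are hypothetical, and the function assumes a non-empty table.

```python
def average_rows_per_value(rows, column):
    """Average number of rows matched when looking up one existing value of `column`."""
    distinct_values = {row[column] for row in rows}
    return len(rows) / len(distinct_values)  # assumes rows is not empty


# Hypothetical table of N = 6 rows with an uneven gender split (4 and 2).
people = [
    {"id": 1, "gender": "F"},
    {"id": 2, "gender": "F"},
    {"id": 3, "gender": "F"},
    {"id": 4, "gender": "F"},
    {"id": 5, "gender": "M"},
    {"id": 6, "gender": "M"},
]

print(average_rows_per_value(people, "id"))      # 1.0, a unique key
print(average_rows_per_value(people, "gender"))  # 3.0, which is N/2
```

The split is 4 to 2, yet the average is still N/2, which is the point about the distribution not having to be perfect.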
Hope that helps.