识别功能依赖性 II

发布于 2024-11-03 09:59:45 字数 847 浏览 6 评论 0原文

这是一个示例，应该可以澄清上一篇文章的内容。

在此处输入图像描述

雇佣日期& carReg 是主键。除了我已经确定的功能依赖项之外，是否还存在额外的功能依赖项 (FD) 以下？也欢迎修改：

fd1 carReg -> make, model, outletNo, outletLoc
fd2 custNo -> custName
fd3 outletNo -> outletLoc
fd4 model -> make (only if we assume a model name is unique to a make)
fd5 carReg, hireDate -> make, model, custNo, custName, outletNo, outletLoc

我不确定以上内容是否正确，但我确信还有更多。

基于Mike Sherrill Cat Recall的回答 ...我的问题是这样的： custName -> 怎么样？客户没有有效的FD？对于上述关系，当然，一个客户名称恰好映射到一个客户编号，但凭直觉，我们知道可以将多个 J SMith 添加到表中。如果是这种情况，则该 FD 无效，因为它形成 1..* 关系。我们真的可以说 custName ->客户不知道这个事实吗？我们是否仅仅将 FD 建立在样本数据的基础上？或者我们是否考虑到可以添加的可能值？

原文

Here is an example which should clear things up for the last post.

Enter image description here

hireDate & carReg are the primary keys. Are there extra functional dependencies (FDs) other than the ones I have identified
below? Modifications also welcome:

fd1 carReg -> make, model, outletNo, outletLoc
fd2 custNo -> custName
fd3 outletNo -> outletLoc
fd4 model -> make (only if we assume a model name is unique to a make)
fd5 carReg, hireDate -> make, model, custNo, custName, outletNo, outletLoc

I'm not sure if the above are correct and I am sure there are more.

Based on Mike Sherrill Cat Recall's answer... My question is this: How is custName -> custNo a valid FD? For the above relation, sure, a customer name maps onto exactly one customer number, but by intuition, we know more than one J SMith could be added to the table. If this is the case, this FD is void as it forms a 1..* relationship. Can we really say that custName -> custNo knowing this fact? Do we merely base FDs on the sample data? Or do we take into account the possible values that can be added?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

撞了怀 2024-11-10 09:59:45

一目了然。。。

custName -> custNo
model -> make
outletLoc -> outletNo
carReg, custNo -> hireDate
carReg, custName -> hireDate

我确信还有其他人。示例数据不具有代表性，当您尝试从数据确定函数依赖关系时，这就是一个问题。假设您的示例数据只有一行。

carReg    hireDate make  model  custNo  custName  outletNo  outletLoc
--
MS34 0GD  14/5/03  Ford  Focus  C100    Smith, J  01        Bearsden

FD 回答这一问题：“给定‘x’的一个值，我是否知道‘y’的一个且仅有一个值？”基于该单行样本数据集，每个属性都决定其他所有属性。 custNo 确定雇佣日期。雇佣日期决定了outletLoc。 custName 确定型号。

当样本数据不具有代表性时，很容易发现无效的 FD。您需要更具代表性的示例数据来清除一些无效的函数依赖性。

custName -> custNo isn't valid ('C101', 'Hen, P')
carReg, custNo -> hireDate isn't valid ('MS34 0GD', 'C100', '15/7/04')
carReg, custName -> hireDate isn't valid ('MS34 0GD', 'Hen, P', '15/8/03')

您可以使用 SQL 研究示例数据中的函数依赖性。

create table reg (
  CarReg char(8) not null,
  hireDate date not null,
  Make varchar(10) not null,
  model varchar(10) not null,
  custNo char(4) not null,
  custName varchar(10) not null,
  outletNo char(2) not null,
  outletLoc varchar(15) not null
);

insert into reg values
('MS34 OGD', '2003-05-14', 'Ford', 'Focus', 'C100', 'Smith, J', '01', 'Bearsden'),
('MS34 OGD', '2003-05-15', 'Ford', 'Focus', 'C201', 'Hen, P', '01', 'Bearsden'),
('NS34 TPR', '2003-05-16', 'Nissan', 'Sunny', 'C100', 'Smith, J', '01', 'Bearsden'),
('MH34 BRP', '2003-05-14', 'Ford', 'Ka', 'C313', 'Blatt, O', '02', 'Kelvinbridge'),
('MH34 BRP', '2003-05-20', 'Ford', 'Ka', 'C100', 'Smith, J', '02', 'Kelvinbridge'),
('MD51 OPQ', '2003-05-20', 'Nissan', 'Sunny', 'C295', 'Pen, T', '02', 'Kelvinbridge');

型号决定品牌吗？

select distinct model 
from reg
order by model;

model
--
Focus
Ka
Sunny

三种不同的型号。。。

select model, make
from reg
group by model, make
order by model;

model   make
--
Focus   Ford
Ka      Ford
Sunny   Nissan

是的。每个型号都有一个品牌。根据样本数据，模型->制作。

carReg、custName 是否 ->雇用日期？

select distinct carReg, custName
from reg
order by custName;

carReg
--
MH34 BRP  Blatt, O
MS34 OGD  Hen, P
MD51 OPQ  Pen, T
MS34 OGD  Smith, J
NS34 TPR  Smith, J
MH34 BRP  Smith, J

carReg 和 custName 的六种不同组合。

select carReg, custName, hireDate
from reg
group by carReg, custName, hireDate
order by custName;

carReg  custName  hireDate
--
MH34 BRP  Blatt, O  2003-05-14
MS34 OGD  Hen, P    2003-05-15
MD51 OPQ  Pen, T    2003-05-20
MH34 BRP  Smith, J  2003-05-20
NS34 TPR  Smith, J  2003-05-16
MS34 OGD  Smith, J  2003-05-14

是的。 carReg 和 custName 的每个组合都有一个雇佣日期。所以根据样本数据，{carReg, custName} ->雇用日期。

At a glance . . .

custName -> custNo
model -> make
outletLoc -> outletNo
carReg, custNo -> hireDate
carReg, custName -> hireDate

And I'm sure there are others. The sample data isn't representative, and that's a problem when you try to determine functional dependencies from data. Let's say your sample data had only one row.

carReg    hireDate make  model  custNo  custName  outletNo  outletLoc
--
MS34 0GD  14/5/03  Ford  Focus  C100    Smith, J  01        Bearsden

FDs answer the question, "Given one value for 'x', do I know one and only one value for 'y'?" Based on that one-row set of sample data, every attribute determines every other attribute. custNo determines hireDate. hireDate determines outletLoc. custName determines model.

When sample data isn't representative, it's easy to turn up FDs that aren't valid. You need more representative sample data to weed out some invalid functional dependencies.

custName -> custNo isn't valid ('C101', 'Hen, P')
carReg, custNo -> hireDate isn't valid ('MS34 0GD', 'C100', '15/7/04')
carReg, custName -> hireDate isn't valid ('MS34 0GD', 'Hen, P', '15/8/03')

You can investigate functional dependencies in sample data by using SQL.

create table reg (
  CarReg char(8) not null,
  hireDate date not null,
  Make varchar(10) not null,
  model varchar(10) not null,
  custNo char(4) not null,
  custName varchar(10) not null,
  outletNo char(2) not null,
  outletLoc varchar(15) not null
);

insert into reg values
('MS34 OGD', '2003-05-14', 'Ford', 'Focus', 'C100', 'Smith, J', '01', 'Bearsden'),
('MS34 OGD', '2003-05-15', 'Ford', 'Focus', 'C201', 'Hen, P', '01', 'Bearsden'),
('NS34 TPR', '2003-05-16', 'Nissan', 'Sunny', 'C100', 'Smith, J', '01', 'Bearsden'),
('MH34 BRP', '2003-05-14', 'Ford', 'Ka', 'C313', 'Blatt, O', '02', 'Kelvinbridge'),
('MH34 BRP', '2003-05-20', 'Ford', 'Ka', 'C100', 'Smith, J', '02', 'Kelvinbridge'),
('MD51 OPQ', '2003-05-20', 'Nissan', 'Sunny', 'C295', 'Pen, T', '02', 'Kelvinbridge');

Does model determine make?

select distinct model 
from reg
order by model;

model
--
Focus
Ka
Sunny

Three distinct models . . .

select model, make
from reg
group by model, make
order by model;

model   make
--
Focus   Ford
Ka      Ford
Sunny   Nissan

Yup. One make for each model. Based on the sample data, model -> make.

Does carReg, custName -> hireDate?

select distinct carReg, custName
from reg
order by custName;

carReg
--
MH34 BRP  Blatt, O
MS34 OGD  Hen, P
MD51 OPQ  Pen, T
MS34 OGD  Smith, J
NS34 TPR  Smith, J
MH34 BRP  Smith, J

Six distinct combinations of carReg and custName.

select carReg, custName, hireDate
from reg
group by carReg, custName, hireDate
order by custName;

carReg  custName  hireDate
--
MH34 BRP  Blatt, O  2003-05-14
MS34 OGD  Hen, P    2003-05-15
MD51 OPQ  Pen, T    2003-05-20
MH34 BRP  Smith, J  2003-05-20
NS34 TPR  Smith, J  2003-05-16
MS34 OGD  Smith, J  2003-05-14

Yup. One hireDate for each combination of carReg and custName. So based on the sample data, {carReg, custName} -> hireDate.

回复收藏 0 原文

似梦非梦 2024-11-10 09:59:45

好吧，既然你征求了第二个意见，我就给你一个。

第二种意见是第一种意见（CatCall 的）是完全正确的。

样本数据不足以识别/确定数据中的功能依赖性。识别/确定数据中的功能依赖性所需的是用户需求、数据库旨在支持的业务环境的描述/定义……

只有您的用户可以以一种或另一种方式告诉您应用了哪些功能依赖性。（不要将此解释为您应该告诉用户他们应该告诉您“适用的 FD 是什么”，因为您的用户通常不知道该术语的含义。但是，适用的 FD 是什么，可以仍然只能从用户向您提供的业务规范中得出。）

（相反，PS 样本数据可能确实足以证明某个给定的 FD 肯定不适用。但这不是您的问题。）

回复收藏 0 原文

我一直都在从未离去 2024-11-10 09:59:45

FD（函数依赖）表示关系值或变量的某个属性。我们可以说它对于给定关系值成立或不成立（满足或不满足）（正确或不正确）。当我们说它对关系变量成立或不成立时，我们的意思是它对应用程序中可能出现的变量的每个可能值都成立或不成立。

另外，如果我们给定一个值并且我们被告知它满足的FD是可以保存它的变量满足的FD那么根据该假设该变量的FD是值的 FD。（这有时被称为变量的“代表性数据”。）但是，如果我们只是给定一个变量可能出现的值，那么我们只知道

该值中不包含的 FD也不要在变量中
保存两者的琐碎 FD
（形式为 S -> S 的子集）
（无论值如何，都必须保留，仅基于属性）
（值和变量必须相同）

从我的答案到我做错了什么？（从表中查找FD）：

我们说FD（函数依赖）表达式S -> T 有一个
属性 S 的“决定性”集合和属性 S 的“确定”集合
属性 T。它表示 S 的给定子元组值出现在
给定的关系值或变量/模式始终具有相同的子元组
对于T的值。对于S-> {A} 我们可以说 S -> A.对于{A}->我们可以说 A
->
给定一个关系，我们说 FD“持有”它或“满足于”
它或其中“是真实的”或（草率地）“在其中”或（草率地）它“具有”
当 FD 所言属实时，即为 FD。每个 FD 都可以
使用关系值/变量/模式的属性来表示
要么持有，要么不持有。
我们可以找到所有的FD S ->通过检查保持关系的 T
属性集的每个子集作为 S，其中每个子集
属性为T。还有算法。其中 S 是超集的 FD
T 必须成立并被称为“平凡”。
我们可以找到所有的FD S ->通过检查保持关系的 A
属性集的每个子集作为 S，每个属性作为 A。
还有算法。（然后找到所有满足以下条件的FD：FDs S ->
{} 简单地持有 &是否S-> T 对于具有多个元素的 T 可以是
从FD中发现S-> A.)
以下是一些捷径：集合决定自身。如果S->然后每个
S 的超集决定 T 的每个子集。如果 S 不能决定 T
那么 S 的任何子集都不能确定 T 的任何超集。如果一个集合有
每个元组中值的不同子元组（即它是“唯一的”，即它
是一个超级键）（包括它是否是一个候选键）然后它确定
每一套。 {}-> T 当/当且仅当每个元组具有相同的 T 子元组值时。
给定一些成立的 FD，阿姆斯特朗公理生成所有满足以下条件的 FD：
还必须持有。后者被称为前者的“封闭”。一个
生成某个闭包的 FD 集合称为“覆盖”。一个
当从中删除任何 FD 时，覆盖是“最小的”或“不可简化的”
一套不是封面的套装。每个的最小/不可简化的覆盖
行列式唯一性是“规范的”。
通常我们不会被要求关闭所有持有 a 的 FD。
模式，我们被要求为它们提供一个规范的封面。一般来说如果
我们只知道一些 FD 保存在一个模式中，但我们不知道
它的闭包是所有持有的 FD。

假设没有给出表变量的每个可能的表值，则确定表变量的FD需要其含义/谓词&给出的业务规则。

请参阅我对识别功能依赖项 (FD)。

A FD (functional dependency) expresses a certain property of a relation value or variable. We can say that it holds for or doesn't hold for (is satisfied by or isn't satisfied by) (is true of or is not true of) a given relation value. When we say it holds or doesn't hold for a relation variable we mean it holds or doesn't hold for every possible value for the variable that can arise in an application.

Also if we are given a value and we are told that the FDs it satisfies are the FDs that a variable that could hold it satisfies then by that assumption the variable's FDs are the value's FDs. (This is sometimes called "representative data" for the variable.) But if we are just given a value that might arise for a variable then we only know that

the FDs that don't hold in the value also don't in the variable
the trivial FDs of both hold
(the ones of the form S -> subset of S)
(the ones that must hold regardless of the value, based only on the attributes)
(which must be the same for the value & the variable)

From my answer to What did I do wrong? (Find FD from table):

We say that a FD (functional dependency) expression S -> T has a
"determinant" set of attributes S and a "determined" set of
attributes T. It says that a given subtuple value for S appears in a
given relation value or variable/schema always with the same subtuple
value for T. For S -> {A} we can say S -> A. For {A} -> T we can say A
-> T.
Given a relation, we say that a FD "holds in" it or "is satisfied by"
it or "is true" in it or (sloppily) "is in" it or (sloppily) it "has"
a FD when what the FD says is true about it. Every FD that can be
expressed using attributes of a relation value/variable/schema will
either hold or not hold.
We can find all the FDs S -> T that hold in a relation by checking
every subset of the set of attributes as S with every subset of
attributes as T. There are also algorithms. FDs where S is a superset
of T must hold and are called "trivial".
We can find all the FDs S -> A that hold in a relation by checking
every subset of the set of attributes as S with every attribute as A.
There are also algorithms. (Then to find all FDs that hold: FDs S ->
{} hold trivially & whether S -> T for T with multiple elements can be
found from the FDs S -> A.)
Here are some shortcuts: A set determines itself. If S -> T then every
superset of S determines every subset of T. If S doesn't determine T
then no subset of S determines any superset of T. If a set has a
different subtuple of values in every tuple (ie it is "unique", ie it
is a superkey) (including if it is a candidate key) then it determines
every set. {} -> T when/iff every tuple has the same T subtuple value.
Given some FDs that hold, Armstrong's axioms generate all FDs that
must also hold. The latter is called the "closure" of the former. A
set of FDs that generates a certain closure is called a "cover". A
cover is "minimal" or "irreducible" when removing any FD from it gives
a set that is not a cover. A minimal/irreducible cover with every
determinant unique is "canonical".
Usually we are not asked to give a closure for all FDs that hold in a
schema, we are asked to give a canonical cover for them. In general if
we only know some FDs that hold in a schema then we don't know that
its closure is all the FDs that hold.

Assuming not every possible table value for a table variable is given, determining FDs for a table variable requires its meaning/predicate & the business rules to be given.

See my answer to Identifying functional dependencies (FDs).

回复收藏 0 原文