数据库设计:解释这个模式
完全披露...在这里狂热地尝试了解有关数据库的更多信息,因此我投入了时间,也尝试从源头获得这个答案,但无济于事。
来自databaseanswers 的Barry Williams 发布了此模式。
我试图理解此架构中地址表的分割。我很清楚,地址表包含给定地址的详细信息。 Client_Addresses 和 Staff_Addresses 表让我着迷。
1)我理解主外键的使用,如图所示,但我假设当使用这些外键时,您在同一个表中没有驻留主键(在本例中为 date_address_from )。有人可以解释两者的原因并用语言表达它实际上是如何运作的吗?
2)为什么要使用 date_address_from 作为主键而不是像 client_address_id 这样的东西作为主键?如果有人一天输入两个地址,他的设计会不会有冲突呢?如果是或不是,又怎样?
3)沿着规范化的思路...由于在 Client_Addresses 和 Staff_Addresses 表中 date_address_from 和 date_address_to 是相同的,这些字段是否应该不包含在主 Address 表中?
Full disclosure...Trying feverishly here to learn more about databases so I am putting in the time and also tried to get this answer from the source to no avail.
Barry Williams from databaseanswers has this schema posted.
I am trying to understand the split of address tables in this schema. Its clear to me that the Addresses table contains the details of a given address. The Client_Addresses and Staff_Addresses tables are what gets me.
1) I understand the use of Primary Foreign Keys as shown but I was under the assumption that when these are used you don't have a resident Primary Key in that same table (date_address_from in this case). Can someone explain the reasoning for both and put it into words how this actually works out?
2) Why would you use date_address_from as the primary key instead of something like client_address_id as the PK? What if someone enters two addresses in one day would there be conflicts in his design? If so or if not, what?
3) Along the lines of normalization...Since both date_address_from and date_address_to are the same in the Client_Addresses and Staff_Addresses table should those fields just not be included in the main Address table?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
评估
首先是审核,然后是具体答案。
这不是数据模型。这不是数据库。这是一桶鱼,每条鱼都画成矩形,一条鱼的鱼鳍卡在另一条鱼的鳃里,有一条线。存在大量重复,以及大量缺失元素。它完全不值得用作学习任何有关数据库设计的示例。
根本没有标准化;这些文件非常不完整(请参阅迈克的回答,还有一百多个类似的问题)。
other_details
和eg.s
让我大吃一惊。每个元素都需要被识别和存储:StreetNo、ApartmentNo、StreetName、StreetType
等,而不是line_1_number_street
,它是一个组。客户和员工应标准化为人员表,并标识所有元素。
是的,如果客户可以是个人或组织,则需要超类型-子类型结构来正确支持这一点。
所以这实际上是什么,技术上准确的术语,是一堆平面文件,带有字段组的描述。与数据库或关系数据库相距数光年。还没有准备好进行评估或检查,更不用说构建一些东西了。在关系数据模型中,这将是大约 35 个规范化表,没有重复的列。
Barry 在网络上拥有(等等)超过 500 个“模式”。当您尝试使用第二个“模式”时,您会发现(a)它们在用途和目的方面完全不同(b)它们之间没有共同点(c)假设两者都有一个客户文件;它们将是不同形式的客户档案。
他需要首先规范化整个单一“模式”,
然后在 500 个部分或主题领域中呈现单一标准化数据模型。
我已就此写信给他。没有回应。
还需要注意的是,他使用了一些无法识别的图表约定。这些漂亮有趣的图片的问题在于它们传达了一些事物,但它们没有传达关于数据库或设计的重要事物。学习者感到困惑并不奇怪。经验丰富的数据库专业人员并不清楚。关系数据库建模和数据模型中的符号有一个标准是有原因的:它们传达了设计的所有细节和微妙之处。
巴里还没有读过很多内容:命名约定;关系;基数;等等,太多了,无法列出。
网络上充斥着垃圾,任何人都可以“发布”。那里有数以百万计的好看或难看的“设计”,它们不值得一看。或者更糟糕的是,如果你看的话,你会学到完全错误的“设计”方法。在学习数据库和数据库设计方面,最好找有资格、有能力的人并向他们学习。
回答
他正在使用复合键,但没有拼写出来。
client_addresses
的 PK 为client_id
、address_id、date_address_from)
。这不是一个坏钥匙,显然他希望永远记录地址。other_details
放回客户和员工文件中,并删除三个除了占用磁盘空间之外毫无用处的文件。李>您正在考虑关联表,它解决数据库中的多对多关系。是的,这些列仅是两个父表的主键。这些不是关联表或文件;而是关联表或文件。它们包含数据字段。
这不是PK,这是PK的第三个要素。
一个人在一天之内在多个地址注册的想法是不合理的;只数他们睡得最多的一个地址。
其他人已经回答了这个问题。
不要期望在此图中识别任何数据库或设计或规范化的证据。
Evaluation
First an Audit, then the specific answers.
This is not a Data Model. This is not a Database. It is a bucket of fish, with each fish drawn as a rectangle, and where the fins of one fish are caught in the the gills of another, there is a line. There are masses of duplication, as well as masses of missing elements. It is completely unworthy of using as an example to learn anything about database design from.
There is no Normalisation at all; the files are very incomplete (see Mike's answer, there are a hundred more problem like that). The
other_details
andeg.s
crack me up. Each element needs to be identified and stored:StreetNo, ApartmentNo, StreetName, StreetType
, etc. notline_1_number_street
, which is a group.Customer and Staff should be normalised into a Person table, with all the elements identified.
And yes, if Customer can be either a Person or an Organisation, then a supertype-subtype structure is required to support that correctly.
So what this really is, the technically accurate terms, is a bunch of flat files, with descriptions for groups of fields. Light years distant from a database or a relational one. Not ready for evaluation or inspection, let alone building something with. In a Relational Data Model, that would be approximately 35 normalised tables, with no duplicated columns.
Barry has (wait for it) over 500 "schemas" on the web. The moment you try to use a second "schema", you will find that (a) they are completely different in terms of use and purpose (b) there is no commonality between them (c) let's say there was a customer file in both; they would be different forms of customer files.
He needs to Normalise the entire single "schema" first,
then present the single normlaised data model in 500 sections or subject areas.
I have written to him about it. No response.
It is important to note also, that he has used some unrecognisable diagramming convention. The problem with these nice interesting pictures is that they convey some things but they do not convey the important things about a database or a design. It is no surprise that a learner is confused; it is not clear to experienced database professionals. There is a reason why there is a standard for modelling Relational databases, and for the notation in Data Models: they convey all the details and subtleties of the design.
There is a lot that Barry has not read about yet: naming conventions; relations; cardinality; etc, too many to list.
The web is full of rubbish, anyone can "publish". There are millions of good- and bad-looking "designs" out there, that are not worth looking at. Or worse, if you look, you will learn completely incorrect methods of "design". In terms of learning about databases and database design, you are best advised to find someone qualified, with demonstrated capability, and learn from them.
Answer
He is using composite keys without spelling it out. The PK for
client_addresses
isclient_id
,address_id, date_address_from)
. That is not a bad key, evidently he expects to record addresses forever.other_details
, and remove three files that serve absolutely no purpose other than occupying disk space.You are thinking about Associative Tables, which resolve the many-to-many relations in Databases. Yes, there, the columns are only the PKs of the two parent tables. These are not Associative Tables or files; they contain data fields.
It is not the PK, it is the third element of the PK.
The notion of a person being registered at more than one address in a single day is not reasonable; just count the one address they slept the most at.
Others have answered that.
Do not expect to identify any evidence of databases or design or Normalisation in this diagram.
1) 在每个表中,主键都是由三个属性组成的复合键:(staff_id、address_id、date_address_from) 和 (client_id、address_id、date_address_from)。这大概意味着客户/员工到地址的映射预计会随着时间的推移而改变,并且这些改变的历史记录会被保留。
2) 没有明显的理由在这些表中创建新的“id”属性。复合键足以完成这项工作。为什么您要在同一日期为同一客户创建两次同一地址?如果您这样做了,那么这可能是修改设计的原因,但这似乎是一个不太可能的要求。
3) 不。明显的目的是它们是将地址映射到客户/员工的适用日期——而不是仅适用于地址的日期。
1) In each of those tables the primary key is a compound key consisting of three attributes: (staff_id, address_id, date_address_from) and (client_id, address_id, date_address_from). This presumably means that the mapping of clients/staff to addresses is expected to change over time and that the history of those changes is preserved.
2) There's no obvious reason to create a new "id" attribute in those tables. The compound key does the job adequately. Why would you want to create the same address twice for the same client on the same date? If you did then that might be a reason to modify the design but that seems like an unlikely requirement.
3) No. The apparent purpose is that they are the applicable dates for the mapping of address to client/staff - not dates applicable to the address alone.
不,但你确实发现了一个问题。
设计师认为客户和员工是完全不同的两件事。我所说的“完全不同”是指它们没有共同的属性。
这不是真的,是吗?客户和员工都有地址。我确信他们中的大多数人也有电话。
想象一下,员工也是客户。该人的名字存储在多少个位置?那个人的地址?你能听到罗杰斯先生在后台说:“你能拼写‘更新异常’吗?……我知道你能。”
问题在于设计师将客户和员工视为不同类型的人。他们不是。 “客户”描述了服务提供商(通常不是零售商)和客户(可以是个人或公司)之间的业务关系。 “员工”描述的是公司与个人之间的雇佣关系。不是不同种类的人,而是不同种类的关系。
你能看出如何解决这个问题吗?
No. But you did find a problem.
The designer has decided that clients and staff are two utterly different things. By "utterly different", I mean they have no attributes in common.
That's not true, is it? Both clients and staff have addresses. I'm sure most of them have telephones, too.
Imagine that someone on staff is also a client. How many places is that person's name stored? That person's address? Can you hear Mr. Rogers in the background saying, "Can you spell 'update anomaly'? . . . I knew you could."
The problem is that the designer was thinking of clients and staff as different kinds of people. They're not. "Client" describes a business relationship between a service provider (usually, that is, not a retailer) and a customer, which might be either a person or a company. "Staff" describes a employment relationship between a company and a person. Not different kinds of people--different kinds of relationships.
Can you see how to fix that?
这 2 个额外的表使您能够拥有每个人的地址历史记录。
您可以将它们都放在一张表中,但由于员工和客户是分开的,因此最好也将它们分开(b/c 客户 id =1 和员工 id =1 不能在同一个地址表上使用) 。
设计问题没有“单一”解决方案,您可以使用 1 人表,然后在员工和客户之间添加一列。但主要思想是数据库应该清晰、可读、高效,而不是保存表。
大约 2 - pk 是组合的,包括 clientID、AddressID 和 from。
因此,如果有人在美国生活了 6 个月,然后在以色列生活了 6 个月,然后回到美国,到达相同的地址 - 您只需要地址表中的 2 个地址,以及 client_address 中的 3 个地址。
将 from_Date 作为密钥的一部分的想法是正确的,尽管它不能保证数据完整性 - 因为您还需要手动检查同一个人的记录之间没有重叠的日期。
大约 3 - 否(看 2)。
This 2 extra tables enables you to have address history per one person.
You can have them both in one table, but since staff and client are separated, it is better to separate them as well (b/c client id =1 and staff id =1 can't be used on the same table of address).
there is no "single" solution to a design problem, you can use 1 person table and then add a column to different between staff and client. BUT The major Idea is that the DB should be clear, readable and efficient, and not to save tables.
about 2 - the pk is combined, both clientID, AddressID and from.
so if someone lives 6 month in the states, then 6 month in Israel, and then back to the states, to the same address - you need only 2 address in address table, and 3 in the client_address.
The idea of heaving the from_Date as part of the key is right, although it doesn't guaranty data integrity - as you also need manually to check that there isn't overlapping dates between records of the same person.
about 3 - no (look at 2).
从数据模型来看,我认为:
1)PF表示该字段既是表主键的一部分,又是与其他表的外键的一部分。
2) 同样,Staff_Addresses 的主键是 {staff_id,address_id,date_adderess_from} 而不仅仅是 date_adderess_from
3) 与 2) 相同
Viewing the data model, i think:
1) PF means that the field is both part of the primary key of the table and foreign key with other table.
2) In the same way, the primary key of Staff_Addresses is {staff_id,address_id,date_adderess_from} not just date_adderess_from
3) The same that 2)
参考 Staff_Addresses 表,date_address_from 上的主键基本上可以防止多次输入具有相同 Staff_id/address_id 的记录。现在,我不是 DBA,但出于性能原因/更快的索引,我喜欢我的 PK 是整数或 guid。如果我要这样做,我会创建一个新列,例如 Staff_Address_Id 并将其设为 PK 列,并对 Staff_id/address_id/date_address_from 设置唯一约束。
至于你最后关心的问题,地址表实际上是一个通用的地址存储结构。它不应该关心某人居住在那里的日期范围。最好留给地址的特定实现,例如客户/员工地址。
希望这会有所帮助。
In reference to Staff_Addresses table, the Primary Key on date_address_from basically prevents a record with the same staff_id/address_id entered more than once. Now, i'm no DBA, but i like my PKs to be integers or guids for performance reasons/faster indexing. If i were to do this i would make a new column, say, Staff_Address_Id and make it the PK column and put a unique constraint on staff_id/address_id/date_address_from.
As for your last concern, Addresses table is really a generic address storage structure. It shouldn't care about date ranges during which someone resided there. It's better to be left to specific implementations of an address such as Client/Staff addresses.
Hope this helps a little.