SQL Server 2005 database design: many-to-many hierarchical relationships
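Note
I have completely rewritten the original post to better explain the problem I am trying to understand. I have tried to generalise the problem as much as possible.
I would also like to thank the people who replied originally. Hopefully this post makes things a little clearer.
Context
In short, I am struggling to understand the best way to design a small database that handles (what I believe are) multiple many-to-many relationships.
Imagine the following scenario for a company's organisational structure: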
         Textile Division                      Marketing Division
                |                                       |
     ----------------------                  ----------------------
     |                    |                  |                    |
  HR Dept            Finance Dept         HR Dept            Finance Dept
     |                    |                  |                    |
 ----------           ----------         ----------           ----------
 |        |           |        |         |        |           |        |
Payroll  Hiring     Audit     Tax      Payroll  Hiring     Audit   Accounts
 |        |           |        |         |        |           |        |
Emps     Emps        Emps     Emps      Emps     Emps        Emps     Emps
Note: Emps denotes the list of employees who work in that area.
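When I first started on this problem, I created four separate tables:
- Divisions -> Textile, Marketing (PK = DivisionID)
- Departments -> HR, Finance (PK = DepartmentID)
- Functions -> Payroll, Hiring, Audit, Tax, Accounts (PK = FunctionID)
- Employees -> list of all employees (PK = EmployeeID)
The problem as I see it is that there are multiple many-to-many relationships, i.e. many departments have many divisions and many functions have many departments.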
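Question
Given the database structure above, suppose I want to do the following:
- Get all employees who work in the Payroll function of the Marketing division
To do this I need to be able to tell the two Payroll functions apart, but I am not sure how to do that.
I understand that I could build a link/junction table between Departments and Functions so that I can retrieve which functions belong to which departments. However, I would still need to distinguish which division they belong to.
Research effort
As you can see, I am a beginner when it comes to database design. I have spent two days researching this, going through nested set models and adjacency models, reading that this problem is known not to be NP-complete, and so on. Surely there is a simple solution?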
Based on the updated post, and making some (fairly obvious) assumptions based on the names used, I come up with the following. There are four entities: divisions, departments, functions, and employees.
There are many relationships between these entities. Few of them are hierarchical; most are simple associations:
Option A2: Functions are “owned” by departments. No function can be performed by two or more departments. (This appears to be the case, as the HR Dept has Payroll and Hiring, and the Finance Dept has Audit, Tax, and Accounts.)
Functions are performed by departments for (on behalf of) divisions. (HR Dept does Payroll and Hiring for both Textile and Marketing divisions; Finance Dept does Audit and Tax--but not Accounts--for Textile division, and Audit and Accounts--but not Tax--for Marketing division.) Perhaps a bit more precisely, departments perform selected functions for selected divisions that they are associated with, and that association is defined by their performance of that function.
Beyond performing the work of functions, there appears to be no relationship between departments and divisions. There is no hierarchical relationship between them, as one does not “own” or contain the other.
This leads to these roughly sketched out tables:
(You could optionally include a surrogate key to uniquely identify each row, but DivisionId + FunctionId would work.)
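As a rough illustration of the tables described above, here is a minimal sketch. It assumes Option A2 (each function is owned by exactly one department) and uses Python's built-in sqlite3 module purely so the example can be run as-is; the DDL would need small adjustments for SQL Server 2005. The column names, the sample rows, and the helper `functions_for_division` are assumptions for illustration, not something given in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Divisions   (DivisionId   INTEGER PRIMARY KEY, Name TEXT NOT NULL UNIQUE);
CREATE TABLE Departments (DepartmentId INTEGER PRIMARY KEY, Name TEXT NOT NULL UNIQUE);

-- Option A2: every function is owned by exactly one department.
CREATE TABLE Functions (
    FunctionId   INTEGER PRIMARY KEY,
    Name         TEXT NOT NULL UNIQUE,
    DepartmentId INTEGER NOT NULL REFERENCES Departments(DepartmentId)
);

-- Which functions are performed for which divisions.
CREATE TABLE DivisionFunctions (
    DivisionId INTEGER NOT NULL REFERENCES Divisions(DivisionId),
    FunctionId INTEGER NOT NULL REFERENCES Functions(FunctionId),
    PRIMARY KEY (DivisionId, FunctionId)
);

INSERT INTO Divisions   VALUES (1, 'Textile'), (2, 'Marketing');
INSERT INTO Departments VALUES (1, 'HR'), (2, 'Finance');
INSERT INTO Functions   VALUES (1, 'Payroll', 1), (2, 'Hiring', 1),
                               (3, 'Audit', 2), (4, 'Tax', 2), (5, 'Accounts', 2);
-- Textile gets Payroll, Hiring, Audit, Tax; Marketing gets Payroll, Hiring, Audit, Accounts.
INSERT INTO DivisionFunctions VALUES (1, 1), (1, 2), (1, 3), (1, 4),
                                     (2, 1), (2, 2), (2, 3), (2, 5);
""")

def functions_for_division(division_name):
    """Return (department, function) pairs performed for the given division."""
    return conn.execute("""
        SELECT d.Name, f.Name
        FROM DivisionFunctions df
        JOIN Divisions   v ON v.DivisionId   = df.DivisionId
        JOIN Functions   f ON f.FunctionId   = df.FunctionId
        JOIN Departments d ON d.DepartmentId = f.DepartmentId
        WHERE v.Name = ?
        ORDER BY d.Name, f.Name
    """, (division_name,)).fetchall()

print(functions_for_division("Marketing"))
```

Note that the department never appears in DivisionFunctions: it is reached through the function that owns it, which is what keeps the key down to two columns.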
There isn’t enough material here to fully describe how "employees" fit into the model. Given that employees do the work of functions: can an employee do the work of more than one function, or do they only do the one? Does an employee do the work of the function regardless of the division(s) it is being done for, or are they assigned to do the work for one or more divisions? Two obvious options here, though more complex variants are possible:
Given these, tables might look like:
... and thus all employees that can perform a function will perform it for all divisions requiring it. Or,
(Or, instead of DivisionId and FunctionId, include the optional surrogate key from DivisionFunctions.) ... and thus employees are assigned individually to functions to be performed by the department for a division.
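To make the second option concrete, here is a hedged sketch in which employees are assigned to a function for a specific division. It uses Python's sqlite3 module only so that it runs as written; the table name `EmployeeDivisionFunctions`, the sample employees, and the helper `employees_in` are made up for illustration. For the first option, the same table would simply drop `DivisionId` and reference Functions directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Divisions (DivisionId INTEGER PRIMARY KEY, Name TEXT NOT NULL UNIQUE);
CREATE TABLE Functions (FunctionId INTEGER PRIMARY KEY, Name TEXT NOT NULL UNIQUE);
CREATE TABLE Employees (EmployeeId INTEGER PRIMARY KEY, Name TEXT NOT NULL);

CREATE TABLE DivisionFunctions (
    DivisionId INTEGER NOT NULL REFERENCES Divisions(DivisionId),
    FunctionId INTEGER NOT NULL REFERENCES Functions(FunctionId),
    PRIMARY KEY (DivisionId, FunctionId)
);

-- An employee is assigned to a function as performed for one division.
-- The composite foreign key only allows combinations that exist in DivisionFunctions.
CREATE TABLE EmployeeDivisionFunctions (
    EmployeeId INTEGER NOT NULL REFERENCES Employees(EmployeeId),
    DivisionId INTEGER NOT NULL,
    FunctionId INTEGER NOT NULL,
    PRIMARY KEY (EmployeeId, DivisionId, FunctionId),
    FOREIGN KEY (DivisionId, FunctionId)
        REFERENCES DivisionFunctions(DivisionId, FunctionId)
);

INSERT INTO Divisions VALUES (1, 'Textile'), (2, 'Marketing');
INSERT INTO Functions VALUES (1, 'Payroll'), (2, 'Hiring');
INSERT INTO Employees VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO DivisionFunctions VALUES (1, 1), (1, 2), (2, 1), (2, 2);
-- Ann: Marketing/Payroll; Bob: Textile/Payroll; Cy: Marketing/Payroll and Marketing/Hiring.
INSERT INTO EmployeeDivisionFunctions VALUES (1, 2, 1), (2, 1, 1), (3, 2, 1), (3, 2, 2);
""")

def employees_in(division_name, function_name):
    """Names of employees assigned to the given function for the given division."""
    rows = conn.execute("""
        SELECT e.Name
        FROM EmployeeDivisionFunctions edf
        JOIN Employees e ON e.EmployeeId = edf.EmployeeId
        JOIN Divisions v ON v.DivisionId = edf.DivisionId
        JOIN Functions f ON f.FunctionId = edf.FunctionId
        WHERE v.Name = ? AND f.Name = ?
        ORDER BY e.Name
    """, (division_name, function_name)).fetchall()
    return [name for (name,) in rows]

print(employees_in("Marketing", "Payroll"))
```

With this sample data the query separates the two Payroll groups: Ann and Cy for Marketing, Bob for Textile.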
But that still leaves a lot of “what if/when” questions: Do employees “belong to” departments? Can employees belong to (work for) multiple departments? Perhaps employees belong to divisions? Do you track what functions an employee can do, even if they are not currently doing it? Similarly, do you track what department an employee works for, even if they are currently “between functions”? If an employee can perform functions A and B, and a division requires both these functions, might an employee be assigned to only perform A and not B for that division?
There’s more requirements research to be done here, but I’d like to think this is a good start.
Well, you wouldn't put it all into one table. You need to read up on normalizing data and joins. (And never store anything in a comma-delimited list.)
No database worth its salt would have the slightest problem handling a million records; that is a tiny database.
You need tables for functions, courses, locations, people, organization and possibly some joining tables to accommodate many-to-many relationships. But none of this is hard or even beyond very basic design. I recommend that before you do anything, you get a book on your chosen database and read up on the basics.
As you are "abecedarian" :), one thing to do before any attempt to feel at home with database design is to read about normalization and to completely understand all normal forms up to 5NF.
If you want to model that
1. departments are in divisions
2. functions are performed in departments
3. employees perform functions
and that not all functions are performed in all of the departments, nor are all the departments in all divisions, then you have to store that fact somewhere.
While doing logical design, give your tables descriptive names, so some departments are in divisions
then you have some functions in some departments
then employees have some functions from some departments and divisions
After this (or before it) you have 3 more entities: functions, departments and divisions. These list all the possible departments, divisions and functions, and are referenced by the tables above (this might not be completely normalized).
Also, the names of the entities (tables) can become something more appropriate to you (only you can know the full semantics of the model of your data), especially if you notice that you need to assign other attributes (fields) to them.
The values for departments, divisions and functions are their names; there are no artificial IDs yet in the above analysis. You can introduce them in the next step (after the logical modelling comes physical modelling), or you can keep the natural keys. Going with artificial keys can cut down the usage of composite keys to max 2, but it does obfuscate the relationships and the meaning of the facts that you are storing in your tables.
(Example: functionID can be an ID of a function name, or an ID of a function that is performed in a certain division/department combination. It is not clear which it is, and these are not interchangeable; sort of like the difference between an instance and a class.)
You need a simple star relationship. The Position (fact table) has just the IDs of the related master tables (Department, Division etc.). This allows any combination of the master tables to be used.
The master tables can have a simple hierarchy built into each of them as needed, and can relate to each other as needed. But the detail of this does not affect the queries against Position.
You can make the IDs in Position nullable for optional relationships.
You could add StartDate and EndDate columns to Position to track changes over time.
A simple example of this is:
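A possible sketch of such a Position table follows. It is written against sqlite3 so that it runs as-is; the master table `JobFunction` (named that way to avoid the reserved word FUNCTION in SQL Server), the sample rows, and the helper `employees_in` are assumptions for illustration. A NULL `EndDate` is taken to mean the position is current.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Master tables.
CREATE TABLE Division    (DivisionID   INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Department  (DepartmentID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE JobFunction (FunctionID   INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Employee    (EmployeeID   INTEGER PRIMARY KEY, Name TEXT NOT NULL);

-- Fact table: one row per position held, holding only IDs of the master tables.
CREATE TABLE Position (
    PositionID   INTEGER PRIMARY KEY,
    EmployeeID   INTEGER NOT NULL REFERENCES Employee(EmployeeID),
    DivisionID   INTEGER REFERENCES Division(DivisionID),      -- nullable: optional
    DepartmentID INTEGER REFERENCES Department(DepartmentID),  -- nullable: optional
    FunctionID   INTEGER REFERENCES JobFunction(FunctionID),   -- nullable: optional
    StartDate    TEXT NOT NULL,
    EndDate      TEXT                                          -- NULL = still current
);

INSERT INTO Division    VALUES (1, 'Textile'), (2, 'Marketing');
INSERT INTO Department  VALUES (1, 'HR'), (2, 'Finance');
INSERT INTO JobFunction VALUES (1, 'Payroll'), (2, 'Audit');
INSERT INTO Employee    VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO Position VALUES
    (1, 1, 2, 1, 1, '2009-01-01', NULL),          -- Ann: Marketing/HR/Payroll, current
    (2, 2, 1, 1, 1, '2009-01-01', NULL),          -- Bob: Textile/HR/Payroll, current
    (3, 3, 2, 1, 1, '2008-01-01', '2009-06-30'),  -- Cy: Marketing/HR/Payroll, ended
    (4, 3, 2, 2, 2, '2009-07-01', NULL);          -- Cy: Marketing/Finance/Audit, current
""")

def employees_in(division_name, function_name, current_only=True):
    """Employees holding (or having held) the function in the division."""
    rows = conn.execute("""
        SELECT DISTINCT e.Name
        FROM Position p
        JOIN Employee    e ON e.EmployeeID = p.EmployeeID
        JOIN Division    v ON v.DivisionID = p.DivisionID
        JOIN JobFunction f ON f.FunctionID = p.FunctionID
        WHERE v.Name = ? AND f.Name = ?
          AND (? = 0 OR p.EndDate IS NULL)
        ORDER BY e.Name
    """, (division_name, function_name, 1 if current_only else 0)).fetchall()
    return [name for (name,) in rows]

print(employees_in("Marketing", "Payroll"))
print(employees_in("Marketing", "Payroll", current_only=False))
```

One trade-off worth noting: nothing in this sketch stops a Position row from combining, say, Marketing with Tax. If invalid combinations must be rejected, that rule has to be enforced separately.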
Try giving each entity a table of its own, e.g.
And then some sample selects
Usually when I am setting up a db, I come up with what entities I need and how they are related to each other (i.e. many-to-one, one-to-one, ...), which you seem to have done. Next I figure out what each entity will need. For example, Location may have: locationid, address, ... Then for Divisions, assuming there is one location for many divisions, you could have the division entity hold a divisionid, a locationid, and the information each division needs. So basically, if it's a one-to-many relationship like one location to many divisions, you could just put the id of the location in the division table. However, if it is a many-to-many relation, it is probably better to have an intermediary table to connect the two so you do not need duplicate records with only an id changing.
Perhaps (probably) you should consider the HR department of the Textile division as a different department than the HR department of the Marketing division.
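If the two HR departments really are treated as different departments, the many-to-many relationships disappear and the model becomes a plain hierarchy. A minimal sketch under that assumption is below (sqlite3 is used only to keep it runnable; the column names, the sample employees, and the rule that an employee has exactly one function are assumptions for illustration).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Divisions (
    DivisionID INTEGER PRIMARY KEY,
    Name       TEXT NOT NULL UNIQUE
);
-- Each department row belongs to exactly one division, so "HR" appears twice.
CREATE TABLE Departments (
    DepartmentID INTEGER PRIMARY KEY,
    DivisionID   INTEGER NOT NULL REFERENCES Divisions(DivisionID),
    Name         TEXT NOT NULL,
    UNIQUE (DivisionID, Name)
);
-- Each function row belongs to exactly one department, so "Payroll" appears twice.
CREATE TABLE Functions (
    FunctionID   INTEGER PRIMARY KEY,
    DepartmentID INTEGER NOT NULL REFERENCES Departments(DepartmentID),
    Name         TEXT NOT NULL,
    UNIQUE (DepartmentID, Name)
);
-- Assumption: an employee works in exactly one function.
CREATE TABLE Employees (
    EmployeeID INTEGER PRIMARY KEY,
    FunctionID INTEGER NOT NULL REFERENCES Functions(FunctionID),
    Name       TEXT NOT NULL
);

INSERT INTO Divisions   VALUES (1, 'Textile'), (2, 'Marketing');
INSERT INTO Departments VALUES (1, 1, 'HR'), (2, 1, 'Finance'),
                               (3, 2, 'HR'), (4, 2, 'Finance');
INSERT INTO Functions   VALUES (1, 1, 'Payroll'), (2, 1, 'Hiring'),
                               (3, 2, 'Audit'),   (4, 2, 'Tax'),
                               (5, 3, 'Payroll'), (6, 3, 'Hiring'),
                               (7, 4, 'Audit'),   (8, 4, 'Accounts');
INSERT INTO Employees   VALUES (1, 5, 'Ann'), (2, 1, 'Bob'), (3, 5, 'Cy'), (4, 7, 'Dee');
""")

def employees_in(division_name, function_name):
    """Walk the hierarchy: employee -> function -> department -> division."""
    rows = conn.execute("""
        SELECT e.Name
        FROM Employees e
        JOIN Functions   f ON f.FunctionID   = e.FunctionID
        JOIN Departments d ON d.DepartmentID = f.DepartmentID
        JOIN Divisions   v ON v.DivisionID   = d.DivisionID
        WHERE v.Name = ? AND f.Name = ?
        ORDER BY e.Name
    """, (division_name, function_name)).fetchall()
    return [name for (name,) in rows]

print(employees_in("Marketing", "Payroll"))
```

The price is that names such as HR and Payroll are stored once per division, so anything shared between the two HR departments would have to be kept in sync by hand.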