How would you avoid data conflicts in this database schema?
I'm working on a multi-user, database-driven website with SQL Server 2008 / LinqToSQL / custom-made repositories as the DAL. I have run into a normalization problem that can leave the database in an inconsistent state if someone exploits it, and I am wondering how to deal with it.
The problem: Several different companies have access to my website. They should be able to track their Projects and Clients at my website. Some (but not all) of the projects should be assignable to clients.
This results in the following database schema:
**Companies:**
ID
CompanyName
**Clients:**
ID
CompanyID (not nullable)
FirstName
LastName
**Projects:**
ID
CompanyID (not nullable)
ClientID (nullable)
ProjectName
This leads to the following relationships:
Companies-Clients (1:n)
Companies-Projects (1:n)
Clients-Projects (1:n)
Now, if a user is malicious, he might for example insert a Project with his own CompanyID, but with a ClientID belonging to another user, leaving the database in an inconsistent state.
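To make the scenario concrete, here is a minimal sketch of the schema and the problematic insert. It uses Python's built-in `sqlite3` module with an in-memory SQLite database as a stand-in for SQL Server 2008, so the column types are assumptions; the table and column names follow the schema above, and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE Companies (ID INTEGER PRIMARY KEY, CompanyName TEXT NOT NULL);
CREATE TABLE Clients (
    ID INTEGER PRIMARY KEY,
    CompanyID INTEGER NOT NULL REFERENCES Companies(ID),
    FirstName TEXT,
    LastName TEXT
);
CREATE TABLE Projects (
    ID INTEGER PRIMARY KEY,
    CompanyID INTEGER NOT NULL REFERENCES Companies(ID),
    ClientID INTEGER REFERENCES Clients(ID),
    ProjectName TEXT
);
INSERT INTO Companies VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO Clients VALUES (10, 2, 'Jane', 'Doe');  -- a client of company 2
""")

# Company 1 attaches a project to company 2's client. Each foreign key is
# valid on its own, so the database accepts the row.
with conn:
    conn.execute("INSERT INTO Projects VALUES (100, 1, 10, 'Hijacked')")

# Find projects whose client belongs to a different company than the project.
mismatched = conn.execute("""
    SELECT p.ID
    FROM Projects p
    JOIN Clients c ON c.ID = p.ClientID
    WHERE c.CompanyID <> p.CompanyID
""").fetchall()
```

The two single-column foreign keys each pass, which is exactly why the mismatch slips through: nothing ties the project's company to the client's company.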
The problem occurs in a similar fashion all over my database schema, so I'd like to solve it in a generic way if at all possible. I had the following two ideas:
Check in the DAL for database writes that might lead to inconsistencies. This would be generic, but it requires additional database queries before each update or create query is performed, so it will cost some performance.
Create an additional table for the Clients-Projects relationship and make sure the relationships created this way are consistent. This also requires some additional select queries, but far fewer than in the first case. On the other hand, it is not generic, so it is easier to miss something in the long run, especially when adding more tables / dependencies to the database.
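As a rough illustration of the second idea, the sketch below moves the client link into a separate table. How that table is kept consistent is not specified above; this version assumes composite foreign keys, so the database itself rejects cross-company links. The table name `ProjectClients`, the extra `UNIQUE (ID, CompanyID)` constraints, and the use of SQLite through Python's `sqlite3` in place of SQL Server are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Companies (ID INTEGER PRIMARY KEY, CompanyName TEXT NOT NULL);
CREATE TABLE Clients (
    ID INTEGER PRIMARY KEY,
    CompanyID INTEGER NOT NULL REFERENCES Companies(ID),
    UNIQUE (ID, CompanyID)           -- target for the composite foreign key
);
CREATE TABLE Projects (
    ID INTEGER PRIMARY KEY,
    CompanyID INTEGER NOT NULL REFERENCES Companies(ID),
    ProjectName TEXT,
    UNIQUE (ID, CompanyID)
);
-- Hypothetical link table: one optional client per project, and the shared
-- CompanyID column must match both the project and the client.
CREATE TABLE ProjectClients (
    ProjectID INTEGER PRIMARY KEY,
    ClientID INTEGER NOT NULL,
    CompanyID INTEGER NOT NULL,
    FOREIGN KEY (ProjectID, CompanyID) REFERENCES Projects(ID, CompanyID),
    FOREIGN KEY (ClientID, CompanyID) REFERENCES Clients(ID, CompanyID)
);
INSERT INTO Companies VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO Clients VALUES (10, 1), (20, 2);
INSERT INTO Projects VALUES (100, 1, 'Website'), (101, 1, 'Intranet');
""")

def link(project_id, client_id, company_id):
    """Try to link a project to a client; return whether the database accepted it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO ProjectClients VALUES (?, ?, ?)",
                (project_id, client_id, company_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False

same_company = link(100, 10, 1)    # project 100 and client 10 both belong to company 1
foreign_client = link(101, 20, 1)  # client 20 belongs to company 2
wrong_company = link(101, 20, 2)   # project 101 belongs to company 1
```

With this layout no extra select queries are needed in the DAL, but every table that takes part in such a relationship needs the additional unique constraint, which is the "not generic" cost mentioned above.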
What would you do? Is there any better solution I missed?
Edit: You might wonder why the Projects table has a CompanyID. This is because I want users to be able to add projects with and without clients. I need to keep track of which company (and therefore which website user) a clientless project belongs to, which is why a project needs a CompanyID.
Comments (5)
Note, there's no circularity in the references you have, so the title is misleading.
What you have is the possibility of conflicting data, that's different.
Why do you have "CompanyID" in the project table? The ID of the company involved is implicitly given by the client you link to. You don't need it.
Remove that column and you've removed your problem.
Additionally, what is the purpose of the "name" column in the client table? Can you have a client with one name, differing from the name of the company?
Or is "client" the person at that company?
Edit: OK, with the clarification about projects without clients, I would separate out the references, but you're not going to get rid of the problem you're describing without constraints that prevent multiple references from being made.
A simple constraint for your existing tables would be that the CompanyID and ClientID fields of a project row cannot both be non-null at the same time.
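A minimal sketch of that constraint, written as a `CHECK` on the Projects table. It assumes CompanyID becomes nullable (unlike the schema in the question) and uses SQLite via Python's `sqlite3` as a stand-in for SQL Server; the helper `try_insert` and the sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Companies (ID INTEGER PRIMARY KEY, CompanyName TEXT NOT NULL);
CREATE TABLE Clients (
    ID INTEGER PRIMARY KEY,
    CompanyID INTEGER NOT NULL REFERENCES Companies(ID)
);
CREATE TABLE Projects (
    ID INTEGER PRIMARY KEY,
    CompanyID INTEGER REFERENCES Companies(ID),  -- nullable in this variant
    ClientID INTEGER REFERENCES Clients(ID),
    ProjectName TEXT,
    -- A project points at a company or at a client, never at both.
    CHECK (CompanyID IS NULL OR ClientID IS NULL)
);
INSERT INTO Companies VALUES (1, 'Acme');
INSERT INTO Clients VALUES (10, 1);
""")

def try_insert(project_id, company_id, client_id):
    """Insert a project row and report whether the constraint allowed it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO Projects VALUES (?, ?, ?, ?)",
                (project_id, company_id, client_id, "demo"),
            )
        return True
    except sqlite3.IntegrityError:
        return False

clientless = try_insert(1, 1, None)    # company only
with_client = try_insert(2, None, 10)  # client only; company is implied by the client
both_set = try_insert(3, 1, 10)        # both set: rejected by the CHECK
```

As written, the check still allows a row with neither reference; whether that should also be forbidden is a separate decision.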
I'd go with the latter, having one or more tables that define the allowable relationships between entities.
If you want to use the table like this and avoid all the new queries, just put triggers on the table; when a user tries to insert a row with wrong data, the trigger will stop him.
Best Regards,
Iordan
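A sketch of what such triggers might look like, using SQLite trigger syntax through Python's `sqlite3` as a stand-in; SQL Server's trigger syntax differs, and the trigger names, error message, and sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Companies (ID INTEGER PRIMARY KEY, CompanyName TEXT NOT NULL);
CREATE TABLE Clients (
    ID INTEGER PRIMARY KEY,
    CompanyID INTEGER NOT NULL REFERENCES Companies(ID)
);
CREATE TABLE Projects (
    ID INTEGER PRIMARY KEY,
    CompanyID INTEGER NOT NULL REFERENCES Companies(ID),
    ClientID INTEGER REFERENCES Clients(ID),
    ProjectName TEXT
);

-- Reject new projects whose client belongs to a different company.
CREATE TRIGGER projects_company_match_insert
BEFORE INSERT ON Projects
WHEN NEW.ClientID IS NOT NULL
 AND NEW.CompanyID <> (SELECT CompanyID FROM Clients WHERE ID = NEW.ClientID)
BEGIN
    SELECT RAISE(ABORT, 'client belongs to another company');
END;

-- Apply the same rule when an existing project is re-pointed.
CREATE TRIGGER projects_company_match_update
BEFORE UPDATE OF CompanyID, ClientID ON Projects
WHEN NEW.ClientID IS NOT NULL
 AND NEW.CompanyID <> (SELECT CompanyID FROM Clients WHERE ID = NEW.ClientID)
BEGIN
    SELECT RAISE(ABORT, 'client belongs to another company');
END;

INSERT INTO Companies VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO Clients VALUES (10, 1), (20, 2);
""")

def run(sql, params=()):
    """Execute one write statement and report whether the database accepted it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.DatabaseError:
        return False

own_client = run("INSERT INTO Projects VALUES (100, 1, 10, 'Website')")
foreign_client = run("INSERT INTO Projects VALUES (101, 1, 20, 'Hijacked')")
repointed = run("UPDATE Projects SET ClientID = 20 WHERE ID = 100")
```

These triggers only guard writes to Projects; moving a client to another company by changing Clients.CompanyID would need its own trigger.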
My first thought would be to create a special client record for each company with the name "No client". Then eliminate the CompanyId from the Project table, and if a project has no client, use the "No client" record rather than a "normal" client record. If such no-client records need special processing, add a flag to the no-client record to explicitly identify it. (I'd hate to rely on the name being "No Client" or something like that -- too fuzzy.)
Then there would be no way to store inconsistent data so the problem would go away.
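Here is a small sketch of that layout, again with SQLite via Python's `sqlite3` standing in for SQL Server. The flag column `IsNoClient` and the helpers `add_project` and `company_of` are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Companies (ID INTEGER PRIMARY KEY, CompanyName TEXT NOT NULL);
CREATE TABLE Clients (
    ID INTEGER PRIMARY KEY,
    CompanyID INTEGER NOT NULL REFERENCES Companies(ID),
    FirstName TEXT,
    LastName TEXT,
    IsNoClient INTEGER NOT NULL DEFAULT 0  -- hypothetical flag marking the placeholder
);
-- Projects no longer store a CompanyID; the company is implied by the client.
CREATE TABLE Projects (
    ID INTEGER PRIMARY KEY,
    ClientID INTEGER NOT NULL REFERENCES Clients(ID),
    ProjectName TEXT
);
INSERT INTO Companies VALUES (1, 'Acme');
INSERT INTO Clients VALUES (1, 1, NULL, 'No client', 1);  -- placeholder for company 1
INSERT INTO Clients VALUES (10, 1, 'Jane', 'Doe', 0);
""")

def add_project(name, company_id, client_id=None):
    """Insert a project, falling back to the company's placeholder client."""
    if client_id is None:
        row = conn.execute(
            "SELECT ID FROM Clients WHERE CompanyID = ? AND IsNoClient = 1",
            (company_id,),
        ).fetchone()
        client_id = row[0]
    with conn:
        cur = conn.execute(
            "INSERT INTO Projects (ClientID, ProjectName) VALUES (?, ?)",
            (client_id, name),
        )
    return cur.lastrowid

def company_of(project_id):
    """Derive a project's company through its client."""
    return conn.execute(
        """SELECT c.CompanyID
           FROM Projects p JOIN Clients c ON c.ID = p.ClientID
           WHERE p.ID = ?""",
        (project_id,),
    ).fetchone()[0]

internal_id = add_project("Internal tooling", company_id=1)
internal_company = company_of(internal_id)
```

Because the company is stored in only one place, a project row cannot disagree with its client about which company it belongs to.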
In the end I implemented a completely generic solution which solves my problem without much runtime overhead and without requiring any changes to the database. I'll describe it here in case someone else has the same problem.
First off, the approach only works because the only table that other tables are referencing through multiple paths is the Companies table. Since this is the case in my database, I only have to check whether all n:1 referenced entities of each entity that is to be created / updated / deleted are referencing the same company (or no company at all).
I am enforcing this by deriving all of my Linq entities from one of the following types:
SingleReferenceEntityBase - The norm. Only checks (via reflection) if there really is only one reference (no matter if transitive or intransitive) to the Companies table. If this is the case, the references to the companies table cannot become inconsistent.
MultiReferenceEntityBase - For special cases such as the Projects table above. Asks all directly referenced entities what company ID they are referencing. Raises an exception if there is an inconsistency. This costs me a few select queries per CRUD operation, but since MultiReferenceEntities are much rarer than SingleReferenceEntities, this is negligible.
Both of these types implement a "CheckReferences" method, which I call whenever a Linq entity is written to the database by partially implementing the OnValidate(System.Data.Linq.ChangeAction action) method that is automatically generated for all Linq entities.
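The actual implementation is C# on top of LinqToSQL and is not shown here. The following is only a rough Python analogue of the idea, with made-up classes and method names (`check_references`, `referenced_entities`, `company_id`, `is_consistent`). It omits the reflection-based check of the single-reference case and works on in-memory objects instead of issuing select queries.

```python
from dataclasses import dataclass
from typing import List, Optional


class InconsistentCompanyError(Exception):
    """Raised when an entity's references point at different companies."""


class EntityBase:
    def company_id(self) -> Optional[int]:
        raise NotImplementedError

    def check_references(self) -> None:
        raise NotImplementedError


class SingleReferenceEntityBase(EntityBase):
    """Only one path leads to a company, so there is nothing to compare."""

    def check_references(self) -> None:
        return None


class MultiReferenceEntityBase(EntityBase):
    """Several paths lead to a company; all of them must agree."""

    def referenced_entities(self) -> List[Optional[EntityBase]]:
        raise NotImplementedError

    def check_references(self) -> None:
        ids = {e.company_id() for e in self.referenced_entities() if e is not None}
        ids.discard(None)  # "no company at all" is allowed
        if len(ids) > 1:
            raise InconsistentCompanyError(f"conflicting company IDs: {sorted(ids)}")


@dataclass
class Company(SingleReferenceEntityBase):
    id: int

    def company_id(self) -> Optional[int]:
        return self.id


@dataclass
class Client(SingleReferenceEntityBase):
    id: int
    company: Company

    def company_id(self) -> Optional[int]:
        return self.company.id


@dataclass
class Project(MultiReferenceEntityBase):
    id: int
    company: Company
    client: Optional[Client] = None

    def company_id(self) -> Optional[int]:
        return self.company.id

    def referenced_entities(self) -> List[Optional[EntityBase]]:
        return [self.company, self.client]


def is_consistent(entity: EntityBase) -> bool:
    """Stand-in for the OnValidate hook: run the check before a write."""
    try:
        entity.check_references()
        return True
    except InconsistentCompanyError:
        return False


acme, globex = Company(1), Company(2)
clientless_ok = is_consistent(Project(1, acme))
same_company_ok = is_consistent(Project(2, acme, Client(10, acme)))
cross_company_ok = is_consistent(Project(3, acme, Client(20, globex)))
```

In the real setup the comparison would run inside the generated OnValidate partial method, so every create, update, or delete passes through it without the repositories having to remember to call it.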