保持引用完整性——好还是坏?
我们计划在数据库中引入简单的审计跟踪,使用触发器和每个需要审计的表的单独历史表。
例如,考虑表 StudentScore,它具有很少的外键(例如 StudentID、CourseID)将其链接到相应的父表(学生和课程)。
Table StudentScore (
StudentScoreID, -- PK
StudentID ref Student(StudentID), -- FK to Student
CourseID ref Course(CourseID), -- FK to Course
)
如果 StudentScore 需要审核,我们计划创建审核表 StudentScoreHistory -
Table StudentScoreHistory (
StudentScoreHistoryID, -- PK
StudentScoreID,
StudentID,
CourseID,
AuditActionCode,
AuditDateTime,
AuditActionUserID
)
如果修改 StudentScore 中的任何行,我们会将旧行移动到 StudentScoreHistory。
设计讨论中提出的要点之一是使 StudentHistory 表中的 StudentID 和 CourseID 成为 FK,以保持引用完整性。支持这一点的论点是因为我们总是主要执行软(逻辑布尔标志)删除而不是硬删除,保持引用完整性有利于确保我们在审核表中没有任何孤立ID 。
Table StudentScoreHistory (
StudentScoreHistoryID, -- PK
StudentScoreID,
StudentID ref Student(StudentID), -- FK to Student
CourseID ref Course(CourseID), -- FK to Course
AuditActionCode,
AuditDateTime,
AuditActionUserID
)
对我来说,这似乎是一个有点奇怪的设计。我确实同意 @Jonathan Leffler 的评论,即审核记录不应停止删除父数据。相反,如果需要,应通过主表中的外键而不是审计表中的外键进行处理。我想征求您的意见,以确保我不会错过将外键扩展到审计表的一些价值。
现在我的问题是: 在历史表中包含这些外键是一个好的设计吗?
有关关键参数(例如性能、最佳实践、设计灵活性等)的任何详细信息都会受到高度赞赏。
为了使任何寻求特定目的和我们的环境的人受益:
目的:
- 维护关键数据历史记录
- 允许审核用户活动并支持重新创建场景
- 在有限范围内允许回滚用户活动
环境:
- 事务数据库
- 并非每个表都需要审核
- 使用软件尽可能删除,特别是静态/参考数据
- 很少有高度事务性的表使用硬删除
We are planning on introducing simple Audit Trail in our database using triggers and separate history table for each table that requires auditing.
For example consider table StudentScore, it has few foreign keys (eg. StudentID, CourseID) linking it to corresponding parent tables (Student & Course).
Table StudentScore (
StudentScoreID, -- PK
StudentID ref Student(StudentID), -- FK to Student
CourseID ref Course(CourseID), -- FK to Course
)
If StudentScore requires auditing, we are planning to create audit table StudentScoreHistory -
Table StudentScoreHistory (
StudentScoreHistoryID, -- PK
StudentScoreID,
StudentID,
CourseID,
AuditActionCode,
AuditDateTime,
AuditActionUserID
)
If any row in StudentScore is modified we'll move old row to StudentScoreHistory.
One of the points raised during design discussion was to make StudentID and CourseID in StudentHistory table a FK, to maintain referential integrity. Argument made in favour of this was as we always mostly do a soft (logical Boolean flag) delete rather than hard delete, its good to maintain referential integrity to ensure we do not have any orphan ids in audit table.
Table StudentScoreHistory (
StudentScoreHistoryID, -- PK
StudentScoreID,
StudentID ref Student(StudentID), -- FK to Student
CourseID ref Course(CourseID), -- FK to Course
AuditActionCode,
AuditDateTime,
AuditActionUserID
)
This seems to be a bit odd design to me. I do agree with @Jonathan Leffler's comment that audit record should not stop the deletion of parent data. Instead, if this is required, should be handled via foreign keys in main table and not in audit table. I want to get your opinion, to make sure I'm not missing some value in extending foreign keys to audit tables.
Now my question is: Is it a good design to have these foreign keys in History tables?
Any details on key arguments (e.x. performance, best practice, design flexibility etc) would be highly appreciated.
For benefit of anyone looking for specific purpose and our environment:
Purpose:
- Maintain critical data history
- Allow auditing of user activity with support to recreate scenario
- To limited extent allow roll-back of user activity
Environment:
- Transactional database
- Not every table requires auditing
- Uses soft-delete to the extent possible, specifically for static/reference data
- Few highly transactional tables do use hard deletes
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(9)
当讨论审计时,我会回到审计背后的目的。它并不是真正的备份,而是过去的历史。例如,对于
StudentScore
,您希望确保不会丢失学生最初得分为 65% 而现在得分为 95% 的事实。通过此审计跟踪,您可以回顾所做的更改,了解发生了什么以及是谁做的。由此,您可以识别特定用户滥用系统的行为。在某些方面,这可能是一种备份,因为您可以将这些更改回滚到以前的状态,而无需回滚整个表。考虑到这一点(如果我对您使用它的用途的假设是正确的),您唯一需要 FK/PK 关系的地方是历史表与其“实时”对应表之间。您的审计(历史)表不应引用任何其他表,因为它不再是该系统的一部分。相反,它只是一张表中所发生事件的记录。时期。您可能需要考虑的唯一引用完整性是历史表和活动表之间的引用完整性(因此可能存在 FK/PK 关系)。如果允许从实时表中删除记录,则不要在历史表中包含 FK。然后历史表可以包含已删除的记录(如果允许删除,这就是您想要的)。
不要与此历史表的主数据库中的关系完整性相混淆。历史表都是独立的。它们仅充当一个表(而不是一组表)的历史记录。
将两个历史表关联在一起是可能的,甚至可以在实时表和历史表之间建立更高级的关系(例如,同时具有实时和历史的学生和课程),因此您甚至可以处理学生被删除的可能性(不寒而栗)因为该记录仍会在历史表中。这里唯一的问题是,如果您不保留特定表的历史记录,在这种情况下您将选择丢失该数据(如果允许删除)。
When discussing auditing, I would go back to the purpose behind it. It isn't really a backup but rather a history of what has been. For example, for
StudentScore
, you would want to be sure not to lose the fact that the student originally had a 65% when they now have a 95%. This audit trail would allow you to walk back through the changes to see what happened and who did it. From this, you could identify what a particular user did to abuse the system. In some ways this could be a type of backup since you could roll back these changes to their previous states without rolling back entire tables.With this in mind (if my assumptions about what you are using this for are correct), the only place you would want a FK/PK relationship is between the history table and its "live" counterpart. Your audit (history) table should not refer to any other table because it is no longer a part of that system. Instead, it is simply a record of what has happened in one table. Period. The only referential integrity you might want to consider is between the history table and the live table (thus the possible FK/PK relationship). If you allow records to be deleted from the live table, don't include the FK in the history table. Then the history table could include deleted records (which is what you want if you allow deletions).
Don't get confused with relational integrity in the main database with this history table. The history tables are all stand-alone. They only serve as a history of one table (not a set of tables).
The relating of two history tables together is possible and even more advanced relations between the live and history tables together (Students and Courses with both live and history, for example) so you can deal with even the possibility that a student was deleted (shudder) since the record would still be in the history table. The only issue here would be if you don't keep the history for a particular table, in which case you are choosing to lose that data (if you allow deletes).
我建议不要将外键扩展到审计表。我的建议是将审计中的数据扩展到外键值。
它将存储为“HTML4”,而不是将 CourseID 存储为“1”。这样,如果删除外键值,审计表仍然有效。如果将来随时将外键值从“HTML4”更改为“HTML5”,这也将成立。如果您只存储外键,那么您将告诉审核员以前的学生做了“HTML5”,这是不正确的。
另一个巨大的好处是能够将审计跟踪发送到另一台服务器进行数据挖掘,而不会出现任何问题。
我已经使用上面的设置一段时间了,它对我有用。
I would suggest not extending foreign keys to audit tables. My recommendation is that the data in an audit be expanded to what the Foreign Key values are.
Instead of storing CourseID as "1", it will be "HTML4". That way if the Foreign Key value is removed, the audit table is still valid. This will also hold true if the Foreign Key value is changed from "HTML4" to "HTML5" anytime in the future. If you only stored the Foreign Key, then you would be telling the auditor that the previous students did "HTML5", which is not correct.
Another great benefit is the ability to ship off the audit trails to another server for data mining without any issues.
I have used the above setup for a while now and it works for me.
如果您需要重新创建场景,那么我会说是的,您需要 FK,并且我认为拥有它们将是跟踪相关详细记录的更简单的方法。然而,这使得删除成为一个问题,并且主键表中的信息可能会发生变化。在这种情况下,我想说您不想删除其他表中具有FK的记录,而是使用软删除,正如您已经指出的那样。
至于PK表中的信息变化,买者自负。设置 FK 是获得一些回溯能力的简单方法,但它并不完美。这是需要权衡的。为了获得绝对完美的历史记录,您基本上需要在审核候选记录发生任何情况时创建所有相关记录的备份副本。您需要找出合适的粒度级别并遵循它,因为完美的事件记录可能很难设置,并且在此过程中占用大量空间。
另外,这可能是也可能不是您的选择,但我强烈考虑使用 ApexSQL Audit 等工具的组合 + ApexSQL Log,而不是自行开发的审核解决方案。根据您的需求,这两个工具与定期归档事务日志相结合将满足您需要做的事情。审计工具可以将数据存储在同一个数据库或其他地方,日志工具可以有选择地恢复数据。只是一个想法。
If you need to re-create the scenario, then I'd say yes you need the FKs, and having them I think would be an easier way to track through to the relevant related detail records. However, this makes deletions an issue, as well as info that may change in your primary key tables. In this case I'd say you don't want to delete records that have FKs in other tables, but rather use a soft delete as you've already indicated.
As far as info in the PK tables changing, caveat emptor. Setting up FKs would be a simple way to get some traceback ability, but it won't be perfect. There are trade-offs. To get an absolutely perfect history, you basically would need to create backup copies of all related records, any time an audit candidate record has something happen on it. You need to figure out the level of granularity that's appropriate and go with it, because a perfect record of events could be complicated to set up, and eat up a lot of space in the process.
Also, this may or may not be an option for you, but I would strongly consider a combination of tools like ApexSQL Audit + ApexSQL Log as opposed to a homegrown auditing solution. Based on your needs, those two tools combined with archiving your transaction logs periodically would cover what you need to do. The audit tool can store data in the same db or elsewhere, and the log tool can selectively recover data. Just a thought.
您的情况显然会因情况而异,但根据我的经验,请保持原始表主键的引用完整性,仅此而已。这可以避免历史记录中的孤立 ID,同时允许与相关表进行流畅的交互。
假设您有这样的情况:
在这种情况下,在 Score_logs 上有一个引用分数(score_id)的删除级联 fkey 是有意义的。这是物体;如果它被硬删除,也可能会丢弃历史记录。
相比之下,根据我的经验,student_id 和 course_id 上的外键就没那么有意义了。它们意味着您无法对学生和课程进行(硬)删除 - 即使不存在引用它们的实时行。这可能是您想要实现的目标,在这种情况下请忽略该提示。就我而言,我发现自己需要修剪用户、评论、产品、订单等;历史日志中的外键使这变得不方便。
另请注意,有时 fkey 对您不利。如果您的订单上有订单行,并且该订单行被删除,您仍然需要该订单行上的历史记录。在这种情况下使用的正确 pkey 是 order_id,而不是 order_line_id。
最后一点,如果您最终选择保留 fkey:请考虑它们应该指向什么。对于解耦的数据(例如学生和课程),可以合理地假设实时行没问题。然而,对于强耦合的数据(例如产品和促销),您真正想要的是引用 fkey 及其版本。
对于前两点,您可能会发现这个相关的线程和答案很有趣:
如何为聚合根创建审计跟踪?
Your milage will obviously vary with the situation, but in my experience, keep referential integrity with the originating table's primary key and no more. This allows to avoid orphan IDs in the history able, while allowing fluid interactions with the related tables.
Suppose, for instance that you've something like this:
In that case, having an on delete cascade fkey referencing scores(score_id) on score_logs makes sense. It's the object; if it gets hard-deleted, might as well discard the history as well.
The foreign keys on student_id and course_id, by contrast, make less sense in my experience. They mean that you cannot do a (hard) delete on students and courses -- even when no live rows that reference them exist. This might be what you want to achieve, in which case ignore the tip. In my case, I find myself in need pruning users, comments, products, orders and so forth; foreign keys in the history logs make this inconvenient.
Also, note that there is a case where fkeys work against you. If you've an order line on an order, and the order line gets deleted, you still want the history on that order line. The correct pkey to use in this case is the order_id, not the order_line_id.
One last note, in case you end up opting to keep the fkeys: consider what they should be pointing to. With decoupled pieces of data (e.g. students and courses), it's reasonable to assume that the live row is fine. With strongly coupled pieces of data, however, (e.g. products and promos) what you'll really want is to be referencing both the fkey and its version.
Re the two previous points, you might find this related thread and answer interesting:
How do you create an audit trail for aggregate roots?
如果您的系统确实专注于事务处理,那么我的答案可能不太适合您,但在数据仓库/BI 世界中,这个问题通常可以通过使用“星型模式”来解决。在这种方法中,您可以对链接表中的重要指示信息以及审计记录进行非规范化。这可能包括父表的 PK 值(即审核表上的 FK 值)。但是,您不会保留实际的引用完整性约束本身。
因此,对于您的示例,您的 StudentScoreHistory 表可以保留其 StudentID 列,而无需 FK 约束,也可能保留 StudentName(或您认为可能需要 Student 提供的任何内容)。这样,您就可以返回审计跟踪来拼凑发生的事情以及时间,而不必担心是硬删除还是软删除父记录。这样做的另一个优点(或缺点,取决于您的观点)可以跟踪可更改的父表属性,就像最初记录子记录时一样。例如,了解学生 123456(现在是已婚夫人)在授予其生物学学位时曾经是单身女孩小姐可能会很有用。
If your system is really focused on transaction processing then my answer may not apply well to you, but in the datawarehouse/BI world, this problem is often solved by using a "star schema". In this approach, you would denormalize the important indicative information from the linked tables along with your audit records. This could include the PK values of the parent tables (i.e. the FK values on your audited table). However, you wouldn't preserve the actual referential integrity contstraints themselves.
So for your example, your StudentScoreHistory table could retain its StudentID column, without the FK constraint, as well as maybe the StudentName (or whatever you think you might need from Student). This way you can go back to your audit trail to piece together what has happened and when without worrying about whether you are hard or soft deleting parent records. This has the further advantage (or disadvantage, depending on your perspective) of keeping track of the changable parent table attributes as they were when the child record was originally recorded. For example, it might be useful to know that Student 123456, who is now Mrs. Marriedlady used to be Miss Singlegirl when her biology degree was conferred.
您的实时模式强制执行关系完整性,因此您不需要历史模式中的外键。或者换句话说:在历史模式中的表之间强制执行外键的唯一原因是是否存在某种针对历史模式执行 DML 的机制,而不是根据实时模式中的更改来填充它。在这种情况下,您的历史模式作为审计跟踪几乎毫无用处。
您提出了软删除的问题,这使问题变得混乱。仅当您考虑在两个模式之间使用外键(例如
StudentScoreHistory
引用StudentScore
)时,这才有意义。这可能是一个有效的设计,但同样,它表明您不信任您的审核机制。就我个人而言,我更喜欢在活动表中进行硬删除,并在历史表中记录删除的事实。软删除只是悲伤的另一个来源。但无论如何,这是一个不同的问题。每个表的实时版本和历史版本之间完全可以有外键,例如
StudentScoreHistory -> StudentScore
也没有在历史模式中强制执行关系完整性,例如,StudentScoreHistory ->学生历史
。Your live schema enforces relational integrity so you don't need foreign keys in the History schema. Or put it another way: the only reason to enforce foreign keys between tables in the History schema is if there is some mechanism for executing DML against the History schema other than populating it from changes in the live schema. In which case your History schema is pretty useless as an audit trail.
You raise the question of soft deletes, which confuses the issue. That would only be relevant if you're considering having foreign keys between the two schemas e.g.
StudentScoreHistory
referencesStudentScore
. That can be a valid design, but again, it suggests you don't trust your audit mechanism. Personally I would prefer to have hard deletes in the live tables, and record the fact of deletion in the History table. Soft deletes are just another source of grief.But anyway this is a different question. It is perfectly possible to have foreign keys between the live and history versions of each table e.g.
StudentScoreHistory -> StudentScore
without also enforcing the relational integrity within the History schema e.g,StudentScoreHistory -> StudentHistory
.由于我是第一次实施非常相似的审计系统,因此我目前也面临着同样的担忧。我的观点与 BiggsSTRC 的观点相呼应 - 您的“实时”表维护与课程记录的 FK 关系,而您的历史表仅维护与其“实时”对应项(StudentScore)的关系。我认为,这可以实现审计表中没有孤儿。
现在,还有一些我在答案中没有看到提到的内容:在我们当前的项目中,我们看到了在历史表中维护 CourseHistory 表中的 FK 的价值,以便我们知道课程的“状态”是什么StudentScoreHistory 审核条目时的记录。当然,这对您来说可能重要也可能不重要,具体取决于您的系统逻辑。
对于您担心的问题(在您对 BiggsSTRC 的回答中),您可能多次拥有相同的 CourseId,我们的解决方案是不引用实际的 CourseId,而是引用 CourseHistory 表的 PK 列。我们仍然没有确定如何实现这一目标 - 即使没有更改,我们是否要创建课程记录的审核条目,或者尝试引入一些逻辑来查找与相关课程匹配的 CourseHistory 记录StudentScoreHistory 条目时的状态。
Being in the midsts of implementing very similar auditing system for the first time, I am currently facing that same concern. My opinion echoes that of BiggsTRC - your "live" table maintains the FK relation to the Course record and your history table only maintains relation to its "live" counterpart (StudentScore). This, I think, achieves not having orphans in the audit table.
Now, there is something else I did not see mentioned in the answers: in our current project, we saw value of maintaining a FK in the history table to the CourseHistory table, so that we know, what was the "state" of the Course record at time of the StudentScoreHistory audit entry. Of course that may or may not matter to you, depending on your system logic.
Our solution to your concern (in your answer to BiggsTRC), that you might have the same CourseId several times was to reference not the actual CourseId, but the PK column of CourseHistory table. We still don't have a firm decision how to accomplish this - whether we want to create audit entry of the Course record even if there was not a change, or try to introduce some logic to look-up the CourseHistory record that matches relevant Course state at time of StudentScoreHistory entry.
如果您只打算按照您的描述进行软删除,那么我认为您没有理由不使用外键。
If you only plan to do soft deletes as you describe then i see no reason why you shouldn't use foreign keys.
我不会为“已审核”行创建第二组表,只需将审核功能集成到现有的生产模式中即可。听起来您的目的不是备份和恢复给定日期/灾难,而是跟踪每个用户或学生的更改历史记录,这是您的应用程序的功能。我认为您的附加字段很好,只是不需要将它们添加到另一组表中。
备份和恢复过程的一个问题是架构更改。架构往往会随着时间的推移而改变,这意味着您可能无法直接从备份进行恢复。如果您将审核功能内置到生产模式中,那么当您需要支持其他功能时,您不必担心会破坏任何内容。
I wouldn't create a second set of tables for the 'audited' rows, just integrate your auditing functions into your existing production schema. It sounds like your purpose isn't backup and restore as of a given date/disaster, but trace history of changes per user or student, and that's a function of your application. I think your additional fields are fine, they just don't need to be added to another set of tables.
One problem with backup and restore processes is schema changes. Schemas tend to change over time which means you may not be able to restore directly from a backup. If you keep your auditing functions built into your production schema, you don't have to worry about breaking anything when you need to support additional functions.