How to manage multiple versions of the same record
I am doing short-term contract work for a company that is trying to implement a check-in/check-out type of workflow for their database records.
Here's how it should work...
- A user creates a new entity within the application. There are about 20 related tables that will be populated in addition to the main entity table.
- Once the entity is created the user will mark it as the master.
- Another user can make changes to the master only by "checking out" the entity. Multiple users can check out the entity at the same time.
- Once the user has made all the necessary changes to the entity, they put it in a "needs approval" status.
- After an authorized user reviews the entity, they can promote it to master which will put the original record in a tombstoned status.
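The workflow above can be read as a small state machine. The sketch below is only an illustration: the status names ("new", "checked out", "needs approval", "master", "tombstoned") are taken from or inferred from the description, and `ALLOWED_TRANSITIONS`, `can_transition`, and `promote` are hypothetical names, not anything from the actual application.

```python
# Hypothetical status names and transitions, inferred from the workflow above.
ALLOWED_TRANSITIONS = {
    "new": {"master"},
    "master": {"tombstoned"},
    "checked out": {"needs approval"},
    "needs approval": {"master"},
    "tombstoned": set(),
}


def can_transition(current, target):
    """Return True if a version may move from `current` to `target`."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


def promote(versions, index):
    """Promote versions[index] to master and tombstone the old master.

    `versions` is a list of dicts, each with a 'status' key, all describing
    the same entity.
    """
    candidate = versions[index]
    if not can_transition(candidate["status"], "master"):
        raise ValueError(f"cannot promote a version in status {candidate['status']!r}")
    for version in versions:
        if version["status"] == "master":
            version["status"] = "tombstoned"
    candidate["status"] = "master"


versions = [
    {"status": "master"},
    {"status": "checked out"},
    {"status": "needs approval"},
]
promote(versions, 2)
statuses = [v["status"] for v in versions]
```

Note that the second user's check-out is untouched by the promotion, which is where concurrent check-outs start to need a merge or conflict policy.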
The way they are currently accomplishing the "check out" is by duplicating the entity records in all the tables. The primary keys include EntityID + EntityDate, so they duplicate the entity records in all related tables with the same EntityID and an updated EntityDate and give it a status of "checked out". When the record is put into the next state (needs approval), the duplication occurs again. Eventually it will be promoted to master at which time the final record is marked as master and the original master is marked as dead.
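To make the duplication concrete, here is a minimal sketch using SQLite. The table and column names (`entity`, `entity_address`, `name`, `city`) and the `copy_version` helper are assumptions for illustration, and only one of the roughly 20 related tables is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (
    entity_id   INTEGER NOT NULL,
    entity_date TEXT    NOT NULL,
    status      TEXT    NOT NULL,
    name        TEXT    NOT NULL,
    PRIMARY KEY (entity_id, entity_date)
);
-- Stand-in for one of the related tables; it shares the composite key.
CREATE TABLE entity_address (
    entity_id   INTEGER NOT NULL,
    entity_date TEXT    NOT NULL,
    city        TEXT    NOT NULL,
    FOREIGN KEY (entity_id, entity_date)
        REFERENCES entity (entity_id, entity_date)
);
""")


def copy_version(conn, entity_id, from_date, to_date, new_status):
    """Copy one version of an entity and its child rows under a new EntityDate."""
    with conn:  # commit on success, roll back on error
        conn.execute(
            "INSERT INTO entity (entity_id, entity_date, status, name) "
            "SELECT entity_id, ?, ?, name FROM entity "
            "WHERE entity_id = ? AND entity_date = ?",
            (to_date, new_status, entity_id, from_date),
        )
        conn.execute(
            "INSERT INTO entity_address (entity_id, entity_date, city) "
            "SELECT entity_id, ?, city FROM entity_address "
            "WHERE entity_id = ? AND entity_date = ?",
            (to_date, entity_id, from_date),
        )


conn.execute("INSERT INTO entity VALUES (1, '2009-01-01T09:00:00', 'master', 'Acme')")
conn.execute("INSERT INTO entity_address VALUES (1, '2009-01-01T09:00:00', 'Boston')")
copy_version(conn, 1, "2009-01-01T09:00:00", "2009-01-02T10:00:00", "checked out")

versions = conn.execute(
    "SELECT entity_date, status FROM entity WHERE entity_id = 1 ORDER BY entity_date"
).fetchall()
address_rows = conn.execute("SELECT COUNT(*) FROM entity_address").fetchone()[0]
```

Every related table needs its own `INSERT ... SELECT`, which is why each status change multiplies the row count across the whole schema.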
This design seems hideous to me, but I understand why they've done it. When someone looks up an entity from within the application, they need to see all current versions of that entity. This was a very straightforward way to make that happen. But the fact that they are representing the same entity multiple times within the same table(s) doesn't sit well with me, nor does the fact that they are duplicating EVERY piece of data rather than only storing deltas.
I would be interested in hearing your reaction to the design, whether positive or negative.
I would also be grateful for any resources you can point me to that might be useful for seeing how someone else has implemented such a mechanism.
Thanks!
Darvis
Comments (3)
I've worked on a system like this which supported the static data for trading at a very large bank. The static data in this case is things like the details of counterparties, standard settlement instructions, currencies (not FX rates) etc. Every entity in the database was versioned, and changing an entity involved creating a new version, changing that version and getting the version approved. They did not, however, let multiple people create versions at the same time.
This led to a horribly complex database, with every join having to take version and approval state into account. In fact the software I wrote for them was middleware that abstracted this complex, versioned data into something that end-user applications could actually use.
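One lightweight way to get a similar effect, shown here only as a sketch and not as a description of that middleware, is a database view that applies the version and approval filter once. The table names, the `status` values, and the `current_counterparty` view are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE counterparty (
    entity_id INTEGER NOT NULL,
    version   INTEGER NOT NULL,
    status    TEXT    NOT NULL,   -- 'approved', 'pending', or 'superseded'
    name      TEXT    NOT NULL,
    PRIMARY KEY (entity_id, version)
);
CREATE TABLE settlement_instruction (
    entity_id INTEGER NOT NULL,
    version   INTEGER NOT NULL,
    account   TEXT    NOT NULL,
    PRIMARY KEY (entity_id, version)
);
-- The join must match on version as well as entity, and filter on status.
-- Putting that in a view keeps it out of every consumer's query.
CREATE VIEW current_counterparty AS
SELECT c.entity_id, c.name, s.account
FROM counterparty AS c
JOIN settlement_instruction AS s
  ON s.entity_id = c.entity_id AND s.version = c.version
WHERE c.status = 'approved';

INSERT INTO counterparty VALUES (7, 1, 'superseded', 'Old Bank');
INSERT INTO counterparty VALUES (7, 2, 'approved',   'New Bank');
INSERT INTO counterparty VALUES (7, 3, 'pending',    'Newer Bank');
INSERT INTO settlement_instruction VALUES (7, 1, 'ACC-1');
INSERT INTO settlement_instruction VALUES (7, 2, 'ACC-2');
INSERT INTO settlement_instruction VALUES (7, 3, 'ACC-3');
""")

rows = conn.execute(
    "SELECT entity_id, name, account FROM current_counterparty"
).fetchall()
```

End-user code then queries `current_counterparty` as if versions did not exist.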
The only thing that could have made it any worse was to store deltas instead of complete versioned objects. So the point of this answer is - don't try to implement deltas!
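A rough sketch of why deltas hurt: with full copies, reading a version is a single lookup, whereas with deltas every read has to replay the change history. The `apply_deltas` function and the convention that `None` means "field removed" are assumptions made for this example.

```python
def apply_deltas(base, deltas):
    """Rebuild the current record by replaying field-level deltas in order.

    Each delta maps field name -> new value; None means the field was removed
    (a convention invented for this sketch).
    """
    record = dict(base)
    for delta in deltas:
        for field, value in delta.items():
            if value is None:
                record.pop(field, None)
            else:
                record[field] = value
    return record


base = {"name": "Acme", "city": "Boston", "phone": "555-0100"}
deltas = [
    {"city": "Chicago"},
    {"phone": None},
    {"name": "Acme Corp"},
]
current = apply_deltas(base, deltas)
```

This replay would have to happen for each of the related tables, and it cannot be expressed as a simple join, so the query complexity described above gets worse rather than better.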
This looks like an example of a temporal database schema. Often, in cases like that, a distinction is made between an entity's key (EntityID, in your case) and the row primary key in the database (in your case, {EntityID, date}, but often a simple integer). You have to accept that the same entity is represented multiple times in the database, at different points in its history. Every database row still has a unique ID; it's just that your database is tracking versions, rather than entities.
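A minimal sketch of that distinction, assuming a hypothetical `entity_version` table: `row_id` is the simple integer row key, `entity_id` identifies the entity, and `valid_from` marks when each version took effect. The `version_as_of` helper is likewise an invented name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity_version (
    row_id     INTEGER PRIMARY KEY,   -- surrogate key: one per version row
    entity_id  INTEGER NOT NULL,      -- entity key: shared by all its versions
    valid_from TEXT    NOT NULL,
    name       TEXT    NOT NULL,
    UNIQUE (entity_id, valid_from)
);
INSERT INTO entity_version (entity_id, valid_from, name) VALUES
    (1, '2009-01-01', 'Acme'),
    (1, '2009-03-01', 'Acme Corp'),
    (2, '2009-02-01', 'Globex');
""")


def version_as_of(conn, entity_id, as_of):
    """Return (row_id, name) for the version in effect on the given date."""
    return conn.execute(
        "SELECT row_id, name FROM entity_version "
        "WHERE entity_id = ? AND valid_from <= ? "
        "ORDER BY valid_from DESC LIMIT 1",
        (entity_id, as_of),
    ).fetchone()


february = version_as_of(conn, 1, "2009-02-15")
april = version_as_of(conn, 1, "2009-04-01")
```

The same entity appears in two rows, but each row has its own ID, and the date predicate is the extra condition that every query has to carry.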
You can manage data like that, and it can be very good at tracking changes to data, and providing accountability, if that is required, but it makes all of your queries quite a bit more complex.
You can read about the rationale behind temporal databases, and their design, on Wikipedia.
You are describing a homebrew Content Management System that was probably hacked together over time. It is, for the reasons you state, redundant and inefficient, and given the nature of such systems in firms, it is unlikely to be displaced without massive organizational effort.