How to design a Data Vault schema for efficient queries on a BPMS?
Current situation:
We have a BPMS (business process management suite) in place. There is increasing demand for historical and operational reports. The data model in the BPMS is not designed for historical queries, so we are analysing possible solutions.
Solution in mind:
The idea is to push data to an external database whenever an event occurs in a flow. Typical events in BPM are: a new process instance is created, a step in the process is performed, or the status of the process instance changes. Besides the star schema, Data Vault is one of the interesting alternatives. Let's assume there are two Hubs, PI (process item instances) and OU (organisational unit), and a Link table LINK_PI_OU. Each time a process item is assigned to an organisational unit, a new row is added to the link table. The LOAD_DATE in the link table contains the datetime when this record was added. The record in the link table with the latest LOAD_DATE shows the current assignment of the process instance.
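To make the "latest LOAD_DATE wins" rule concrete, here is a minimal sketch using Python's built-in sqlite3 module. The hub and link names follow the description above; the column names (PI_ID, OU_ID, OU_KEY), the index, and the sample rows are assumptions made up for illustration. The filter on open instances is left out because the status storage is not described here.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory vault with two hubs and one link (sample data is invented)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE HUB_PI (PI_ID INTEGER PRIMARY KEY);
        CREATE TABLE HUB_OU (OU_ID INTEGER PRIMARY KEY, OU_KEY TEXT);
        CREATE TABLE LINK_PI_OU (PI_ID INTEGER, OU_ID INTEGER, LOAD_DATE TEXT);
        -- Assumed index so the newest row per process instance is cheap to find.
        CREATE INDEX IX_LINK_PI_OU ON LINK_PI_OU (PI_ID, LOAD_DATE);
    """)
    conn.executemany("INSERT INTO HUB_PI VALUES (?)", [(1,), (2,), (3,)])
    conn.executemany("INSERT INTO HUB_OU VALUES (?, ?)",
                     [(1, "Sales"), (2, "Support")])
    # ISO-8601 strings sort chronologically, so MAX(LOAD_DATE) picks the newest row.
    conn.executemany("INSERT INTO LINK_PI_OU VALUES (?, ?, ?)", [
        (1, 1, "2024-01-01T09:00:00"),
        (1, 2, "2024-01-05T09:00:00"),  # PI 1 reassigned to Support
        (2, 1, "2024-01-02T09:00:00"),
        (3, 2, "2024-01-03T09:00:00"),
        (3, 1, "2024-01-04T09:00:00"),  # PI 3 reassigned to Sales
    ])
    conn.commit()
    return conn


def current_assignment_counts(conn):
    """Count process instances per organisational unit, using only the latest link row."""
    sql = """
        SELECT ou.OU_KEY, COUNT(*) AS N
        FROM LINK_PI_OU AS l
        JOIN (SELECT PI_ID, MAX(LOAD_DATE) AS LOAD_DATE
              FROM LINK_PI_OU
              GROUP BY PI_ID) AS latest
          ON latest.PI_ID = l.PI_ID AND latest.LOAD_DATE = l.LOAD_DATE
        JOIN HUB_OU AS ou ON ou.OU_ID = l.OU_ID
        GROUP BY ou.OU_KEY
        ORDER BY ou.OU_KEY
    """
    return conn.execute(sql).fetchall()
```

Every report against this layout has to repeat the "latest row per PI" subquery, which is the part whose cost grows with the assignment history.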
Question:
Let's assume the business wants to know to whom all open process instances are currently assigned, grouped by organisational unit.
What would a query for this report look like? Can it really be performant?
Or am I completely on the wrong track?
Comments (2)
In general terms, I don't think that Data Vault is intended to be an end-user reporting layer or even a faux transactional system.
I'm not completely clear on your architecture, but in my understanding a Data Vault (DV) is a historical repository that keeps all data for an enterprise and feeds a (Kimball/Inmon) data warehouse (DWH). So in high-level terms ...
Transaction systems => DV => DWH => (cubes =>) users
This being the case, I wouldn't pose queries against a Data Vault. Instead I would write some ETL to populate a data warehouse and pose the queries against the DWH.
The other view, I guess, is that you could build a set of views on top of the DV that would hide the structure from users, but I think I'm a bit of a purist and would go for a DWH.
As @Marcud D said, Data Vault is a modeling approach for the data warehouse, and when using DV modeling you usually have to build data marts from the DV for reporting purposes. I think that the organizational unit should be modeled as a Satellite table, not as a Hub table. Either way, you should build a query that feeds a specific data mart from the DV model and then use that data mart for reporting.
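A rough sketch of that "feed a data mart from the DV" step, again in Python with sqlite3 and keeping the question's hub/link layout for simplicity. The satellite SAT_PI_STATUS, the mart table MART_PI_CURRENT, all column names, and the sample rows are hypothetical and only serve the illustration.

```python
import sqlite3


def build_vault():
    """Create a tiny in-memory vault; every table layout and row here is an assumption."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE HUB_OU (OU_ID INTEGER PRIMARY KEY, OU_KEY TEXT);
        CREATE TABLE LINK_PI_OU (PI_ID INTEGER, OU_ID INTEGER, LOAD_DATE TEXT);
        -- Hypothetical satellite holding the status history of each process instance.
        CREATE TABLE SAT_PI_STATUS (PI_ID INTEGER, STATUS TEXT, LOAD_DATE TEXT);
    """)
    conn.executemany("INSERT INTO HUB_OU VALUES (?, ?)",
                     [(1, "Sales"), (2, "Support")])
    conn.executemany("INSERT INTO LINK_PI_OU VALUES (?, ?, ?)", [
        (1, 1, "2024-01-01"), (1, 2, "2024-01-05"),
        (2, 1, "2024-01-02"),
        (3, 2, "2024-01-03"), (3, 1, "2024-01-04"),
    ])
    conn.executemany("INSERT INTO SAT_PI_STATUS VALUES (?, ?, ?)", [
        (1, "OPEN", "2024-01-01"),
        (2, "OPEN", "2024-01-02"), (2, "CLOSED", "2024-01-06"),
        (3, "OPEN", "2024-01-03"),
    ])
    conn.commit()
    return conn


def load_mart(conn):
    """ETL step: rebuild a flat table with one row per process instance (current state only)."""
    conn.executescript("""
        DROP TABLE IF EXISTS MART_PI_CURRENT;
        CREATE TABLE MART_PI_CURRENT AS
        SELECT l.PI_ID, ou.OU_KEY, s.STATUS
        FROM LINK_PI_OU AS l
        JOIN HUB_OU AS ou ON ou.OU_ID = l.OU_ID
        JOIN SAT_PI_STATUS AS s ON s.PI_ID = l.PI_ID
        WHERE l.LOAD_DATE = (SELECT MAX(x.LOAD_DATE) FROM LINK_PI_OU AS x
                             WHERE x.PI_ID = l.PI_ID)
          AND s.LOAD_DATE = (SELECT MAX(y.LOAD_DATE) FROM SAT_PI_STATUS AS y
                             WHERE y.PI_ID = s.PI_ID);
    """)


def open_instances_by_ou(conn):
    """Report query: runs against the mart only, with no history lookups."""
    sql = """
        SELECT OU_KEY, COUNT(*) FROM MART_PI_CURRENT
        WHERE STATUS = 'OPEN'
        GROUP BY OU_KEY
        ORDER BY OU_KEY
    """
    return conn.execute(sql).fetchall()
```

The expensive "latest row" lookups run once per load instead of once per report, and the report itself becomes a plain GROUP BY over a flat table.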