多对多维度模型

发布于 2024-08-16 16:23:53 字数 352 浏览 1 评论 0 原文

各位,

我有一个名为 DIM_FILE 的维度表,其中包含我们从客户收到的文件的信息。每个文件都有详细记录,这些记录构成了我的 FACT 表 CUST_DETAIL。在主流程中,文件会经历几个阶段,每个阶段都会为其标记一个状态。简而言之,我有多对多的关系。关于星型模式维度建模的任何想法。一条客户记录只属于一个文件,一个文件可以有多种状态。

FACT
----
CustID
FileID
AmountDue


DIM_FILE
--------
FileID
FileName
DateReceived

FILE_STATUS
-----------
FileID
StatusDateTime
StatusCode

Folks,

I have a dimension table called DIM_FILE which holds information of the files we received from customers. Each file has detail records which constitutes my FACT table, CUST_DETAIL. In the main process, file is gone through several stages and each stage tags a status to it. Long in a short, I have many-to-many relationship. Any ideas around star schema dimensional modeling. A customer record only belong to a single file and a file can have multiple statuses.

FACT
----
CustID
FileID
AmountDue


DIM_FILE
--------
FileID
FileName
DateReceived

FILE_STATUS
-----------
FileID
StatusDateTime
StatusCode

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

江南月 2024-08-23 16:23:53

您可以采取一些措施将其与维度模型/星型模式结合起来:

  1. 构建两个星型(可能它们最终会出现在不同的数据集市中)。一个将 FACT 作为事实表,另一个将 FILE_STATUS 作为事实(您可以将其视为事务粒度事实表)。为了完成这项工作,我可能会反规范化并将 CustId 添加到 FILE_STATUS
  2. 由于您正在处理 FILE_STATUS,因此您可以将 FACT 转换为累积快照事实表。在此模型中,您将在 FACT 中拥有一组单独的额外列来记录属于每个状态转换的所有信息。至少,您将有一列日期/时间维度来记录何时达到特定状态。在 ETL 中,您必须更新事实表来记录文件如何通过状态进行处理。这种设计仅在状态数量有限且相对较小的情况下才有效。此外,应该有一个或多或少清晰的状态进展路径(就像客户订单:收到 -> 挑选 -> 包装 -> 发货 -> 已付款)
  3. 为状态创建一个所谓的多值维度: FACT 将获得这个新维度的键,并且这个新维度实际上表示适用于 FACT 表中的行的状态集合。
  4. 你可以有一个桥接表(虽然我不认为这适用于这个主题,不确定)

参考文献:

累积快照:http://blog.chrisadamson.com/2007/03/accumulated-snapshot-use-accumulated.html

多值维度与桥接表:http:/ /www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/dimension-modeling-techniques/multivalued-dimension-bridge-table/

There are a few things you can do to marry this with a dimensional model / star schema:

  1. Build two stars (possibly, they'd end up in different datamarts). One has FACT as the fact table, the other star has FILE_STATUS as fact (you can consider it as a transaction grained fact table). To make this work, I'd probably denormalize and add CustId to FILE_STATUS too
  2. Since you are dealing with FILE_STATUS, you could turn FACT into a accumulating snapshot fact table. In this model, you'd have a separate set of extra columns in FACT to record all information belonging to each status transition. At least, you'd have a column to the date/time dimension to record when a particular status was reached. In your ETL, you'd have to UPDATE the fact table to record how a file progresses through states. This design only works if the number of statuses is finite and relatively small. In addition, there should be a more or less clear path of status progressions (like with a customer order: received -> picked -> packaged -> shipped -> payed)
  3. Make a so called multivalued dimension for the statuses: FACT would get a key to this new dimension, and this new dimension would actually represent a collection of statuses that apply to a row in the FACT table.
  4. You could have a bridge table (although I don't think that applies to this subject, not sure)

References:

Accumulating snapshot:http://blog.chrisadamson.com/2007/03/accumulating-snapshot-use-accumulating.html

multivalued dimension vs bridge table: http://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/dimensional-modeling-techniques/multivalued-dimension-bridge-table/

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文