当前位置：文江博客话题详情

数据仓库设计中星型模式的度量到底是什么？

发布于 2024-12-07 09:02:40 字数 95 浏览 0 评论 0原文

星型模式由维度表和事实表组成。

事实表包含每个维度的外键，除此之外，它还包含“度量”。这项措施具体包括哪些内容？

是否存储了某个聚合函数的答案？

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

楠木可依 2024-12-14 09:02:40

基本上是的。

如果您有一个简单的网格，

Salary     Januari   Februari   March   April   May    June
           Q1                         Q2

Me         1100      1100      1100    1100    1500    1500
Collegue1   2000      2000      2000    0       0      0

那么时间是一个具有两个级别的分层维度（如图所示）。
显示的另一个维度是“EmployeeID”。另一个维度（未显示）可以在观点中（例如预算/实际）。

金额（例如 1100）是衡量标准，它构成您的事实（事实的非识别部分）。这些维度定义了各个级别上每个度量的合并函数（例如 Amount(Q1) == SUM(Amount(January...March))）。请注意，根据度量的不同，合并的行为也会有所不同（例如，不会对所得税百分比进行求和，而是以某种方式进行合并：OLAP 多维数据集的设计艺术究竟如何）。

（琐事：您可以计算出度量值，使用 MDX 来查询，例如与上一季度相比的金额偏差、整个季度的平均工资等；很明显，再次需要考虑合并公式）。

此时，您将开始看到，设计合并规则取决于计算规则的顺序（如果首先计算“工资偏差％”的公式，然后进行合并，则需要对其进行平均；但是，如果原始工资度量首先合并（求和）到 Q1、Q2 级别，然后可以像在最低级别一样计算派生度量。

现在，在决定如何存储多维数据集时，事情变得更有趣基本上存在两种方法：

预先计算。所有细胞（包括所有场景中的所有合并）
即时计算

大多数 OLAP 引擎都融合在混合方法 (HOLAP) 上，这不会令任何人感到惊讶，其中经常访问的合并级别的重要部分是预先计算和存储的，而其他部分是预先计算和存储的。。

有些会将底层数据存储在标准 RDBMS (ROLAP) 中，而另一些则不会 (OLAP)。专注于高性能的引擎倾向于将所有数据保存在预先计算的多维数据集中（仅诉诸“许多小数据集”）子立方体用于非常稀疏的维度）。

好吧，无论如何，这有点咆哮。我喜欢漫无目的地讲述我曾经在数据仓库和 OLAP 方面学到的东西

Basically yes.

If you had a simple grid

Salary     Januari   Februari   March   April   May    June
           Q1                         Q2

Me         1100      1100      1100    1100    1500    1500
Collegue1   2000      2000      2000    0       0      0

Time is a hierarchical dimension with two levels (shown).
The other dimension shown is 'EmployeeID'. The other dimension (not shown) could be in the PointOfView (e.g. Budget/Actual).

The Amount (1100, e.g.) is the Measure and it constitutes your facts (the non-identifying parts of the facts). The dimensions define consolidation functions for each measure on the various levels (E.g. Amount(Q1) == SUM(Amount(January...March))). Note that the consolidation will behave differently depending on the measure (e.g. the income tax % will not be summed, but somehow consolidated: how exactly is the art of OLAP Cube design).

(trivia: you can have calculated measures, that use MDX to query e.g. the deviation of Amount in comparison the the preceding Quarter, the Average salary acoss the whole quarter etc.; it will be pretty clear that again, the consolidation formulas require thought).

At this point you will start to see that designing the consolidation rules depends on the order in which the rules are calculated (if the formula for 'salary deviation %' is is evaluated FIRST and then consolidated, you need to average it; however if the raw SALARY measure is consolidated (summed) to the Q1,Q2 level first, then the derived Measure can be calculated like it was at the lowest level.

Now things become more fun when deciding how to store the cube. Basically two ways exist:

precalculate all cells (including all consolidations in all scenarios)
calculate on the fly

It won't surprise anyone that most OLAP engines have converged on hybrid methods (HOLAP), where significant parts of frequently accessed consolidation levels are pre-calculated and stored, and other parts are calculated on the fly.

Some will store the underlying data in a standard RDBMS (ROLAP) other won't (OLAP). The engines focused on high performance tend to keep all data in precalculated cubes (only resorting to 'many small sub-cubes' for very sparse dimensions).

Well, anywyas, this was a bit of a rant. I liked rambling off what I once learned when doing datawarehousing and OLAP

回复收藏 0 原文