数据仓库事实表的设计

发布于 2024-08-11 19:27:01 字数 797 浏览 7 评论 0原文

您如何在数据仓库中对此进行建模:


  1. 有些城市是地理区域,存在于地理层次结构中,例如省(即州,例如明尼苏达州),地区(例如中西部)。

  2. 通过计算“已完成的住房积压百分比”、“预算支出百分比”、“分配给基础设施的预算百分比”、“债务人覆盖率”等绩效指标,对这些城市进行绩效评估< /p>

  3. < p>这些性能指标大约有 100 个。

  4. 这些指标被分组为“绩效组”,而“绩效组”本身又被分组为“关键绩效领域”

  5. 将计算应用于绩效指标(计算结果会根据某些因素(例如城市类型、规模、地区等)而变化)以生成“绩效分数”。

  6. 然后将权重应用于分数以创建“最终加权分数”。 (即,当汇总到“关键绩效领域”时,某些指标的权重高于其他指标)

  7. 将有一个时间维度(每年进行评估),但目前只有一个数据集。


注意:用户需要能够轻松地跨任意指标组合查询数据。即,有人可能想查看:(i) (v) 省级的 (ii)“债务人覆盖率”与 (iii)“预算支出百分比”与 (iv)“债务人天数”的绩效水平。

我通过将“IndicatorType”作为维度进行了尝试,然后在该表中具有[指标/绩效组/绩效区域]层次结构 - 但随后我无法弄清楚如何轻松地在同一行上获取多个指标,因为它需要一个事实表别名(?)。因此,我考虑将所有 100 个项目作为列放入(非常宽!)事实表中 - 但这样我就会失去指标上的 [组/区域] 层次结构......?

有什么想法吗?

谢谢

how would you model this in a data warehouse:


  1. there are municipalities which are geographical areas, that exist in geographical hierarchies, such a province (i.e. state, e.g. Minnesota), region (e.g. MidWest).

  2. a performance evaluation is done on these municipalities, by calculating performance indicators such as "% of housing backlog completed", "% of budget spent", "% of budget allocated to infrastructure", "debtor coverage", etc.

  3. there are about 100 of these performance indicators.

  4. these indicators are grouped into "performance groups", which are themselves grouped into "key performance areas"

  5. calculations are applied to the performance indicators (the calculations vary based on certain factors such as municipality type, size, region, etc) to produce "performance scores".

  6. weightings are then applied to the scores to create "final weighted scores". (i.e. some indicators are weighted more than others when aggregated into the "key performance areas")

  7. there will be a time dimension (evaluations done yearly), but for now just the one data set.


NB: users need to be able to easily query the data across any combination of indicators. i.e. someone might want to see: (i) the performance level of (ii) "debtor coverage" against (iii) "% budget spent" against (iv) "debtor days" at a (v) provincial level.

I tried this by having "IndicatorType" as a dimension, and then having the [indicator / performance group / performance area] hierarchy in that table - but then i can't work out how to easily get multiple indicators on the same line, as it would need a fact table alias(?). So I thought of putting all 100 items as columns in a (very wide!) fact table - but then I would lose the [group/area] heirarchy on the indicators...?

Any ideas?

Thanks

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

蓝海 2024-08-18 19:27:01

希望这是不言自明的。

regionperf_model_01

Hope this is self-explanatory.

regionperf_model_01

最舍不得你 2024-08-18 19:27:01

这是一个非常复杂的问题,但我花时间仔细研究了您的一些观点并提出了这个模型(对您来说应该是一个好的开始)。

维度:

DIM_MUNICIPALITIES:

字段 = {MUNICIPAL_KEY、COUNTRY、REGION、STATE_PROV、CITY?、SIZE_SCORE}

层次结构 = {COUNTRY <-- REGION <-- STATE_PROV <-- CITY?}

DIM_INDICATORS:

字段 = {INDICATOR_KEY、PERFORMANCE_AREA、 PERFORMANCE_GROUP、PERFORMANCE_INDICATOR}

层次结构 = {PERFORMANCE_AREA <-- PERFORMANCE_GROUP <-- PERFORMANCE_INDICATOR}

DIM_DATE:

字段 = {DATE_KEY、CALENDAR_DATE(SQL 日期时间)、年、月、周、日...}

层次结构 = {YEAR <-- - MONTH <-- WEEK <-- DAY <-- DATE_KEY}

然后在事实表(例如 MYFACT)中,您将执行如下操作:

FACT_MYFACT:

Fields = {MYFACT_KEY, DATE_KEY, MUNICIPAL_KEY, INDICATOR_KEY, PERFORMANCE_SCORE, BUDGET、ETC...}

事实表可以包含所有这些度量列(BUDGET、ETC),或者您可以在计算成员中执行它们,这完全取决于您希望如何进行访问。

希望这可以帮助您有一个好的开始!

This is a very involved question but I took the time to go through some of your points and came up with this model (should be a good start for you).

Dimensions:

DIM_MUNICIPALITIES:

Fields = {MUNICIPAL_KEY, COUNTRY, REGION, STATE_PROV, CITY?, SIZE_SCORE}

Hierarchy = {COUNTRY <-- REGION <-- STATE_PROV <-- CITY?}

DIM_INDICATORS:

Fields = {INDICATOR_KEY, PERFORMANCE_AREA, PERFORMANCE_GROUP, PERFORMANCE_INDICATOR}

Hierarchy = {PERFORMANCE_AREA <-- PERFORMANCE_GROUP <-- PERFORMANCE_INDICATOR}

DIM_DATE:

Fields = {DATE_KEY, CALENDAR_DATE (SQL datetime), YEAR, MONTH, WEEK, DAY...}

Hierarchy = {YEAR <-- MONTH <-- WEEK <-- DAY <-- DATE_KEY}

Then in your fact table (say MYFACT) you would do something like the following:

FACT_MYFACT:

Fields = {MYFACT_KEY, DATE_KEY, MUNICIPAL_KEY, INDICATOR_KEY, PERFORMANCE_SCORE, BUDGET, ETC....}

The fact table could have all these Measure columns (BUDGET, ETC) or you could do them in Calculated members, it all depends on how you want to make the accessible.

Hope this helps you get a good start!

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文