Star schema design - one-column dimension

Posted on 2024-09-19 10:09:34

I'm new to data warehousing, but I think my question can be answered relatively easily.
I built a star schema with a dimension table 'product'. This table has a column 'PropertyName' and a column 'PropertyValue'.
The dimension therefore looks a little like this:

surrogate_key | natural_key (productID) | PropertyName | PropertyValue | ...
    1              5                          Size           20          ...
    2              5                          Color          red
    3              6                          Size           20
    4              6                          Material       wood

and so on.

In my fact table I always use the surrogate keys of the dimensions. Because of the PropertyName and PropertyValue columns, my natural key isn't unique / identifying anymore, so I get far too many rows in my fact table.

My question now is: what should I do with the property columns? Would it be best to put each property into a separate dimension, such as a size dimension, a color dimension, and so on? I have about 30 different properties.
Or should I create a column for each property in the fact table?
Or make one dimension with all properties?

Thanks in advance for any help.

木緿 2024-09-26 10:09:34

Your dimension table 'product' should look like this:

surrogate_key | natural_key (productID) | Color | Material | Size | ...
    1              5                      red     wood       20     ...
    2              6                      red     ...
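One way to get from the name/value layout in the question to a table shaped like the one above is to pivot the property rows so that each product ends up on a single row. The following is only a minimal sketch in plain Python, using the sample rows from the question; the function name `pivot_properties` and the choice to fill missing properties with `None` are assumptions made for illustration, not something the original schema prescribes.

```python
from collections import defaultdict

# Sample rows from the question: (natural_key, PropertyName, PropertyValue)
eav_rows = [
    (5, "Size", "20"),
    (5, "Color", "red"),
    (6, "Size", "20"),
    (6, "Material", "wood"),
]


def pivot_properties(rows):
    """Collapse name/value rows into one row (dict) per product."""
    by_product = defaultdict(dict)
    for product_id, name, value in rows:
        by_product[product_id][name] = value

    # Fixed, sorted column list so every dimension row has the same shape
    columns = sorted({name for _, name, _ in rows})

    dimension = []
    for surrogate_key, product_id in enumerate(sorted(by_product), start=1):
        row = {"surrogate_key": surrogate_key, "natural_key": product_id}
        for col in columns:
            # None marks a property the product does not have
            row[col] = by_product[product_id].get(col)
        dimension.append(row)
    return dimension


product_dim = pivot_properties(eav_rows)
for row in product_dim:
    print(row)
```

With one row per product, the natural key is unique again within the dimension, so each fact row matches exactly one dimension row.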

If you have too many properties, try grouping some of them into another dimension. For example, Color and Material can be attributes of a separate dimension if the same product, with the same ID and the same price, can come in different colors or materials. Your fact table can then identify a product with two keys: product_id and colormaterial_id...
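The two-key idea can be sketched with an in-memory SQLite database. Everything beyond the product and color/material keys is an assumption made up for this illustration: the table names `product_dim`, `colormaterial_dim` and `sales_fact`, the `quantity` measure, and the sample values (including the color 'blue').

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Product dimension: one row per product, with the remaining properties as columns
CREATE TABLE product_dim (
    product_key INTEGER PRIMARY KEY,   -- surrogate key
    product_id  INTEGER NOT NULL,      -- natural key
    size        TEXT
);
-- Separate dimension for the color/material combination
CREATE TABLE colormaterial_dim (
    colormaterial_key INTEGER PRIMARY KEY,
    color    TEXT,
    material TEXT
);
-- Hypothetical fact table that references both dimensions
CREATE TABLE sales_fact (
    product_key       INTEGER REFERENCES product_dim(product_key),
    colormaterial_key INTEGER REFERENCES colormaterial_dim(colormaterial_key),
    quantity          INTEGER
);
INSERT INTO product_dim VALUES (1, 5, '20'), (2, 6, '20');
INSERT INTO colormaterial_dim VALUES (1, 'red', 'wood'), (2, 'blue', 'wood');
INSERT INTO sales_fact VALUES (1, 1, 3), (1, 2, 4), (2, 1, 2);
""")

# Total quantity per product and color, resolved through both dimension keys
rows = conn.execute("""
    SELECT p.product_id, c.color, SUM(f.quantity)
    FROM sales_fact f
    JOIN product_dim p ON p.product_key = f.product_key
    JOIN colormaterial_dim c ON c.colormaterial_key = f.colormaterial_key
    GROUP BY p.product_id, c.color
    ORDER BY p.product_id, c.color
""").fetchall()

for row in rows:
    print(row)
```

Each fact row joins to exactly one row in each dimension, so the same product sold in different colors does not need duplicate product rows.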

Reading recommendation:
The Data Warehouse Toolkit, Ralph Kimball
