是否可以在数据仓库中创建重复维度?
是否有一种模式可以处理数据仓库中的重复维度?我有经常性的公司主题结构。销售事实可以分配到每个级别。示例
Company A <- sales facts here
Company A subcompany <- sales facts here
Department A1 <- sales facts here
Department A2 <- sales facts here
Company B <- sales facts here
Company C <- sales facts here
Company C department <- sales facts here
在显示公司 AI 的销售事实总和时,希望它是整棵树的销售额总和。
在我的关系数据库中,我有一个父子循环结构。我无法(或不知道如何)在数据仓库中创建这种结构,因为必须定义维度级别。
我想过三级层次结构,但有些公司根本没有部门。
我正在使用 InfiniDB 并尝试配置 Mondrian 和 JPalo
Is there a pattern that can handle recurring dimension in data warehouse? I've got recurring company subjects structure. Sales fact can be assigned at every level. Example
Company A <- sales facts here
Company A subcompany <- sales facts here
Department A1 <- sales facts here
Department A2 <- sales facts here
Company B <- sales facts here
Company C <- sales facts here
Company C department <- sales facts here
While displaying sales fact sum for Company A I want it to be sum of sales of whole tree.
In my relational database I have a parent-child recurring structure. I can't (or don't know how) create this kind of structure in data warehouse, as dimensions levels must be defined.
I thought about 3 levels of hierarchy, but some companies doesn't have departments at all.
I'm using InfiniDB and trying to configure Mondrian and JPalo
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
只需将其反规范化到dimDepartment表中
因此对于整个公司A:
对于公司A的子公司B
对于部门A1,子公司B,公司A
如果公司没有子公司,只需使用'
none' 或 'main'
作为默认子公司名称。Simply de-normalize this into the dimDepartment table
So for whole company A:
for sub-company B of the company A
for department A1, sub-company B, company A
If a company does not have sub-companies, simply use '
none' or 'main'
as a default sub-company name.您在这里的问题实际上与不规则层次结构与固定层次结构的建模有关。这是一个很大的主题,虽然有一些方法可以存储和查询不规则的层次结构,但在许多情况下,您会发现架构或业务模型的一个或其他方面限制您使用固定/命名级别的层次结构 - 因此除非深度确实是任意的(很少是),您最好选择一个合理的值并基于它来实施。例如,在您的数据中,这表明级别本身是已知/定义的,但可能是可选的 - 公司/子公司/部门/子部门等 - 如果您想总结所有人力资源部门的成本公司,如果您总是知道该数据存在于树的特定级别(例如 3),您会发现事情变得容易得多...
M
Your question here really relates to the modelling of ragged hierarchies vs fixed hierarchies. It's a big subject and, while there are methods for storing and querying ragged hierarchies, in many cases you will find that one or other aspect of your architecture or business model constrains you back to have fixed/named-level hierarchies - hence unless the depth is truly arbitrary (it rarely is) you are better picking a sensible value and implementing based on it. In your data for example, it would suggest that the levels themselves are known/defined but may be optional - Company/Sub-Company/Department/Sub-Department etc - If you ever wanted to sum up the costs of the HR departments of all companies you would find it much easier if you always new that that data existed at a specific level (eg 3) of your tree...
M