生产模式
有人可以帮助我理解生产模式和数据仓库模式之间的区别。
Can someone help me in understanding the difference between production schema and datawarehouse schema.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
有人可以帮助我理解生产模式和数据仓库模式之间的区别。
Can someone help me in understanding the difference between production schema and datawarehouse schema.
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
接受
或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
发布评论
评论(3)
不要称其为“生产模式”。称之为“事务模式”或 OLTP。
为了使更新有效,它必须(至少)采用第三范式(3NF)。
数据仓库使用“星型模式”,根据不同的规则进行规范化。由于更新实际上并不像事务数据库中那样发生,因此设计的规范化完全不同。
阅读第三范式。
阅读星型模式设计。
有很多很多的谷歌链接都非常好。
更好的是 Kimball 的《数据仓库工具包》一书。这非常清楚。
Don't call it "production schema". Call it "Transactional Schema", or OLTP.
To make updates work, it must be in (at least) Third Normal Form (3NF).
Data warehouses use "Star Schema" which is normalized according to different rules. Since updates don't really occur the same way they do in a transactional database, the design is normalized completely differently.
Read up on 3rd Normal Form.
Read up on Star Schema Design.
There are lots and lots of Google links that are very good.
Better yet, but Kimball's The Data Warehouse Toolkit book. That's very clear.
在我目前的工作中,我们使用术语“事务”和/或 OLTP(感谢 S.Lott)来表示代表业务数据“当前”状态的数据库/模式。 “分析”和/或 OLAP 数据库/模式保存更多历史数据。通常,我们会拍摄当前数据的“快照”(将新行写入新表,其设计与事务表类似),然后为其添加日期标记和/或为其分配“批”ID。然后我们可以轻松分析该数据随时间的变化情况。
例如:您可以看到如何 &当公司名称发生变更时。在某些情况下,这是有用的信息。在其他情况下,您只需要当前名称。将这两种场景拆分到不同的表/数据库中意味着您不必担心总是在历史表中查找“最新”记录;所有这些信息都已移开。另一个用途是,如果您需要根据过去的数据重新生成报告(如果您想要“此数据当时的情况如何?”视图)。
这不一定是“正确”的定义……但它是一个有用的定义。
At my current gig, we use the term "Transactional" and/or OLTP (thanks S.Lott) for the databases/schemas that represent the "current" state of the business data. The "Analytical" and/or OLAP databases/schemas hold more historical data. Often we'll take a "snapshot" of the current data (writing out a new row to a new table, similar in design to the transactional table) and then date stamp it and/or assign it a "batch" ID. Then we can easily analyze how that data has changed over time.
For instance: you could see how & when the name of a Company has changed. In some circumstances, this is useful information. In others, you only want the current name. Splitting these two scenarios into different tables/databases means that you don't have to worry about always finding the "most recent" record in a historical table; all of that information has been moved out of the way. Another use is if you need to regenerate reports based on past data (if you want a "how did this data look back then?" view).
This isn't necessarily the "correct" definition... but it is a useful one.
生产、运营、事务是针对数据存储和管理进行优化的“标准规范化 (3NF) DB”。数据仓库 (DW) 星型模式针对报告和分析进行了优化。从业务用户的角度来看,数据仓库是只读的。 DW 是非规范化的(一条信息在许多地方重复),因此它们仅通过“加载程序”(ETL)进行更新,无需用户交互。这是星型模式的示例。
Production, operational, transactional is a "standard normalized (3NF) DB" optimized for data storage and management. Data warehouse (DW) star schema is optimized for reporting and analytic. From business user perspective, data warehouse is read-only. DW is de-normalized -- a piece of information is repeated in many places -- so they are updated only via "loading programs" (ETL), no user interaction. Here is an example of the star schema.