日期时间作为仓库 FACT 表中 PK 的一部分

发布于 2024-11-10 11:57:38 字数 490 浏览 3 评论 0原文

我知道通常将 DateTime 列作为 PK 并不是一个好主意,但是对于我的情况,我认为这比在事实表中使用代理键更有意义。

原因是...

  1. 插入到事实表中的数据始终是连续的。即我永远不会插入早于事实表中最后一个值的日期时间值。
  2. 日期时间字段不是PK(复合PK)的唯一列,PK当然是它本身和维度FK的代理键。
  3. 我查询数据的方式几乎总是基于时间。
  4. 事实表上的代理键不会告诉我有关该行的任何信息。每行都已经是唯一的,为了找到特定的事实,我总是首先过滤日期时间和维度中的值。
  5. 没有单独的日期时间维度表。现在或在可预见的将来不需要指定时间点等。

旁注 - 时间将采用 UTC 格式并使用 SQL 2008 R2。

我要问的是给出的情况 - 这样做有什么缺点?我会遇到不可预见的问题吗? 当稍后查询该数据时,这实际上是一件好事吗?

想知道人们对 DateTime 字段作为复合 PK 的第一列的看法。

I know generally its not good idea to have a DateTime column as your PK, however for my situation I believe it makes more sense than having a surrogate key in the Fact Table.

The reasons are...

  1. The data inserted into the fact table is always sequential. i.e. I would never insert date time value that is older than the last value already in Fact table.
  2. The date time field is not the only column of the PK (composite PK), The PK is of course itself and the dimension FK's surrogate key.
  3. The way i be querying the data is nearly always based on time.
  4. A surrogate key on the Fact table would tell me nothing about the row. Each row is already unique and to find that particular fact I would always filter on the Date time first and the values in the dimensions.
  5. There is no separate datetime dimension table. No requirement now or in the foreseeable future to have named points in time etc.

Side notes - time will be in UTC and using SQL 2008 R2.

What I'm asking is are giving the situation - what are the disadvantages to doing this? Will I come up against unforeseen issues?
Is this actaully a good thing to be doing when querying that data back later?

Would like to know peoples view points on a DateTime field as the first column of a composite PK.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

梦忆晨望 2024-11-17 11:57:39

日期/时间是大多数表中键的组成部分,这几乎是任何数据仓库的基本特征。这没有什么“错”。

代理键通常不应该是表的唯一键,所以也许您的问题实际上是“我也应该在我的表上创建一个代理键吗?”。我的建议是,如果您没有理由创建代理键,那就不要创建。创建代理的时间是当您发现需要它时。

It's almost an essential feature of any data warehouse that date/time is a component of a key in most tables. There's nothing "wrong" with that.

A surrogate key generally shouldn't be the only key of a table, so perhaps your question is really "Should I create a surrogate key on my table as well?". My suggestion is that if you don't have a reason to create a surrogate key then don't. The time to create a surrogate is when you find that you need it.

半边脸i 2024-11-17 11:57:39

大多数事实表都有复合键和日期时间,或者通常 DateKey, TimeKey 是其中的一部分。其实,很常见。

dimDatedimTime 只是用来避免查询的 WHERE 子句中出现“有趣的”日期时间函数。例如

-- sales on
-- weekends for previous 28 weeks
-- 
select sum(f.SaleAmount)
from factSale as f
join dimDate  as d on d.DateKey = f.DateKey 
where d.IsWeekend = 'yes'
  and d.WeeksAgo between 1 and 28 ;

,在这里我可以在 IsWeekendWeeksAgo (还有 DateKey)上建立索引。如果将它们替换为日期时间函数,这将导致逐行处理。

Most fact tables have composite keys and date-time or often DateKey, TimeKey are part of it. Actually, quite common.

The dimDate and dimTime are simply used to avoid having "funny" date-time functions in the WHERE clause of a query. For example

-- sales on
-- weekends for previous 28 weeks
-- 
select sum(f.SaleAmount)
from factSale as f
join dimDate  as d on d.DateKey = f.DateKey 
where d.IsWeekend = 'yes'
  and d.WeeksAgo between 1 and 28 ;

So here I can have indexes on IsWeekend and WeeksAgo (DateKey too). If these were replaced by date-time functions, this would cause row-by-row processing.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文