处理数据仓库负载中的主键重复项

发布于 2024-08-31 02:19:51 字数 383 浏览 7 评论 0原文

我目前正在构建一个 ETL 系统来从事务系统加载数据仓库。我的事实表的粒度是事务级别。为了确保不加载重复行,我在事实表上放置了一个主键,即事务 ID。

我遇到了事务被逆转的问题 - 在事务数据库中,这是通过状态完成的,我拾取状态,我可以计算出事务是否正在完成,或者回滚,以便我可以在仓库。但是,反转行将具有相同的事务 ID,因此我遇到了主键冲突。

我现在已经通过否定主键解决了这个问题,因此交易 ID 1 将是付款,交易 ID -1(仅在仓库中)将是冲销。

我考虑过生成BIT列的替代方案,其中0是正常的,1是反转的,然后将PK作为交易ID和BIT列。

我的问题是,这是一个好的做法吗?还有其他人遇到过这样的事情吗?作为参考,这是一个支付处理系统,因此值不会被修改,因此只会有交易和逆转。

I'm currently building an ETL system to load a data warehouse from a transactional system. The grain of my fact table is the transaction level. In order to ensure I don't load duplicate rows I've put a primary key on the fact table, which is the transaction ID.

I've encountered a problem with transactions being reversed - In the transactional database this is done via a status, which I pick up and I can work out if the transaction is being done, or rolled back so I can load a reversal row in the warehouse. However, the reversal row will have the same transaction ID and so I get a primary key violation.

I've solved this for now by negating the primary key, so transaction ID 1 would be a payment, and transaction ID -1 (In the warehouse only) would be the reversal.

I have considered an alternative of generating a BIT column, where 0 is normal and 1 is reversal, then making the PK the transaction ID and the BIT column.

My question is, is this a good practice, and has anyone else encountered anything like this? For reference, this is a payment processing system, so values will not be modified, so there will only ever be transactions and reversals.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

最美的太阳 2024-09-07 02:19:51

在大多数情况下,事实表有一个主键,它是多个 FK 的组合。因此,也许您可​​以使用 TransactionID 和 FK 的组合来 dimTransactionType 作为主键。

dimTransactionType 看起来像这样:

TransactionTypeKey  integer
TransactionTypeName  varchar(20)

并且

0, 'unknown'
1, 'normal'
2, 'reversal'

不建议在 DW 中对位和标志进行修补 - 尽可能详细。

In most cases, a fact table has a primary key which is a composite of several FKs. So, maybe you could use a combination of the TransactionID and a FK to dimTransactionType as the primary key.

The dimTransactionType would look something like:

TransactionTypeKey  integer
TransactionTypeName  varchar(20)

and would have

0, 'unknown'
1, 'normal'
2, 'reversal'

Tinkering with bits and flags is not recommended in a DW -- verbose as much as possible.

川水往事 2024-09-07 02:19:51

设计事实表的常见方法是使用代理键作为主键。大整数值通常就足够了。如果事务 ID 是维度记录的外键,则不应将其用作事实表中的主键。您的代理键逻辑(即事实表中新记录的触发器)可以基于事务 ID 和事务类型的组合。

A common approach to designing fact tables is to use surrogate keys as the primary keys. A large integer value is usually sufficient. If transaction id is a foreign key to a dimension record, then it should not be used as a primary key in the fact table. Your surrogate key logic (i.e. trigger for a new record in the fact table) can be based on a combination of the transaction id and the transaction type.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文