使用 SSIS 维度合并 SCD 组件推断成员的问题
我正在使用 SSIS 维度合并 SCD 组件 (http://dimensionmergescd.codeplex.com/),并且我的配置同时包含 SCD1 和 SCD 2 列。我有一些行设置了 InferredMember 标志,但是组件插入了新行,并且没有重置现有推断行上的当前标志。
还有其他人使用过这个组件吗?您是否看到它正常工作?难道是我理解错了?我的理解是 SCD2 列变成 SCD1,其中 InferredMember 为 true,这是错误的吗?
排序是在数据库中根据业务键完成的,并且排序列设置为匹配。 DMMSCD 组件的输出直接连接到 OLE DB Command/OLE DB Destination 组件。它正在生产中,并且几个月来每天都正常工作。
这是运行的审核输出:
ExistingDimensionInputRowCount = 719941
SpecialMemberInputRowCount = 1
SourceSystemInputRowCount = 720516
UnchangedOutputRowCount = 719941
NewOutputRowCount = 720517
DeletedOutputRowCount = 0
SCD2ExpiredOutputRowCount = 0
SCD2NewOutputRowCount = 0
SCD1UpdatedOutputRowCount = 0
InvalidInputOutputRowCount = 0
I'm using the SSIS Dimension Merge SCD Component (http://dimensionmergescd.codeplex.com/) and have a situation where I have a configuration with both SCD1 and SCD 2 columns. I have rows where the InferredMember flag is set however the component inserted new rows and did not reset the current flag on the existing inferred rows.
Does anyone else use this component and have you seen it work correctly? Am I misunderstanding? My understanding is the SCD2 columns become SCD1 where InferredMember is true, is this wrong?
Sorting is done in the database on the business key and the sort columns are set to match. The output of the DMSCD component are hooked directly to the OLE DB Command/OLE DB Destination components. This is in production and has otherwise been working correctly every day for months.
This is the Audit output from a run:
ExistingDimensionInputRowCount = 719941
SpecialMemberInputRowCount = 1
SourceSystemInputRowCount = 720516
UnchangedOutputRowCount = 719941
NewOutputRowCount = 720517
DeletedOutputRowCount = 0
SCD2ExpiredOutputRowCount = 0
SCD2NewOutputRowCount = 0
SCD1UpdatedOutputRowCount = 0
InvalidInputOutputRowCount = 0
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
您只对推断成员有疑问吗?您使用的是 CodePlex 上发布的最新版本的组件吗?
听起来您的问题不仅与推断的成员有关,而且可能是因为 SCD2 内务管理列(当前成员和日期列)设置不正确而发生的。 DMSCD 未按您预期交付输出的最常见原因是以下原因之一:
您在 DMSCD 之后使用派生列组件更改日期,和/或未更新/插入日期信息DMSCD 供应。相反,您在派生列中使用硬编码或变量值、表定义中的默认值,或者未正确映射目标列。
DMMSCD 输入的排序顺序不正确。您可能假设将 OLE DB 源输出上的 IsSorted 属性标记为 true,并设置列的各种 SortKeyPosition 属性就足够了 - 但事实并非如此。删除您所做的高级编辑,或者在流程中使用排序组件(出于测试目的 - 我们可以稍后修复 OLE DB 源)。
You've only got issues with inferred members? And you're using the most recent version of the component as released on CodePlex?
It sounds like your issue is not solely related to inferred members, but that it may be occurring because SCD2 housekeeping columns (current member and date columns) are not set properly. The most common reasons outputs don't get delivered as you expect from the DMSCD is due to one of the following:
You are altering dates using Derived Column components after the DMSCD, and/or are not updating/inserting the date information the DMSCD supplies. Instead, you're using hardcoded or variable values in a Derived Column, defaults in the table definition, or not mapping destination columns appropriately.
The sort order of the inputs to the DMSCD is incorrect. You may be assuming that marking the IsSorted property on the output of your OLE DB Source to true, and setting various SortKeyPosition properties of columns is sufficient - it is not. Either remove the advanced edits you've made, or use a Sort component in the flow (for testing purposes - we can fix the OLE DB Source later).
所以,回答我自己的问题,是的,我错了。 InferredMember 标志本身不会触发推断成员行为。
推断成员是插入维度表中的骨架记录 - 通常由存储过程插入。 - 当事实表维护期间代理键查找失败时。 InferredMember 标志通常会触发维度加载过程来填充骨架推断成员记录上的剩余字段。对于推断成员记录上的 SCD2 字段,它们将作为 SCD1 进行处理,并且不应生成新记录。
通过实验,我能够确定 DMSCD 组件要求推断成员框架至少包含业务密钥、推断成员标志和过去的活动日期 - 我使用的是当前日期,因此不处理记录作为推断成员记录,它们被视为新记录并生成重复记录。
遗憾的是,我一直无法找到调用与 DMSCD 兼容的推断成员记录规范的文档,以便可以通过为创建推断成员记录而开发的存储过程正确地形成推断成员记录。我还质疑无法使用当前日期的活动日期来标记推断的成员记录。
So, to answer my own question, yes, I am mistaken. The InferredMember flag alone does not trigger Inferred Member behavior.
Inferred members are skeletal records inserted in the dimension tables - often by a stored proc. - when a surrogate key look-up fails during fact table maintenance. The InferredMember flag typically triggers the dimension load process to fill in the remaining fields on the skeletal Inferred Member records. And in the case of SCD2 fields on Inferred Member records they are handled as SCD1 and new records should not be generated.
Through experimentation I was able to determine that the DMSCD component requires the Inferred Member skeleton to include at least the Business Key, Inferred Member flag, and an Active Date that is in the past - I was using the current date so the records were not treated as Inferred Member records, they were being treated as New records and duplicates were being generated.
Regrettably, I have never been able to find documentation that calls out the specification for inferred member records that are compatible with DMSCD so that Inferred Member records can be properly formed by the stored proc that is developed to create them. I'd also question the inability to tag inferred member records with an active date that is the current date.