I/O在Oracle中的PowerCenter Informatica问题

发布于 2025-01-21 04:29:43 字数 2185 浏览 0 评论 0原文

我在Oracle中有两个表,我必须在表之间同步值(字段列)。我正在使用Informatica PowerCenter进行此同步操作。源预选赛查询会导致高I/O使用,我需要解决它。


Table1

Table1有大约20m的数据。 表1中的字段是实际字段。 TIMESTAMP字段保存创建&更新日期,并具有每日分区。

ID字段时间戳
1A2017-05-12 03:13:40
2B2002-11-01 07:30:46
3C2008-03-03-03 03:26:29

Table2

Table2 Table2的数据约为500m。 表2中的字段应尽可能同步在表1中。 TIMESTAMP字段保存创建&更新日期,并具有每日分区。 Table2也是映射中的目标。

IDTable1_id字段时间戳Action
1001A2005-09-30 03:20:41nothing
1011b2015-06-29 09: 41:44更新字段为
1021c2016-01-10 23:35:49更新字段作为
1032a2019-05-08 07:42:46更新字段为b
1042b2003-06-02 11:23:57nothing> nothing
1052<强> C2021-09-21 12:04:24更新字段为B
1063A2022-01-23 01:17:17更新字段为c
1073b2008-04-24 15:17:25更新字段作为C
1083C2010-01-15 07: 20:13

什么

没有

SELECT *
FROM Table1 t1, Table2 t2
WHERE t1.Id = t2.Table1_Id AND t1.Field <> t2.Field

UPDATE Table2
SET
  Field = :tu.Field,
  Timestamp = SYSDATE
WHERE Id = :tu.Id

I have two tables in Oracle and I have to synchronize values (Field column) between the tables. I'm using Informatica PowerCenter for this synchronization operation. The source qualifier query causes high I/O usage and I need to solve it.


Table1

Table1 has about 20M data. Field in Table1 is the actual field. Timestamp field holds create & update date and it has daily partition.

IdFieldTimestamp
1A2017-05-12 03:13:40
2B2002-11-01 07:30:46
3C2008-03-03 03:26:29

Table2

Table2 has about 500M data. Field in Table2 should be as sync as possible to Field in Table1. Timestamp field holds create & update date and it has daily partition. Table2 is also target in the mapping.

IdTable1_IdFieldTimestampAction
1001A2005-09-30 03:20:41Nothing
1011B2015-06-29 09:41:44Update Field as A
1021C2016-01-10 23:35:49Update Field as A
1032A2019-05-08 07:42:46Update Field as B
1042B2003-06-02 11:23:57Nothing
1052C2021-09-21 12:04:24Update Field as B
1063A2022-01-23 01:17:18Update Field as C
1073B2008-04-24 15:17:25Update Field as C
1083C2010-01-15 07:20:13Nothing

Mapping Queries

Source Qualifier Query

SELECT *
FROM Table1 t1, Table2 t2
WHERE t1.Id = t2.Table1_Id AND t1.Field <> t2.Field

Update Transformation Query

UPDATE Table2
SET
  Field = :tu.Field,
  Timestamp = SYSDATE
WHERE Id = :tu.Id

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

莳間冲淡了誓言ζ 2025-01-28 04:29:43

您可以使用以下方法。
SQ-您的SQL是正确的,如果您看到它的工作正常,可以使用它,但添加A&lt;&gt;分区日期键列中的子句。您也可以使用此SQL加快速度。

SELECT *
FROM Table2 t2
INNER JOIN  Table1 t3 ON t3.Id = t2.Table1_Id 
LEFT OUTER JOIN Table1 t1 ON t1.Id = t2.Table1_Id AND t1.Field = t2.Field AND t1.partition_date= t2.partition_date -- You did not mention partition_date column but i am assuming there is a separate column which is used to partition. 
WHERE t1.id is null -- <> is inefficient.

然后在您的INDA目标T2定义中,请确保您提到partition_date作为密钥的一部分以及ID。
然后将更新策略设置为dd_update。您也可以将会话设置为更新。
并删除该目标覆盖。实际上,这将在整个桌子上应用更新查询,有时可能是ABD I/O效率低下。

Informatica具有强大的功能,可以通过更新策略在BUNCH中更新数据。您可以根据自己的性能增加提交间隔。

您不应该尝试使用SQL在单个GO中更新一个500m的表。是的,您可以使用PLSQL在一堆中进行更新。

You can use below approach.
SQ - Your SQL is correct and you can use it if you see its working but add a <> clause on partition date key column. You can use this SQL to speed it up as well.

SELECT *
FROM Table2 t2
INNER JOIN  Table1 t3 ON t3.Id = t2.Table1_Id 
LEFT OUTER JOIN Table1 t1 ON t1.Id = t2.Table1_Id AND t1.Field = t2.Field AND t1.partition_date= t2.partition_date -- You did not mention partition_date column but i am assuming there is a separate column which is used to partition. 
WHERE t1.id is null -- <> is inefficient.

Then in your infa target T2 definition, make sure you mention partition_date as part of key along with ID.
Then use a update strategy set to DD_UPDATE. You can set the session to update as well.
And remove that target override. This actually applies the update query on the whole table and sometime can be inefficient abd I/O intensive.

Informatica is powerful to update data in bunch through update strategy. You can increase commit interval as per your performance.

You shouldn't try to update a 500M table in a single go using SQL. Yes, you can use PLSQL to update in a bunch.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文