如何批量更新oracle pl/sql中的大表以避免用完undospace?

发布于 2024-09-15 16:16:51 字数 234 浏览 4 评论 0原文

我有一张非常大的桌子(5毫米记录)。我试图用随机字母数字来混淆表中每条记录的 VARCHAR2 列。我的过程在较小的数据集上成功执行,但它最终将在我无法控制其设置的远程数据库上使用,因此我想批量执行 UPDATE 语句以避免用完撤消空间。

是否有某种我可以启用的选项,或者是批量更新的标准方法?

我要补充的是,没有被混淆的记录不会有任何显着特征,因此我在循环中使用 rownum 的想法行不通(我认为)。

I have a very large table (5mm records). I'm trying to obfuscate the table's VARCHAR2 columns with random alphanumerics for every record on the table. My procedure executes successfully on smaller datasets, but it will eventually be used on a remote db whose settings I can't control, so I'd like to EXECUTE the UPDATE statement in batches to avoid running out of undospace.

Is there some kind of option I can enable, or a standard way to do the update in chunks?

I'll add that there won't be any distinguishing features of the records that haven't been obfuscated so my one thought of using rownum in a loop won't work (I think).

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

谁把谁当真 2024-09-22 16:16:51

如果要更新表中的每一行,最好执行“创建表作为选择”,然后删除/截断原始表并重新附加新数据。如果您有分区选项,则可以将新表创建为具有单个分区的表,然后只需将其与 EXCHANGE PARTITION 交换即可。

插入需要更少的撤消,并且不带日志记录的直接路径插入(/+APPEND/ 提示)也不会生成太多重做。

无论采用哪种机制,都可能仍然存在旧值的“取证”证据(例如,保留在撤消中或由于行移动而分配给表的“可用”空间中)。

If you are going to update every row in a table, you are better off doing a Create Table As Select, then drop/truncate the original table and re-append with the new data. If you've got the partitioning option, you can create your new table as a table with a single partition and simply swap it with EXCHANGE PARTITION.

Inserts require a LOT less undo and a direct path insert with nologging (/+APPEND/ hint) won't generate much redo either.

With either mechanism, there would probably sill be 'forensic' evidence of the old values (eg preserved in undo or in "available" space allocated to the table due to row movement).

无戏配角 2024-09-22 16:16:51

以下内容未经测试,但应该可以工作:

declare
  l_fetchsize number := 10000;
  cursor cur_getrows is
  select rowid, random_function(my_column)
    from my_table;

  type rowid_tbl_type      is table of urowid;
  type my_column_tbl_type  is table of my_table.my_column%type;

  rowid_tbl     rowid_tbl_type;
  my_column_tbl my_column_tbl_type;
begin

  open cur_getrows;
  loop
    fetch cur_getrows bulk collect  
      into rowid_tbl, my_column_tbl 
      limit l_fetchsize;
    exit when rowid_tbl.count = 0;

    forall i in rowid_tbl.first..rowid_tbl.last
      update my_table 
         set my_column = my_column_tbl(i)
       where rowid     = rowid_tbl(i);
    commit;
  end loop;
  close cur_getrows;
end;
/

这并不是最有效的——单个更新就可以了——但它会使用 ROWID 执行更小的、用户可调的批次。

The following is untested, but should work:

declare
  l_fetchsize number := 10000;
  cursor cur_getrows is
  select rowid, random_function(my_column)
    from my_table;

  type rowid_tbl_type      is table of urowid;
  type my_column_tbl_type  is table of my_table.my_column%type;

  rowid_tbl     rowid_tbl_type;
  my_column_tbl my_column_tbl_type;
begin

  open cur_getrows;
  loop
    fetch cur_getrows bulk collect  
      into rowid_tbl, my_column_tbl 
      limit l_fetchsize;
    exit when rowid_tbl.count = 0;

    forall i in rowid_tbl.first..rowid_tbl.last
      update my_table 
         set my_column = my_column_tbl(i)
       where rowid     = rowid_tbl(i);
    commit;
  end loop;
  close cur_getrows;
end;
/

This isn't optimally efficient -- a single update would be -- but it'll do smaller, user-tunable batches, using ROWID.

<逆流佳人身旁 2024-09-22 16:16:51

为此,我将主键映射到整数 (mod n),然后对每个 x 执行更新,其中 0 <= x <名词

例如,也许你不走运,主键是一个字符串。您可以使用您最喜欢的散列函数对其进行散列,并将其分成三个分区:

UPDATE myTable SET a=doMyUpdate(a) WHERE MOD(ORA_HASH(ID), 3)=0
UPDATE myTable SET a=doMyUpdate(a) WHERE MOD(ORA_HASH(ID), 3)=1
UPDATE myTable SET a=doMyUpdate(a) WHERE MOD(ORA_HASH(ID), 3)=2

您可能有更多分区,并且可能希望将其放入循环中(进行一些提交)。

I do this by mapping the primary key to an integer (mod n), and then perform the update for each x, where 0 <= x < n.

For example, maybe you are unlucky and the primary key is a string. You can hash it with your favorite hash function, and break it into three partitions:

UPDATE myTable SET a=doMyUpdate(a) WHERE MOD(ORA_HASH(ID), 3)=0
UPDATE myTable SET a=doMyUpdate(a) WHERE MOD(ORA_HASH(ID), 3)=1
UPDATE myTable SET a=doMyUpdate(a) WHERE MOD(ORA_HASH(ID), 3)=2

You may have more partitions, and may want to put this into a loop (with some commits).

沦落红尘 2024-09-22 16:16:51

如果我必须更新数百万条记录,我可能会选择不更新。

我更有可能创建一个临时表,然后从旧表插入数据,因为插入不占用空间大量的重做空间并且需要更少的撤消。

CREATE TABLE new_table as select <do the update "here"> from old_table;

index new_table
grant on new table
add constraints on new_table
etc on new_table

drop table old_table
rename new_table to old_table;

您可以使用并行查询来做到这一点,大多数操作上不记录日志会产生非常多的结果
很少重做,根本不需要撤消——只需要更新数据所需时间的一小部分
数据。

If I had to update millions of records I would probably opt to NOT update.

I would more likely create a temp table and then insert data from old table since insert doesnt take up a lot of redo space and takes less undo.

CREATE TABLE new_table as select <do the update "here"> from old_table;

index new_table
grant on new table
add constraints on new_table
etc on new_table

drop table old_table
rename new_table to old_table;

you can do that using parallel query, with nologging on most operations generating very
little redo and no undo at all -- in a fraction of the time it would take to update the
data.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文