如何批量更新oracle pl/sql中的大表以避免用完undospace?
我有一张非常大的桌子(5毫米记录)。我试图用随机字母数字来混淆表中每条记录的 VARCHAR2 列。我的过程在较小的数据集上成功执行,但它最终将在我无法控制其设置的远程数据库上使用,因此我想批量执行 UPDATE 语句以避免用完撤消空间。
是否有某种我可以启用的选项,或者是批量更新的标准方法?
我要补充的是,没有被混淆的记录不会有任何显着特征,因此我在循环中使用 rownum 的想法行不通(我认为)。
I have a very large table (5mm records). I'm trying to obfuscate the table's VARCHAR2 columns with random alphanumerics for every record on the table. My procedure executes successfully on smaller datasets, but it will eventually be used on a remote db whose settings I can't control, so I'd like to EXECUTE the UPDATE statement in batches to avoid running out of undospace.
Is there some kind of option I can enable, or a standard way to do the update in chunks?
I'll add that there won't be any distinguishing features of the records that haven't been obfuscated so my one thought of using rownum in a loop won't work (I think).
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
如果要更新表中的每一行,最好执行“创建表作为选择”,然后删除/截断原始表并重新附加新数据。如果您有分区选项,则可以将新表创建为具有单个分区的表,然后只需将其与 EXCHANGE PARTITION 交换即可。
插入需要更少的撤消,并且不带日志记录的直接路径插入(/+APPEND/ 提示)也不会生成太多重做。
无论采用哪种机制,都可能仍然存在旧值的“取证”证据(例如,保留在撤消中或由于行移动而分配给表的“可用”空间中)。
If you are going to update every row in a table, you are better off doing a Create Table As Select, then drop/truncate the original table and re-append with the new data. If you've got the partitioning option, you can create your new table as a table with a single partition and simply swap it with EXCHANGE PARTITION.
Inserts require a LOT less undo and a direct path insert with nologging (/+APPEND/ hint) won't generate much redo either.
With either mechanism, there would probably sill be 'forensic' evidence of the old values (eg preserved in undo or in "available" space allocated to the table due to row movement).
以下内容未经测试,但应该可以工作:
这并不是最有效的——单个更新就可以了——但它会使用 ROWID 执行更小的、用户可调的批次。
The following is untested, but should work:
This isn't optimally efficient -- a single update would be -- but it'll do smaller, user-tunable batches, using ROWID.
为此,我将主键映射到整数 (mod n),然后对每个 x 执行更新,其中 0 <= x <名词
例如,也许你不走运,主键是一个字符串。您可以使用您最喜欢的散列函数对其进行散列,并将其分成三个分区:
您可能有更多分区,并且可能希望将其放入循环中(进行一些提交)。
I do this by mapping the primary key to an integer (mod n), and then perform the update for each x, where 0 <= x < n.
For example, maybe you are unlucky and the primary key is a string. You can hash it with your favorite hash function, and break it into three partitions:
You may have more partitions, and may want to put this into a loop (with some commits).
如果我必须更新数百万条记录,我可能会选择不更新。
我更有可能创建一个临时表,然后从旧表插入数据,因为插入不占用空间大量的重做空间并且需要更少的撤消。
您可以使用并行查询来做到这一点,大多数操作上不记录日志会产生非常多的结果
很少重做,根本不需要撤消——只需要更新数据所需时间的一小部分
数据。
If I had to update millions of records I would probably opt to NOT update.
I would more likely create a temp table and then insert data from old table since insert doesnt take up a lot of redo space and takes less undo.
you can do that using parallel query, with nologging on most operations generating very
little redo and no undo at all -- in a fraction of the time it would take to update the
data.