Oracle: updating a production database
Scenario:
- I have a huge .csv file (millions of lines).
- Using sqlldr (SQL*Loader), I have to create a temporary table with all the data from the CSV.
- After this I have to do some processing on the temporary table (uppercasing some columns, etc.).
- After processing, I have to take every row from the temporary table, make some additional checks, and insert those rows into another table (one that is heavily used in production).
How do you suggest I do all this processing so that I won't affect the overall performance of the production environment?
(NOTE: I am not supposed to pre-process the .csv beforehand.)
Any suggestion will be highly appreciated!
4 Answers
I know you've said you want to use SQL*Loader, but you might want to look at using an external table, as it might make things easier. You could declare your external table as something like this:
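(A minimal sketch of such a declaration; the directory path, file name, and column layout are hypothetical, so adjust them to your actual CSV.)

    -- Directory object pointing at wherever the file lives (assumed path)
    CREATE OR REPLACE DIRECTORY csv_dir AS '/data/incoming';

    -- External table over the CSV; column list here is hypothetical
    CREATE TABLE csv_ext (
      id    NUMBER,
      name  VARCHAR2(100),
      city  VARCHAR2(100)
    )
    ORGANIZATION EXTERNAL (
      TYPE ORACLE_LOADER
      DEFAULT DIRECTORY csv_dir
      ACCESS PARAMETERS (
        RECORDS DELIMITED BY NEWLINE
        FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
        MISSING FIELD VALUES ARE NULL
      )
      LOCATION ('huge_file.csv')
    )
    REJECT LIMIT UNLIMITED;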
This would allow you to read (but not change) the data in your file using standard SELECT statements. You could then SELECT from the external table and INSERT the data into your 'temp' table directly, doing at least some of the editing during the INSERT:
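For example (using the same hypothetical names as above), the uppercasing can happen in the very statement that populates the work table:

    -- Direct-path insert; UPPER() is applied on the way in,
    -- so no separate UPDATE pass over the work table is needed
    INSERT /*+ APPEND */ INTO temp_load_table (id, name, city)
    SELECT id, UPPER(name), UPPER(city)
    FROM   csv_ext;

    COMMIT;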
Share and enjoy.
Check that your database has enough disk space and isn't too strained on its RAM/CPU.
If that's OK: just do it. A million lines isn't spectacular. Loading the file into a work table doesn't sound like something that would normally affect production performance. You could do the UPPER() in your sqlldr control file, which saves you an update on the work table; maybe there is more post-processing that can be done while loading? (See the control-file sketch at the end of this answer.)
An external table (as suggested in the other answer) works fine as well, but IMHO has no advantage other than saving some disk space, while it does add some extra configuration hassle (create a directory object, grant access, transfer the file to the database server).
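A minimal control-file sketch of that idea, with hypothetical table, file, and column names:

    LOAD DATA
    INFILE 'huge_file.csv'
    APPEND
    INTO TABLE temp_load_table
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    TRAILING NULLCOLS
    (
      id,
      name  "UPPER(:name)",        -- uppercased during the load
      city  "UPPER(TRIM(:city))"   -- further per-column processing can go here too
    )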
You can actually do a fair bit of post-processing in SQL*Loader when you load the file. This could reduce some of the load on the database as updating the temporary table may be slow and create large amounts of redo/undo/whatever.
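Building on the control-file sketch above, a direct-path load also avoids generating undo for the loaded rows; the invocation might look like this (credentials and file names are hypothetical):

    sqlldr userid=loader/secret control=load_staging.ctl log=load_staging.log direct=true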
You could load the data and do your processing in your CERT database. When you've finished processing the temp table, move it to PROD with export/import (Data Pump) or over a database link. Then do your insert into the PROD table. This would let you do your data processing without any impact on PROD. The inserts themselves have to be done in PROD; to minimize their impact, take a look at any triggers/constraints/indexes that can be disabled/dropped during the inserts if you think it'll be a big deal.
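A rough sketch of the database-link variant, with hypothetical link, table, constraint, and index names (leaving an index UNUSABLE during DML assumes it is non-unique and that SKIP_UNUSABLE_INDEXES is in effect, so treat this as illustrative):

    -- Lighten index/constraint maintenance before the bulk insert
    ALTER TABLE prod_table DISABLE CONSTRAINT prod_table_fk1;
    ALTER INDEX prod_table_ix1 UNUSABLE;

    -- Pull the already-processed rows from CERT into PROD over a db link
    INSERT /*+ APPEND */ INTO prod_table (id, name, city)
    SELECT id, name, city
    FROM   staging_table@cert_link;

    COMMIT;

    -- Restore everything afterwards
    ALTER INDEX prod_table_ix1 REBUILD;
    ALTER TABLE prod_table ENABLE CONSTRAINT prod_table_fk1;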