Oracle 更新生产数据库

发布于 2024-10-03 23:51:03 字数 293 浏览 1 评论 0原文

场景:

  • 我有一个巨大的 .csv 文件(百万行)。
  • 使用 sqlldr(SQL 加载程序),我必须创建一个包含 CSV 中所有数据的临时表。
  • 之后我必须对临时表进行一些处理(大写更新一些列等)。
  • 处理后,我必须从临时表中取出每一行,进行一些额外的检查并将这些行插入另一个表中(在生产中大量使用)。

您建议如何进行所有这些处理,这样我就不会影响生产环境的整体性能?

(注意:我不应该事先预处理 .csv)。

任何建议将不胜感激!

Scenario:

  • I have a huge .csv file (million of lines) .
  • With sqlldr (SQL Loader) I have to create a temporary table with all the data in the CSV.
  • After this I have to do some processing on the temporary table (uppercase update some columns, etc.).
  • After processing, I have to take every row from the temporary table, make some additional checks and insert those rows in another table (being heavily used in production) .

How do you suggest to make all this processing, so that I won't affect the overall performance of the production environment ?

(NOTE: I am not supposed to pre-process the .csv before hand).

Any suggestion will be highly appreciated !

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

酒儿 2024-10-10 23:51:03

我知道您说过您想使用 SQL Loader,但您可能想考虑使用外部表,因为它可能会让事情变得更容易。您可以将外部表声明为类似这样的内容,

create table EXTERNAL_HR_DATA (
    EMPNO    NUMBER(4),
    ENAME    VARCHAR2(10),
    JOB      VARCHAR2(9),
    MGR      NUMBER(4),
    HIREDATE DATE,
    SAL      NUMBER(7,2),
    COMM     NUMBER(7,2),
    DEPTNO   NUMBER(2))
    Organization external
        (type oracle_loader
         default directory testdir
         access parameters (records delimited by newline
                            fields terminated by ‘,’)
         location (‘emp_ext.csv’))
    reject limit 1000;

这将允许您使用标准 SELECT 语句读取(但不能更改)文件中的数据。然后,您可以从外部表中进行选择,并将数据直接插入到“临时”表中,在插入期间至少进行一些编辑:

INSERT INTO TEMP_HR_DATA
  SELECT EMPNO,
         UPPER(TRIM(ENAME)),
         UPPER(TRIM(JOB)),
         MGR,
         HIREDATE,
         SAL,
         COMM,
         DEPTNO
    FROM EXTERNAL_HR_DATA;

共享并享受。

I know you've said you want to use SQL Loader, but you might want to look at using an external table as it might make things easier. You could declare your external table as something like

create table EXTERNAL_HR_DATA (
    EMPNO    NUMBER(4),
    ENAME    VARCHAR2(10),
    JOB      VARCHAR2(9),
    MGR      NUMBER(4),
    HIREDATE DATE,
    SAL      NUMBER(7,2),
    COMM     NUMBER(7,2),
    DEPTNO   NUMBER(2))
    Organization external
        (type oracle_loader
         default directory testdir
         access parameters (records delimited by newline
                            fields terminated by ‘,’)
         location (‘emp_ext.csv’))
    reject limit 1000;

This would allow you to read (but not change) the data in your file using standard SELECT statements. You could then SELECT from the external table and INSERT the data into your 'temp' table directly, doing at least some of the editing during the INSERT:

INSERT INTO TEMP_HR_DATA
  SELECT EMPNO,
         UPPER(TRIM(ENAME)),
         UPPER(TRIM(JOB)),
         MGR,
         HIREDATE,
         SAL,
         COMM,
         DEPTNO
    FROM EXTERNAL_HR_DATA;

Share and enjoy.

小梨窩很甜 2024-10-10 23:51:03

检查您的数据库是否有足够的磁盘空间,并且 RAM/CPU 的压力不会太大。

如果可以的话:就这么做吧。一百万行并不引人注目。将文件加载到工作表中听起来不像通常会影响生产性能的事情。您可以在 sqlldr 控制文件中执行 to_upper() (保存工作表上的更新)。也许加载时可以进行更多的后处理?

外部表(如另一个答案中所建议的)也可以正常工作,但恕我直言,除了节省一些磁盘空间之外没有其他优势,虽然它确实给配置带来了一些额外的麻烦(创建目录、授予访问权限、将文件传输到数据库服务器) 。

Check to see if your database has enough diskspace, and isn't too strained on it's RAM/CPU.

If that's OK: just do it. A million of lines isn't spectacular. Loading the file into a work table doesn't sound like something that would normally affect production performance. You could do the to_upper() in your sqlldr control-file (saves you an update on the work table). Maybe there is more post-processing that can be done while loading?

An external table (as suggested in the other answer) works fine as well, but has imho no other advantage than saving some disk space, while it does give some extra hassle to configure (create directory, grant access, transfer file to db server).

月亮是我掰弯的 2024-10-10 23:51:03

加载文件时,您实际上可以在 SQL*Loader 中进行相当多的后处理。这可以减少数据库上的一些负载,因为更新临时表可能很慢并且会创建大量的重做/撤消/其他操作。

You can actually do a fair bit of post-processing in SQL*Loader when you load the file. This could reduce some of the load on the database as updating the temporary table may be slow and create large amounts of redo/undo/whatever.

度的依靠╰つ 2024-10-10 23:51:03

您可以加载数据并在 CERT 数据库中进行处理。当您完成临时表的处理后,通过导出/导入(数据泵)或通过数据库链接将其移动到 PROD。然后插入 PROD 表。这将使您可以在不影响 PROD 的情况下进行数据处理。插入必须在 PROD 中完成,为了最大限度地减少影响,如果您认为这会是一个大问题,您应该查看在插入期间可以禁用/删除的任何触发器/约束/索引。

You could load the data and do your processing into your CERT database. When you've finished processing the temp table move it to PROD w/ export/import (datapump) or over database link. Then do your insert into the PROD table. This would let you do your data processing w/o any impact on PROD. The inserts have to be done in PROD, to minimize impact you should take a look at any triggers/constraints/indexes that can be disabled/dropped during the inserts if you think it'll be a big deal.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文