Oracle 11gR2加载多个文件:sqlldr还是外部表?

发布于 2024-12-23 21:19:41 字数 338 浏览 1 评论 0原文

我有 471 个文件,总计约 100GB。这些文件以“\t”分隔,事务数据采用以下格式:

char(10) not null,
char(8) not null,
char(1) not null,
char(4) not null,
number not null,
char(1) not null,
char(1) not null,
char(1) not null,
number not null

文件中事务的顺序很重要,需要保留,最好带有主键 id。最初,我使用 sqlldr 加载这些文件,但需要很长时间。我最近了解了外部表。从战略角度来看,哪种方法更好?外部表如何工作?谢谢。

I have 471 files totaling about 100GB. The files are "\t" separate, with transaction data in the following format:

char(10) not null,
char(8) not null,
char(1) not null,
char(4) not null,
number not null,
char(1) not null,
char(1) not null,
char(1) not null,
number not null

The order of the transactions in the files is important and needs to be preserved, ideally with a primary key id. Initially, I loaded these files with sqlldr but it takes a very long time. I recently learned about external tables. From a strategic perspective, which method is better? How does the external table work? Thank you.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

九局 2024-12-30 21:19:41

外部表和SQL*Loader的记录解析非常相似,所以通常
在相同的记录格式下,没有重大的性能差异。然而,
外部表可能更适合以下情况:

  • 您希望在将数据加载到数据库时对其进行转换。
  • 您想要加载数据,并且需要对临时表进行额外的索引。
  • 您希望使用透明的并行处理,而不必先拆分外部数据。

但是,在以下情况下,请使用 SQL*Loader 以获得最佳加载性能:

  • 您想要远程加载数据。
  • 不需要对数据进行转换,并且不需要并行加载数据。

为了提高 SQL*Loader 的性能,提出了以下建议。

  • 您的负载上没有任何索引和/或约束(主键)
    加载过程中的表
  • 在命令行中添加以下选项:DIRECT=TRUE。这将
    使用直接路径加载器绕过大部分 RDBMS 处理
    而不是传统的路径加载器。然而,也有一些情况
    当你不能使用直接负载时。这些限制可以从
    Oracle 服务器实用程序指南
  • 使用固定宽度数据而不是分隔数据。对于分隔数据,每个
    需要扫描记录的分隔符
  • 尽量避免字符集转换,因为转换既涉及时间又涉及 CPU
    密集
  • 对于常规路径,请使用 READSIZE 和 BINDSIZE 参数。
    READSIZE 将在每次读取系统调用时获取更大的数据块。这
    BINDSIZE 参数指定绑定数组的大小,而该大小又
    指定每批加载的行数

来源: http://download.oracle.com/otndocs/products/database/enterprise_edition/utilities/pdf/sql_loader_faq.pdf

The record parsing of External Tables and SQL*Loader is very similar, so normally
there is not a major performance difference in the same record format. However,
External Tables may be more appropriate in the following situations:

  • You want to transform the data as it is being loaded into the database.
  • You want to load data, and additional indexing of the staging table is required.
  • You want to use transparent parallel processing without having to split the external data first.

However, in the following situations, use SQL*Loader for the best load performance:

  • You want to load data remotely.
  • Transformations are not required on the data, and the data does not need to be loaded in parallel.

To improve the performance of SQL*Loader the following suggestions have been made.

  • Do not have any indexes and/or constraints (primary keys) on your load
    tables during the load process
  • Add the following option in the command line: DIRECT=TRUE. This will
    bypass most of the RDBMS processing by using the direct path loader
    instead of the conventional path loader. However, there are some cases
    when you can’t use direct load. These restrictions can be obtained from the
    Oracle Server Utilities Guide
  • Use fixed width data rather than delimited data. For delimited data, each
    record needs to be scanned for the delimiter
  • Try to avoid character set conversions as conversions are both time and cpu
    intensive
  • For conventional path, use the READSIZE and BINDSIZE parameters.
    READSIZE will grab larger chunks of data per read system call. The
    BINDSIZE parameter specifies the size of the bind array, which in turn
    specifies the number of rows which will be loaded per batch

Source: http://download.oracle.com/otndocs/products/database/enterprise_edition/utilities/pdf/sql_loader_faq.pdf

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文