Oracle 问题:改进常规路径负载、direct=false、提交次数和bindsize/readsize 时出现问题
我目前正在做一个学术基准测试,TPC-H,并且我有一些大表,我想使用直接路径(效果很好并且非常快)和传统路径加载。
为此,我使用了这个批处理脚本:
for /F "tokens=1,2" %%A in (table4_records.txt) do (
sqlldr userid='tpch/tpch' control=%%A.ctl rows=%%B bindsize=? readsize=? SILENT=HEADER log=bulkload_logs\sf4\bulk_%%A%1.log
)
问题是,无论我为 bindsize 和/或 readsize 选项赋予什么值,它总是提交 65534 到 65534 行。我已经传递了 %%B 大小,它是每个表的确切行数。
在直接加载中,我只是使用行和提交,就像加载孔表后真正完成的那样。
我想做类似的事情,但是使用传统的加载路径 - 我知道这并不更快,但这就是重点。
您能否告诉我如何提供正确的参数,以便我可以: 1- 一次加载尽可能多的数据;
2- 降低提交频率,最好是在表加载结束时提交。
以下是表的名称和行数:
lineitem 23996604 ->是最大的并且有大约。磁盘空间 3GB
订单600万
零件供应 3200000
第 800000 部分
客户60万
供应商 40000
国家25
区域5
I'm currently doing an academic benchmark, TPC-H, and I have some big tables that I want to load using direct path (which workde great and was very fast) and conventional path.
For that, I used this batch script:
for /F "tokens=1,2" %%A in (table4_records.txt) do (
sqlldr userid='tpch/tpch' control=%%A.ctl rows=%%B bindsize=? readsize=? SILENT=HEADER log=bulkload_logs\sf4\bulk_%%A%1.log
)
The problem is that, no matter what values I give to the bindsize and/or readsize options, it always commit from 65534 to 65534 rows. I already pass the %%B size which is the exactly number of rows per table.
In direct load, I just used the rows and the commit as REALLY done after the hole table was loaded.
I want to do something like that, but with conventional load path - I know that is not faster, but that's the point.
Could you please tell me how can I give the correct parameters so I can:
1- load as much data, at a time, as I can;
2- commit less frequently, preferably at the end of the table's load.
Here are the tables' names and number of rows:
lineitem 23996604 -> is the biggest and has aprox. 3GB on disk
orders 6000000
partsupp 3200000
part 800000
customer 600000
supplier 40000
nation 25
region 5
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
您不会获得 3GB 文件来使用传统路径加载并在最后提交。
来自 Oracle 文档:
“当 SQL*Loader 向 Oracle 数据库发送 INSERT 命令时,会一次性插入整个数组。插入绑定数组中的行后,会发出 COMMIT 语句。”
“在传统的路径方法中,绑定数组受到读缓冲区大小的限制。因此,较大的读缓冲区的优点是在需要提交操作之前可以读取更多数据。”
最大化 READSIZE 和 BINDSIZE,直到它告诉您已达到平台的最大值。
You won't get a 3GB file to use a conventional path load and commit at the end.
From the Oracle docs:
"When SQL*Loader sends the Oracle database an INSERT command, the entire array is inserted at one time. After the rows in the bind array are inserted, a COMMIT statement is issued."
"In the conventional path method, the bind array is limited by the size of the read buffer. Therefore, the advantage of a larger read buffer is that more data can be read before a commit operation is required."
Maximise the READSIZE and BINDSIZE until it tells you that you've hit the maximum for your platform.
糟糕:
事实证明,常规路径中的 ROWS 最大数量正好是 65534,因此我可以继续增加绑定数组的大小!哈哈
抱歉,我刚刚在 Oracle 文档中找到了它
Ooops:
It turns out ROWS maximum number, in Conventional Path, is 65534 exactly, so I could keep increasing my bindarray size! LOL
Sorry, I've just found it in Oracle's documentation
Burleson 的网站有一种可行的方法。
Burleson's site has a possible way to do this.