Oracle external table: optimizing select queries
I have to perform many selects from an Oracle external table.
I have 10 cursors that look a lot like this (ext_temp is the external table)
CURSOR F_CURSOR (day IN VARCHAR2, code NUMBER, orig NUMBER)
IS
  SELECT NVL(SUM(table_4.f), 0)
    FROM ext_temp table_4
   WHERE
     -- couple of conditions here, irrelevant for the question at hand
     AND TO_CHAR(table_4.day, 'YYYYMMDD') = day
     AND table_4.CODE = code
     AND table_4.ORIG = orig;
And the external table has about 22,659 records.
My script's main loop looks like this:
for each rec in some_query:  -- 22,659 records
    open F_cursor(rec.day, rec.code, rec.orig);
    -- open 9 more cursors
    fetch F_cursor into some_var;
    -- fetch 9 more cursors, with the same structure
The queries are taking way too long, and I know from here that I can't have any indexes or DML on an external table.
So, is there a way to get it to run faster? I can rewrite my PL/SQL script, but I don't think I have enough time left.
Update: I missed an important detail.
I'm not the owner or the DBA of the database. That guy doesn't want any extra data (it's about 3 GB) in his database, and external tables are all we could get out of him. He doesn't allow us to create temporary tables. I don't intend to question his reasons, but external tables are not the solution for this. So, we are stuck with them.
Answers (4)
Make them Oracle tables. External tables are there to replace SQL*Loader, not to be worked with on a daily basis. Just run an importing script whenever your underlying file changes, which loads the contents of the external table into an Oracle table. Here's what your namesake thinks of it (stolen from here):
Update:
You won't ever get decent performance with a 3 GB external table, since Oracle will have to do a 3 GB full scan for each query, and it will be a first-class, disk-reading, spindle-moving full scan, not the cheap cached imitation which you can see in the plan but barely notice in the actual execution time.
Try to convince the guy to create a temporary table for you, which you could use to work with the data, loading it from the external table whenever your session begins.
This is not the best solution, since it needs to keep a separate copy of the table for each session in the temporary tablespace, but it is much better performance-wise.
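For what it's worth, a minimal sketch of that temporary-table variant, assuming you were ever allowed to create one; the table name, column list, and index below are guesses based on the cursor in the question, not anything spelled out in this answer.

-- One-time setup, done by someone with the required privileges.
-- Column names and types are assumed from the cursor above.
CREATE GLOBAL TEMPORARY TABLE ext_temp_gtt (
  day   DATE,
  code  NUMBER,
  orig  NUMBER,
  f     NUMBER
) ON COMMIT PRESERVE ROWS;

CREATE INDEX ext_temp_gtt_ix ON ext_temp_gtt (day, code, orig);

-- At the start of every session: one full read of the external table ...
INSERT INTO ext_temp_gtt (day, code, orig, f)
SELECT day, code, orig, f
  FROM ext_temp;

-- ... after which the ten cursors select from ext_temp_gtt instead of ext_temp.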
It's really tough if you have to work around restrictions that don't make sense but that you can't change ...
You would be better off reading through the external table once and then building the required data in an index-like data structure in your code (basically an array with one element for each record you are looking for).
So your cursor would simply read the whole external table in a single pass, and your record-loop would turn into a cursor-loop (see the sketch below).
As a result, instead of a loop through the external table for each record, you just need one loop for all records.
This can be extended for the ten cursors you mentioned.
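A rough sketch of what that could look like, since the answer's original snippets are not shown here: an associative array keyed on day/code/orig, filled from the simplified cursor in the question. The key format, some_query, and the omitted WHERE conditions are assumptions standing in for whatever the real script uses.

DECLARE
  -- One running total per (day, code, orig) combination, keyed by a string.
  TYPE sum_tab IS TABLE OF NUMBER INDEX BY VARCHAR2(100);
  f_sums sum_tab;
  l_key  VARCHAR2(100);

  CURSOR ext_cur IS
    SELECT TO_CHAR(t.day, 'YYYYMMDD') AS day_key, t.code, t.orig, t.f
      FROM ext_temp t;
      -- plus the other conditions shared by the original cursors
BEGIN
  -- Single pass over the external table: aggregate f per key in memory.
  FOR r IN ext_cur LOOP
    l_key := r.day_key || '|' || r.code || '|' || r.orig;
    IF NOT f_sums.EXISTS(l_key) THEN
      f_sums(l_key) := 0;
    END IF;
    f_sums(l_key) := f_sums(l_key) + NVL(r.f, 0);
  END LOOP;

  -- The old record-loop now does a cheap in-memory lookup per record
  -- instead of opening ten cursors against the external table.
  FOR rec IN some_query LOOP  -- the existing driving query/cursor
    l_key := rec.day || '|' || rec.code || '|' || rec.orig;
    IF f_sums.EXISTS(l_key) THEN
      -- use f_sums(l_key) wherever the old FETCH result was used
      NULL;
    END IF;
  END LOOP;
END;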
You could write your external table data to a temporary, indexed (if you want) table, and then perform your multiple queries against it.
Then do all your queries directly against your_temp_table (see the sketch below).
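For illustration only, assuming the necessary CREATE TABLE privileges (which, per the update to the question, you don't currently have), the idea might look like this; the table and index names are placeholders.

CREATE TABLE your_temp_table AS
SELECT *
  FROM ext_temp;

CREATE INDEX your_temp_table_ix
  ON your_temp_table (day, code, orig);

-- Then point the ten cursors at your_temp_table instead of ext_temp.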
While totally agreeing with Quassnoi's suggestion that external tables do not appear to be the proper solution here, as well as DCookie's analogy that you're being bound, tossed overboard, and asked to swim, there may at least be a way to structure your program so that the external table is only read once. My belief from your description is that all 10 cursors are reading from the external table, meaning that you are forcing Oracle to scan the external table 10 times on every iteration of your loop.
Assuming this inference is correct, the simplest answer is likely to make the external table the driving cursor, similar to what IronGoofy suggested. Depending on what some_query is doing, and assuming that it is no coincidence that it returns the same number of rows as the external table, the simplest option would be to drive the loop from the external table itself (you could, of course, join some_query into that driving query as well); a sketch is given below. It should be even more efficient to take things a step further and move the logic out of the ten cursors (and the IF statements) and into the driving cursor.
If, even after doing this, you still find that you are doing some row-by-row processing, you could go one more step and do a BULK COLLECT from the driving cursor into a locally declared collection and operate on that collection. You almost certainly don't want to fetch 3 GB of data into a local collection (although crushing the PGA might lead the DBA to conclude that temporary tables aren't such a bad thing after all, it's not something I would advise); fetching a few hundred rows at a time with the LIMIT clause should make things a bit more efficient.
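A hedged sketch of the driving-cursor plus BULK COLLECT ... LIMIT combination: the grouping columns and the single NVL(SUM(f), 0) aggregate are assumptions carried over from the cursor in the question, and the remaining conditions of the original ten cursors would become additional aggregates and WHERE predicates.

DECLARE
  CURSOR driving_cur IS
    SELECT TO_CHAR(t.day, 'YYYYMMDD') AS day_key,
           t.code,
           t.orig,
           NVL(SUM(t.f), 0) AS f_total
           -- one aggregate per original cursor would be added here
      FROM ext_temp t
     -- WHERE ... conditions shared by the original cursors
     GROUP BY TO_CHAR(t.day, 'YYYYMMDD'), t.code, t.orig;

  TYPE driving_tab IS TABLE OF driving_cur%ROWTYPE;
  l_rows driving_tab;
BEGIN
  OPEN driving_cur;
  LOOP
    -- One scan of the external table, fetched a few hundred rows at a time.
    FETCH driving_cur BULK COLLECT INTO l_rows LIMIT 500;
    EXIT WHEN l_rows.COUNT = 0;

    FOR i IN 1 .. l_rows.COUNT LOOP
      -- Whatever row-by-row work remains can read l_rows(i).f_total
      -- (and the other aggregates) directly, with no extra cursors.
      NULL;
    END LOOP;
  END LOOP;
  CLOSE driving_cur;
END;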