读取 CLOB 列很慢
希望有人能够阐明我目前在 Oracle DB 方面遇到的问题 - 我确信这很简单!
我已经设法在示例中重新创建它,所以这里是数据库结构:
CREATE TABLE MyTable(
ixMyTable NUMBER,
clobData CLOB
)
/
CREATE OR REPLACE PACKAGE PKGTEST
AS
PROCEDURE DoSomething(
cur_OUT OUT SYS_REFCURSOR
);
END PKGTEST;
/
CREATE OR REPLACE PACKAGE BODY PKGTEST
AS
PROCEDURE DoSomething(
cur_OUT OUT SYS_REFCURSOR
)
AS
BEGIN
OPEN cur_OUT FOR
SELECT ixMyTable, clobData
FROM MyTable;
END;
END PKGTEST;
/
GRANT EXECUTE ON PKGTEST TO TEST_ROLE
/
BEGIN
FOR i IN 1 .. 7000 LOOP
insert into mytable values (i, TO_CLOB('123456'));
END LOOP;
END;
/
额外信息:
架构所有者是 TEST_SCHEMA
用户是 CARL
CARL 具有 TEST_ROLE 角色
鉴于上述数据库设置,我有一个使用标准的 C# 测试应用程序System.Data.OracleClient.OracleCommand 等,以便执行 PKGTEST.DoSomething 并将结果放入数据网格 (DevExpress)。
很确定网格在这里无关紧要,因为我们通过使用开源 OTL 的 C++ 遇到了同样的问题(幸运的是,不是我的部门)。
好的,问题来了……
从开始到网格填充的时间约为 35-40 秒,哎呀。
但是,如果我执行以下操作:
GRANT SELECT ON MyTable TO TEST_ROLE
/
然后再次执行查询,则需要大约 5-6 秒。
在我看来,这与特权等有关,但我不太确定为什么它仍然实际上是双向的?
只是为了将其他东西扔进锅里,如果我将过程更改为
SELECT ixMyTable, TO_CLOB(TO_NCLOB(clobData))
FROM MyTable;
那么时间约为 5-6 秒,无论有或没有额外的 SELECT 权限。
任何指示或直接的解决方案将不胜感激!
编辑:
操作系统是 Vista x86 Business
Oracle 服务器是 10.2.0.1
Oracle 客户端是 10.2.0.3
编辑:
按照建议,我尝试从 MS OracleClient 更改为 ODP.NET,这确实根据需要加快了速度。
不幸的是,受影响的 C# 应用程序只是一个内部应用程序,用于查看表/运行 SPROCS 等。
我们的主要交付成果是使用 OTL 的 C++ 应用程序 (http://otl.sourceforge.net/otl3_intro.htm) 用于数据库访问。 这并不是真正可以改变的东西,所以我真的很想了解差异的原因,而不必随意地抛出无缘无故的 GRANT SELECT。
如果缺乏 SELECT 权限导致完全失败,那么我可能会忍受这种情况,但缺乏 SELECT 似乎会导致访问 CLOB 数据的某些较慢的路径。
我已经标记了 3 个答案 - 感谢这些 - 但我真的可以这样做有一个理由,所以我会为此添加赏金。
PS 我们一开始确实想在 C++ 中使用 OCCI,但由于 Oracle 始终支持当前版本的 IDE,因此我们无法让它与 Visual Studio 2008 很好地配合。
Hopefully someone can shed a little light on an issue that I'm currently having with an Oracle DB - I'm sure it's something simple!!
I've managed to recreate this in a sample, so here is the DB structure:
CREATE TABLE MyTable(
ixMyTable NUMBER,
clobData CLOB
)
/
CREATE OR REPLACE PACKAGE PKGTEST
AS
PROCEDURE DoSomething(
cur_OUT OUT SYS_REFCURSOR
);
END PKGTEST;
/
CREATE OR REPLACE PACKAGE BODY PKGTEST
AS
PROCEDURE DoSomething(
cur_OUT OUT SYS_REFCURSOR
)
AS
BEGIN
OPEN cur_OUT FOR
SELECT ixMyTable, clobData
FROM MyTable;
END;
END PKGTEST;
/
GRANT EXECUTE ON PKGTEST TO TEST_ROLE
/
BEGIN
FOR i IN 1 .. 7000 LOOP
insert into mytable values (i, TO_CLOB('123456'));
END LOOP;
END;
/
Extra info:
Schema owner is TEST_SCHEMA
User is CARL
CARL has the TEST_ROLE role
Given the database setup as above, I have a C# test app that uses the standard System.Data.OracleClient.OracleCommand etc. in order to execute PKGTEST.DoSomething and throw the results into a datagrid (DevExpress).
Pretty sure that the grid is irrelevant here, as we experience the same issue through c++ using the open source OTL (not my department, fortunately).
OK, to the problem....
Time from beginning until the grid is populated is ~35-40s, ouch.
However, if I do the following:
GRANT SELECT ON MyTable TO TEST_ROLE
/
and then perform the query again, it takes ~5-6s.
It seems to me that this has something to do with privileges etc., but I'm not quite sure why it still actually works both ways??
Just to throw something else into the pot, if I change the procedure to
SELECT ixMyTable, TO_CLOB(TO_NCLOB(clobData))
FROM MyTable;
Then the time is ~5-6s, with or without the extra SELECT privilege.
Any pointers or straight out solutions would be much appreciated!
Edit:
OS is Vista x86 Business
Oracle Server is 10.2.0.1
Oracle Client is 10.2.0.3
Edit:
As suggested, I've tried changing from the MS OracleClient to the ODP.NET and this does speed up as required.
Unfortunately, the C# app that was being affected was just an internal app that is used for viewing tables / running SPROCS etc.
Our main deliverable is the C++ app that uses OTL (http://otl.sourceforge.net/otl3_intro.htm) for database access. This is not really something that can be changed at this time, and so I would really like to understand the reasons for the difference, without having to throw gratuitous GRANT SELECTs around willy-nilly.
If lack of SELECT privilege caused a complete failure then I could probably live with this, but lack of SELECT appears to be causing some slower route for accessing the CLOB data.
I've marked up the 3 answers - thanks for those - but I could really do with a reason, so I'll add a bounty to this.
P.S. We really wanted to go with OCCI at the outset for our C++, but as Oracle are always supporting a version of the IDE before current we couldn't get it to play nicely with our Visual Studio 2008.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(7)
您确定每次都从磁盘读取 blob,而不是从磁盘缓存中读取第二次及后续的 blob 吗?
我在性能测试中看到过这个问题,特别是在 Oracle 上,第一次运行测试很糟糕。 然后,通过一个微小的(看似微不足道的变化),性能突然显着提高。 但实际上发生的情况是,您正在查询的数据已加载到缓存中,并且可以以 10 倍或 20 倍的速率(内存与磁盘)进行访问。
执行此测试的正确方法是在查询运行之间弹跳数据库。 如果 DBA 不允许您为该测试退回测试生产服务器,请在您的计算机上加载 Oracle XE 的副本。
编辑:或者更好:每次删除并重新创建表。 您可能正在这样做,但没有提及。
Are you sure that you are reading the blob from disk each time, and not reading if from disk cache the second and following?
I've seen this problem with performance testing, particularly on Oracle, where the first run of a test is terrible. Then with one minor (and seeming insignificant change) the performance suddenly improves significantly. But really what has happened is that the data you are querying has been loaded into the cache and can be access at 10x or 20x the rate (memory vs. disk).
The proper way to do this test is to bounce the database between query runs. Load a copy of Oracle XE on your machine if the DBA won't let you bounce the test-production server for this test.
Edit: Or better: drop and re-create the table every time. You may be doing this, but didn't mention it.
我会尝试 odp.net http://www.oracle.com/ technology/tech/windows/odpnet/index.html 而不是 System.Data.OracleClient。
I would try odp.net http://www.oracle.com/technology/tech/windows/odpnet/index.html instead of System.Data.OracleClient.
根据上述建议,您可以尝试不同的 ODBC 驱动程序或客户端软件。 事实上,
TO_CLOB(TO_NCLOB())
运行速度如此之快,而直线似乎并不表明问题就在那里。首先,我将获取两个查询并通过 SQLDeveloper 运行它们并获得解释计划。 这应该为您在 Oracle 端提供执行基准。 添加转换不应对执行路径产生任何影响。 交替运行查询并计时以查看它们的速度。 如果没有区别的话,我建议是你的客户端软件的问题。
如果我的假设是正确的,这也可以解释
GRANT SELECT
更改。 客户端软件正在对 CLOB 进行某种昂贵的转换。 授予和/或显式转换允许客户端避免这种情况。 我不知道为什么。Following up on the suggestion above, can you try a different ODBC driver, or client software. The fact that the
TO_CLOB(TO_NCLOB())
works so faster, where the straight one doesn't seems to indicate that that is where the problem exists.First I would take the two queries and run them through SQLDeveloper and get an explain plan. This should give you a baseline for execution on the Oracle side. Adding the conversions should make no difference on the execution path. Run the queries alternately and time both to see how fast they are. If there is no difference, I'd suggest that the client software is your problem.
If my assumption is correct, this would also explain the
GRANT SELECT
change. The client software is doing some sort of expensive conversion on the CLOB. The grant and/or explicit conversion allows the client to avoid this. I don't know why.我知道这是一个非常老的问题,
这是我根据 Oracle OCCI 库的经验所说的,但我相信这适用于任何 Oracle 客户端。 blob/clob 读取操作缓慢的原因如下:
set arraysize 1
& 进行实验。 对于带有 & 的表,设置 arraysize 5000
没有 blob/clob 列。 对于包含大量记录且没有 blob/clob 列的表,arraysize
将产生很大的差异。 对于具有 blob/clob 列的表,这不会有太大区别。 因此读取每条记录都会与服务器进行一次往返。 在速度较慢的网络(例如 WAN)上,这会产生很大的影响请参见此处。 https://docs.oracle.com/cd/ E18283_01/appdev.112/e10646/oci07lob.htm#CHDDHFAB
您可以通过启用 LOB 行预取来缓解此问题。 https: //docs.oracle.com/en/database/oracle/oracle-database/12.2/jajdb/index.html?oracle/jdbc/OracleStatement.html
我从来没有让这个适用于 OCCI,但实现了一个解决方法这里提到(有没有办法在奥奇?)
I know it is a really old question
I speak from my experience with Oracle OCCI library, but I believe this will apply to any Oracle client. The reasons blob/clob read operations are slow are the following
set arraysize 1
&set arraysize 5000
for tables with & without blob/clob columns. For a table with large number of records and no blob/clob columns,arraysize
will make a big difference. For a table with blob/clob columns, it won't make much difference. So reading each record will make a round trip to the server. On slower networks (e.g. WAN) this has a big impactSee here. https://docs.oracle.com/cd/E18283_01/appdev.112/e10646/oci07lob.htm#CHDDHFAB
You can mitigate this by enabling row prefetching for LOB. https://docs.oracle.com/en/database/oracle/oracle-database/12.2/jajdb/index.html?oracle/jdbc/OracleStatement.html
I never got this to work for OCCI, but implemented a workaround mentioned here (Is there a way to prefetch LOBs data in occi?)
老实说,我认为这里的问题是 OTL 驱动程序和 OracleClient 在处理来自 SYS_REFCURSOR 的 CLOB 时遇到问题。 SYS_REFCURSOR 很弱,这意味着游标本身可以返回任何类型的数据,这意味着驱动程序需要不断返回数据库并查询游标的元数据(即获取游标中的数据类型),然后加载 CLOB 流,然后加载数据。 显然,如果执行查询的用户有权访问正在查询的表,则驱动程序可以更好地检索元数据并返回正确的数据。
更改存储过程以返回强 ref_cursor 是否有效?
您可以更改存储过程来执行
TO_CLOB()
,因为这似乎也有效吗?I honestly think the problem here is that the OTL driver and the OracleClient are having problems dealing with the CLOB coming from the SYS_REFCURSOR. SYS_REFCURSORs are weak, meaning the cursor itself can return any type of data, meaning the drivers need to continually go back to database and query the metadata for the cursor (ie. get the types of data in the cursor), then load the CLOB stream, then load the data. Obviously if the user doing the query has access to the tables being queried, the driver can do a better job of retrieving the metadata and returning the correct data.
Does changing the Stored Procedure to return a strong ref_cursor work?
Can you change the Stored Procedure to do the
TO_CLOB()
, since that also seems to work?您的数据库的 NLS_CHARACTERSET 和 NLS_NCHAR_CHARACTERSET 是什么。 运行
即可得到结果。 您的客户端软件的 NLS_LANG 设置是什么?
这可能会让您对问题有更多的了解,并有助于回答 TO_CLOB(TO_NCLOB()) 调用的成本有多高的问题。
What is the NLS_CHARACTERSET and NLS_NCHAR_CHARACTERSET for your database. Run
to get the results. And what is the NLS_LANG setting for your client software?
This may give some more insight to the problem, and will help answer the question of how expensive the TO_CLOB(TO_NCLOB()) call is.
这些不一定都是直接相关的,但您可能应该检查每一个,以防它们相关。 我怀疑这与缓存相关:完成第一个查询后,您将应用选择,然后速度会很快。 在中间,如果您想正确地对其进行性能测试,则必须反弹服务器以摆脱缓存。 如果您进行此测试并且它突然表现得更好,那么请尝试将表固定到缓存中。 请参阅下面有关内联 clob 存储的内容,因为这可能是相关的。
大约一年前,我在 Oracle 10g 中遇到了有关 clob 性能的问题。 一旦我们得到了出色的 dba 的帮助,我们就解决了其中的大部分问题。 大约花了两个月的时间才让性能达到足够的速度。
您使用的是哪个版本的 Oracle?
在 Oracle 10g(早期版本)中,clob 性能存在大量问题。 事实上,在某些情况下,仅使用两个表和一个 varchar 列实际上更快(将 varchar 连接在一起,就得到了 clob)。 我们升级到了更高的版本,它好多了
另外您的数据存储在哪里?
还有一个选项可以将 clob 存储在表本身中。 根据您的数据有多大,您可能会发现这有助于提高性能。
如果您将数据存储在 SAN 上,那么还值得查看 SAN 上的缓存大小以及块大小。 当缓存大小不正确时,Oracle + SAN 可能会有点有趣。
另一个解决方法:
如果您发现持久性很慢,甚至访问很慢,并且不受 CPU 限制,请压缩数据并将其存储在 blob 中。 我们在这里也看到了巨大的性能优势。
如果您在处理 clob 的任何过程中看到性能问题(与内存相关?),我们发现我们会将对象重新创建为新字符串。 即使数据较小,驱动程序也会预先创建 32K 大小的字符串。
我确实想知道系统表是否可能存在碎片? 有很多表/模式吗? 那么同义词呢?
另外,当您存储 Clob 时,它们不会被存储到 Oracle 中的一个大文件中吗? 如果我没记错的话,你必须小心碎片; 存储不会被释放以供重用。
也许您可以在数据库前面放置一个 .NET Web 服务? 如果您无法解决性能问题,这可能是一种选择。
These aren't all necessarily directly linked, but you probably ought to check each one out in case it is related. I suspect that this is caching related: once you've done your first query you then apply select and then it goes fast. In the middle you have to bounce the server to get rid of the caches if you want to performance test it correctly. If you do this test and it suddenly performs better, then try pinning the table into the cache. See below around clob storage inline, since that will probably be related.
I had issues around clob performance in Oracle 10g a year or so back. We got around most of them once we got our awesome dba to help. It took about 2 months to get the performance up to an adequate speed.
Which version of Oracle are you using?
In Oracle 10g (early versions) there were massive issues with clob performance. In fact it was actually faster in some cases to just use two tables, and a varchar column (concatenate the varchars together and you have your clob). We upgraded to a later version, and it was much better
Also where is your data being stored?
There's also an option to store the clob in the table itself. Depending how big your data is you might find this helps performance.
If you've got your data stored on a SAN then it's worth also looking at the cache size on the SAN and also the block sizes. Oracle + SAN can be a bit funny when the caching sizes are incorrect.
Another workaround:
If you're finding persistence is slow or even access is slow and you're not CPU bound, zip the data and store it in a blob. We saw a big performance benefit here too.
If you're seeing performance issues (memory related?) in whatever is processing the clobs, we found that we would recreate the objects as new strings. The drivers were pre-creating 32K sized Strings even if the data was smaller.
I did wonder whether the system tables might be fragmented? Are there lots of tables/schemas? What about synonyms?
Also, when you store clobs, don't they get stored into one massive file in Oracle? If I remember correctly you have to be careful about fragmentation; the storage doesn't get released for reuse.
Perhaps you could put a .NET web service in front of your db? That might be one option if you can't solve the performance issues.