Description
I have an Oracle stored procedure that has been running for 7 or so years both locally on development instances and on multiple client test and production instances running Oracle 8, then 9, then 10, and recently 11. It has worked consistently until the upgrade to Oracle 11g. Basically, the procedure opens a reference cursor, updates a table then completes. In 10g the cursor will contain the expected results but in 11g the cursor will be empty. No DML or DDL changed after the upgrade to 11g. This behavior is consistent on every 10g or 11g instance I've tried (10.2.0.3, 10.2.0.4, 11.1.0.7, 11.2.0.1 - all running on Windows).
The specific code is much more complicated but to explain the issue in somewhat realistic overview: I have some data in a header table and a bunch of child tables that will be output to PDF. The header table has a boolean (NUMBER(1) where 0 is false and 1 is true) column indicating whether that data has been processed yet.
In the real code the cursor selects from a view that is limited to rows that have not been processed yet (the view also joins some other tables, makes some inline queries and function calls, etc.). So at the time the cursor is opened, the view shows one or more rows. After the cursor is opened, an update statement runs to flip the flag in the header table, a commit is issued, and the procedure completes.
On 10g, the cursor opens, it contains the row, then the update statement flips the flag and running the procedure a second time would yield no data.
On 11g, the cursor never contains the row. It's as if the cursor does not open until after the update statement runs.
I'm concerned that something may have changed in 11g (hopefully a setting that can be configured) that might affect other procedures and other applications. What I'd like to know is whether anyone knows why the behavior is different between the two database versions and whether the issue can be resolved without code changes.
Update 1: I managed to track the issue down to a unique constraint. It seems that when the unique constraint is present in 11g the issue is reproducible 100% of the time regardless of whether I'm running the real world code against the actual objects or the following simple example.
Update 2: I was able to completely eliminate the view from the equation. I have updated the simple example to show the problem exists even when querying directly against the table.
Simple Example
CREATE TABLE tbl1
(
  col1 VARCHAR2(10),
  col2 NUMBER(1)
);

INSERT INTO tbl1 (col1, col2) VALUES ('TEST1', 0);

/* View is no longer required to demonstrate the problem
CREATE OR REPLACE VIEW vw1 (col1, col2)
AS
SELECT col1, col2
  FROM tbl1
 WHERE col2 = 0;
*/

CREATE OR REPLACE PACKAGE pkg1
AS
  TYPE refWEB_CURSOR IS REF CURSOR;
  PROCEDURE proc1 (crs OUT refWEB_CURSOR);
END pkg1;
/

CREATE OR REPLACE PACKAGE BODY pkg1
IS
  PROCEDURE proc1 (crs OUT refWEB_CURSOR)
  IS
  BEGIN
    /* Open the cursor on the unprocessed row first */
    OPEN crs FOR
      SELECT col1
        FROM tbl1
       WHERE col1 = 'TEST1'
         AND col2 = 0;

    /* Then flag the row as processed and commit */
    UPDATE tbl1
       SET col2 = 1
     WHERE col1 = 'TEST1';

    COMMIT;
  END proc1;
END pkg1;
/
Anonymous Block Demo
DECLARE
  crs1 pkg1.refWEB_CURSOR;
  /* Anchored to tbl1 because the view vw1 is commented out above */
  TYPE rectype1 IS RECORD (
    col1 tbl1.col1%TYPE
  );
  rec1 rectype1;
BEGIN
  pkg1.proc1 ( crs1 );
  DBMS_OUTPUT.PUT_LINE('begin first test');
  LOOP
    FETCH crs1
     INTO rec1;
    EXIT WHEN crs1%NOTFOUND;
    DBMS_OUTPUT.PUT_LINE(rec1.col1);
  END LOOP;
  DBMS_OUTPUT.PUT_LINE('end first test');
END;
/
/* After creating this index, the problem is seen */
CREATE UNIQUE INDEX unique_col1 ON tbl1 (col1);

/* Reset data to initial values */
TRUNCATE TABLE tbl1;
INSERT INTO tbl1 (col1, col2) VALUES ('TEST1', 0);

DECLARE
  crs1 pkg1.refWEB_CURSOR;
  /* Anchored to tbl1 because the view vw1 is commented out above */
  TYPE rectype1 IS RECORD (
    col1 tbl1.col1%TYPE
  );
  rec1 rectype1;
BEGIN
  pkg1.proc1 ( crs1 );
  DBMS_OUTPUT.PUT_LINE('begin second test');
  LOOP
    FETCH crs1
     INTO rec1;
    EXIT WHEN crs1%NOTFOUND;
    DBMS_OUTPUT.PUT_LINE(rec1.col1);
  END LOOP;
  DBMS_OUTPUT.PUT_LINE('end second test');
END;
/
Example of what the output on 10g would be:
begin first test
TEST1
end first test
begin second test
TEST1
end second test
Example of what the output on 11g would be:
begin first test
TEST1
end first test
begin second test
end second test
Clarification
I can't remove the COMMIT because in the real world scenario the procedure is called from a web application. When the data provider on the front end calls the procedure it will issue an implicit COMMIT when disconnecting from the database anyways. So if I remove the COMMIT in the procedure then yes, the anonymous block demo would work but the real world scenario would not because the COMMIT would still happen.
Question
Why is 11g behaving differently? Is there anything I can do other than re-write the code?
Comments (3)
This appears to be a bug discovered fairly recently. Metalink Bug 10425196 describes the exact problem. Hopefully a patch will be released soon. For those of you who can't get past the Metalink wall, here are a few details:
Metalink
Bug 10425196: PL/SQL RETURNING REF CURSOR ACTS DIFFERENTLY ON 11.1.0.6 VS 10.2.0.5
Type: Defect
Severity: 2 - Severe Loss of Service
Status: Code Bug
Created: 22-Dec-2010
DIAGNOSTIC ANALYSIS from original case submission:
- 10.2.0.4 Windows Expected Behavior
- 10.2.0.5 Solaris Expected Behavior
- 11.1.0.6 Solaris Un-Expected Behavior
- 11.1.0.7 Windows Un-Expected Behavior
- 11.2.0.1 Solaris Un-Expected Behavior
- 11.2.0.2 Solaris Un-Expected Behavior
FURTHER DETAILS I can confirm:
- 10.2.0.3 Windows Expected Behavior
- 11.2.0.1 Windows Un-Expected Behavior
Additional Details
Changing the OPTIMIZER_FEATURES_ENABLE='10.2.0.4' parameter does not resolve the problem. So it seems to be related more to a design change in the 11g database engine rather than an optimizer tweak.
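For anyone who wants to repeat that check, here is a minimal sketch of one way to try the parameter. It assumes a session-level change is enough for the test (the scope used originally is not stated) and that the account can read v$parameter:

```sql
-- Use the 10.2.0.4 optimizer feature set for the current session only.
ALTER SESSION SET optimizer_features_enable = '10.2.0.4';

-- Confirm the value now in effect for this session.
SELECT value
  FROM v$parameter
 WHERE name = 'optimizer_features_enable';
```

After this, re-run the second anonymous block from the question in the same session to see whether the cursor still comes back empty.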
Code Workaround
This appears to be a result of the use of the index when querying the table and not the act of updating the table and/or committing. Using my example above, here are two ways to ensure the query does not use the index. Both may affect the performance of the query.
Affecting the performance of the query might be temporarily acceptable until a patch is released but I believe that using FLASHBACK as @Edgar Chupit suggested could affect the performance of the entire instance (or may not be available on some instances) so that option may not be acceptable for some. Either way, at this point in time code changes appear to be the only known workaround.
Method 1: Change your code to wrap the column in a function to prevent the unique index on this one column from being used. In my case this is acceptable because although the column is unique it will never contain lower case characters.
Method 2: Change your query to use a hint that prevents the index from being used. You might expect the NO_INDEX(unique_col1) hint to work, but it does not. The RULE hint does not work either. You can use the FULL(tbl1) hint, but it is likely to slow down your query more than method 1.
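Applied to the simple example from the question, the two methods could look roughly like the sketch below. The package name pkg1_workaround and the procedure names proc1_fn and proc1_full are hypothetical, and UPPER is only an assumed choice of wrapping function (it fits the remark that the column never contains lower case characters):

```sql
-- Hypothetical package showing both workarounds against tbl1.
CREATE OR REPLACE PACKAGE pkg1_workaround
AS
  TYPE refWEB_CURSOR IS REF CURSOR;
  PROCEDURE proc1_fn   (crs OUT refWEB_CURSOR);
  PROCEDURE proc1_full (crs OUT refWEB_CURSOR);
END pkg1_workaround;
/

CREATE OR REPLACE PACKAGE BODY pkg1_workaround
IS
  -- Method 1: wrap the indexed column in a function so the plain
  -- unique index on col1 cannot be used for the lookup.
  PROCEDURE proc1_fn (crs OUT refWEB_CURSOR)
  IS
  BEGIN
    OPEN crs FOR
      SELECT col1
        FROM tbl1
       WHERE UPPER(col1) = 'TEST1'
         AND col2 = 0;

    UPDATE tbl1
       SET col2 = 1
     WHERE col1 = 'TEST1';

    COMMIT;
  END proc1_fn;

  -- Method 2: ask the optimizer for a full table scan with a hint.
  PROCEDURE proc1_full (crs OUT refWEB_CURSOR)
  IS
  BEGIN
    OPEN crs FOR
      SELECT /*+ FULL(tbl1) */ col1
        FROM tbl1
       WHERE col1 = 'TEST1'
         AND col2 = 0;

    UPDATE tbl1
       SET col2 = 1
     WHERE col1 = 'TEST1';

    COMMIT;
  END proc1_full;
END pkg1_workaround;
/
```

Either procedure can be called from the anonymous block in the question in place of pkg1.proc1. Only the SELECT behind the cursor changes; the UPDATE and COMMIT stay as they were.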
Oracle's Response and Proposed Workaround
Oracle support has finally responded with the following Metalink update:
After some further correspondence it sounds as though this isn't being treated as a bug so much as a design decision moving forward:
In our case, given our client environments and since it is isolated to a single stored procedure we will continue to use our code workaround to prevent any unknown instance-wide side effects from affecting other applications and users.
This is indeed a strange issue, thanks for sharing!
It really looks like a behavior change starting with Oracle 11.1, and there is even a confirmed bug for a similar issue on Metalink (bug#10425196). Unfortunately there is not much information available on Metalink on the subject at the moment, but I've also opened an SR with Oracle asking for more information.
While I cannot yet explain why this happens, or whether there is a (hidden) parameter that reverts the behavior to 10g style, I think I can offer a workaround. You can use Oracle flashback query functionality to force Oracle to retrieve the data as of the expected point in time.
If you change your code as follows:
then the result should be the same as in 10g.
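A minimal sketch of what a flashback-based version of the package body from the question might look like. It assumes the package owner has EXECUTE on DBMS_FLASHBACK and that enough undo is retained; the variable name v_scn is illustrative:

```sql
CREATE OR REPLACE PACKAGE BODY pkg1
IS
  PROCEDURE proc1 (crs OUT refWEB_CURSOR)
  IS
    v_scn NUMBER;  -- illustrative name: SCN captured before the update
  BEGIN
    -- Remember the current system change number before changing anything.
    v_scn := DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER;

    -- Read the table as it was at that SCN, regardless of the
    -- update and commit that follow.
    OPEN crs FOR
      SELECT col1
        FROM tbl1 AS OF SCN v_scn
       WHERE col1 = 'TEST1'
         AND col2 = 0;

    UPDATE tbl1
       SET col2 = 1
     WHERE col1 = 'TEST1';

    COMMIT;
  END proc1;
END pkg1;
/
```

A flashback query returns committed data as of the given SCN, so when trying this with the demo from the question, the test row most likely needs to be committed before the procedure is called.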
And this is a simplified version of the original test case:
If you comment out the commit on line 16, then the output will be:
From Metalink (aka Oracle Support)
Status bug 10425196 : 92 - Closed, Not a Bug
PROBLEM:
When calling a stored procedure that returns a REF CURSOR, different behavior
is seen in 10.2.0.5 and earlier vs 11.1.0.6 and later databases.
Sequence Of Events
10.2.0.5 and Earlier
The returned cursor does not see the updated data as it was opened prior to
the data being updated. This is the expected behavior.
11.1.0.6 and Later
The returned cursor sees the updated data and returns the updated data which
is different than the 10.2.0.5 and earlier behavior.
DIAGNOSTIC ANALYSIS:
10.2.0.4 Windows Expected Behavior
10.2.0.5 Solaris Expected Behavior
11.1.0.6 Solaris Un-Expected Behavior
11.1.0.7 Windows Un-Expected Behavior
11.2.0.1 Solaris Un-Expected Behavior
11.2.0.2 Solaris Un-Expected Behavior
RELATED BUGS:
None found.
If it is necessary, you can revert back to the pre-10.2.0.5 behavior setting the following startup parameter and restart the database.
_row_cr = false
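As a sketch of how that parameter could be applied, assuming the instance uses an spfile and the session is connected with SYSDBA privileges. Hidden (underscore) parameters affect the whole instance and should only be changed on the advice of Oracle Support:

```sql
-- Hidden parameter names must be enclosed in double quotes.
-- SCOPE = SPFILE records the change so it takes effect at the next startup.
ALTER SYSTEM SET "_row_cr" = FALSE SCOPE = SPFILE;

-- Then restart the instance, for example from SQL*Plus:
--   SHUTDOWN IMMEDIATE
--   STARTUP
```

On an instance started from a text parameter file instead, the line `_row_cr = false` would go into that file before the restart.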