ODP.Net stored procedure performance issues on large data sets
We are having some major performance issues with SELECT queries out of one of our databases. Please see the simple procedure and associated code below.
In the code, the ExecuteReader() method executes in around 10 seconds on a query returning 30K records. Iterating the reader takes 2 minutes, even when I am not pumping the data into any other object. Two minutes for a 30K-row data set is unacceptable for us, as we are expecting data sets in the millions of rows.
Is there anything here that stands out to any of you? Hoping that your experience with ODP.NET and PL/SQL might help out.
create or replace PROCEDURE TRACKING_FETCH (
    p_tracking_id   IN NUMBER,
    p_parent_id     IN NUMBER,
    p_media_id      IN NUMBER,
    p_custodian_id  IN NUMBER,
    p_return_cursor OUT SYS_REFCURSOR)
AS
BEGIN
    OPEN p_return_cursor FOR
        SELECT *
        FROM tracking
        WHERE (tracking_id = p_tracking_id OR p_tracking_id = 0)
          AND (parent_id = p_parent_id OR p_parent_id = 0)
          AND (media_id = p_media_id OR p_media_id = 0)
          AND (custodian_id = p_custodian_id OR p_custodian_id = 0);
END TRACKING_FETCH;
--
using (DataFactory command
    = new DataFactory(dbConnection,
                      DatabaseType.Oracle,
                      CommandType.StoredProcedure,
                      "TRACKING_FETCH"))
{
    command.AddInParameter("p_tracking_id", DbType.Int32, trackingid);
    command.AddInParameter("p_parent_id", DbType.Int32, parentid);
    command.AddInParameter("p_media_id", DbType.Int32, mediaid);
    command.AddInParameter("p_custodian_id", DbType.Int32, custodianid);

    using (var dr = command.ExecuteReader())
    {
        while (dr.Read())
        {
            // Do things...
        }
    }
}
Any guidance will be greatly appreciated.
Comments (3)
Worth studying up on the Oracle Wait Interface.
I'd suspect network latency is killing you. The procedure is returning a pointer to the result set. At some point in your loop I would guess you are fetching the rows (even if they are being dumped).
Checking v$sql would tell you how many fetches are being done and how many rows are processed. Divide one by the other and you'll see how many rows per fetch. If you are doing 1 row/fetch or even 10-20, that's thousands of network waits. You ideally want thousands of rows per fetch if you are going to be pulling back millions of records, though that may cost you in memory.
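For reference, a query along these lines against v$sql shows that ratio (the sql_text filter is only illustrative; locating the statement by its sql_id is more precise):

SELECT sql_id,
       executions,
       fetches,
       rows_processed,
       ROUND(rows_processed / NULLIF(fetches, 0), 1) AS rows_per_fetch
FROM   v$sql
WHERE  sql_text LIKE 'SELECT%FROM%tracking%'   -- illustrative filter
ORDER  BY last_active_time DESC;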
Depending on what you are doing with those millions of rows, it may be worth rethinking the architecture. For example, if they are being dumped to a file, then maybe generate the file on the DB server, zip it, move the file over the network, then unzip it.
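On the ODP.NET side, the rows returned per round trip are governed by the FetchSize property, which can be set on the command or the reader. A minimal sketch, assuming the Oracle.ManagedDataAccess.Client driver is used directly rather than the DataFactory wrapper from the question (the method and variable names here are placeholders):

using System.Data;
using Oracle.ManagedDataAccess.Client;

static class TrackingFetchSample
{
    public static void Fetch(string connectionString, int trackingId, int parentId,
                             int mediaId, int custodianId)
    {
        using (var conn = new OracleConnection(connectionString))
        using (var cmd = new OracleCommand("TRACKING_FETCH", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.BindByName = true;
            cmd.Parameters.Add("p_tracking_id", OracleDbType.Int32).Value = trackingId;
            cmd.Parameters.Add("p_parent_id", OracleDbType.Int32).Value = parentId;
            cmd.Parameters.Add("p_media_id", OracleDbType.Int32).Value = mediaId;
            cmd.Parameters.Add("p_custodian_id", OracleDbType.Int32).Value = custodianId;
            cmd.Parameters.Add("p_return_cursor", OracleDbType.RefCursor, ParameterDirection.Output);

            conn.Open();
            using (var dr = cmd.ExecuteReader())
            {
                // RowSize is known once the statement has executed; size the fetch
                // buffer for roughly 10,000 rows per round trip instead of however
                // few rows fit in the default buffer.
                dr.FetchSize = dr.RowSize * 10000;
                while (dr.Read())
                {
                    // Do things...
                }
            }
        }
    }
}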
Have you tried running an EXPLAIN PLAN on the stored procedure? I don't see any immediate problems with either your code or your stored procedure, but a full table scan would seriously hurt the execution time of your query. An explain plan will tell you whether there are table scans, and then you can tune the query to speed it up.
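EXPLAIN PLAN works on a SQL statement rather than on the procedure itself, so you would explain the cursor's query with representative values, along these lines (the literal 123 is just an example value; 0 stands for the "match everything" case):

EXPLAIN PLAN FOR
SELECT *
FROM   tracking
WHERE  (tracking_id = 123 OR 123 = 0)
AND    (parent_id = 0 OR 0 = 0)
AND    (media_id = 0 OR 0 = 0)
AND    (custodian_id = 0 OR 0 = 0);

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);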
It is not a problem with your ODP.NET program. The cause is in the SELECT. If the table contains lots of records, the optimizer may decide to run a full table scan, depending on your parameters. Check with an explain plan how the statement runs. If you see nothing helpful there, try tracing the statement to look at the physical reads.
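One way to trace it, assuming you can identify the application's session in v$session (the program filter and the SID/SERIAL# values below are placeholders):

-- Find the ODP.NET session (filter is illustrative).
SELECT sid, serial#, program FROM v$session WHERE program LIKE '%w3wp%';

-- Enable extended SQL trace (wait events and bind values) for that session,
-- reproduce the slow fetch, then disable tracing and summarise the trace file with tkprof.
EXEC DBMS_MONITOR.SESSION_TRACE_ENABLE(session_id => 1234, serial_num => 56789, waits => TRUE, binds => TRUE);
EXEC DBMS_MONITOR.SESSION_TRACE_DISABLE(session_id => 1234, serial_num => 56789);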