建议如何优化解决方案(循环所有记录并检查错误)

发布于 12-07 16:04 字数 1862 浏览 2 评论 0原文

我使用了以下内容(检查循环中的错误,如果存在,我将它们插入表中):

FOR rec IN (SELECT MAX(t.s_id) as s_id,t.sdate,t.stype,t.snumber,t.code,
            SUM(t.amount) as amount, t... (other fields)
            FROM stable t WHERE t.sdate=p_date AND t.stype=p_type 
                          AND t.snumber=p_num 
            GROUP BY t.sdate,t.snumber,t.stype, t... (other fields)) LOOP  
    v_reason := null;

    BEGIN
      SELECT d.source_id INTO i_source_id FROM mapping m, source d 
      WHERE TO_NUMBER(m.stage)=rec.snumber AND 
            m.month=EXTRACT(MONTH FROM rec.sdate) AND 
            m.year=EXTRACT(YEAR FROM rec.sdate) AND m.desc=d.source_desc AND
            m.month=d.month AND m.year=d.year AND m.name='SOURCE';
    EXCEPTION
      WHEN OTHERS
        THEN 
           e_id := 1;
           v_reason := 'source_id';
    END;

    IF (v_reason IS NULL) THEN
        BEGIN     
          SELECT p.product_id INTO i_product_id FROM mapping m, product p
          WHERE m.stage=rec.code AND 
                m.month=EXTRACT(MONTH FROM rec.sdate) AND 
                m.year=EXTRACT(YEAR FROM rec.sdate) AND 
                m.desc=p.product_name AND m.month=p.month AND 
                m.year=p.year AND m.name='PRODUCT';               
        EXCEPTION
          WHEN OTHERS
            THEN 
               e_id := 2;
               v_reason := 'product_id';
        END;
      END IF;

    --- and 5 more checks from other tables ---
    ---....---

    IF (v_reason IS NULL) THEN
       INSERT INTO tbl_destination(sdate,source_id,product_id,amount, ... and others) 
       VALUES(rec.sdate,i_source_id,i_product_id,NVL(abs(rec.amount),0), ...);  
    ELSE
       INSERT INTO tbl_errors(rec_id,e_id,desc) VALUES(rec.s_id,e_id,v_reason);
    END IF; 
    COMMIT;                         
END LOOP;    

对于大量记录(大约 20000 条)来说太慢了。请帮我。

I used the following (check for errors in loop and if they are exists I insert they into the table):

FOR rec IN (SELECT MAX(t.s_id) as s_id,t.sdate,t.stype,t.snumber,t.code,
            SUM(t.amount) as amount, t... (other fields)
            FROM stable t WHERE t.sdate=p_date AND t.stype=p_type 
                          AND t.snumber=p_num 
            GROUP BY t.sdate,t.snumber,t.stype, t... (other fields)) LOOP  
    v_reason := null;

    BEGIN
      SELECT d.source_id INTO i_source_id FROM mapping m, source d 
      WHERE TO_NUMBER(m.stage)=rec.snumber AND 
            m.month=EXTRACT(MONTH FROM rec.sdate) AND 
            m.year=EXTRACT(YEAR FROM rec.sdate) AND m.desc=d.source_desc AND
            m.month=d.month AND m.year=d.year AND m.name='SOURCE';
    EXCEPTION
      WHEN OTHERS
        THEN 
           e_id := 1;
           v_reason := 'source_id';
    END;

    IF (v_reason IS NULL) THEN
        BEGIN     
          SELECT p.product_id INTO i_product_id FROM mapping m, product p
          WHERE m.stage=rec.code AND 
                m.month=EXTRACT(MONTH FROM rec.sdate) AND 
                m.year=EXTRACT(YEAR FROM rec.sdate) AND 
                m.desc=p.product_name AND m.month=p.month AND 
                m.year=p.year AND m.name='PRODUCT';               
        EXCEPTION
          WHEN OTHERS
            THEN 
               e_id := 2;
               v_reason := 'product_id';
        END;
      END IF;

    --- and 5 more checks from other tables ---
    ---....---

    IF (v_reason IS NULL) THEN
       INSERT INTO tbl_destination(sdate,source_id,product_id,amount, ... and others) 
       VALUES(rec.sdate,i_source_id,i_product_id,NVL(abs(rec.amount),0), ...);  
    ELSE
       INSERT INTO tbl_errors(rec_id,e_id,desc) VALUES(rec.s_id,e_id,v_reason);
    END IF; 
    COMMIT;                         
END LOOP;    

It is too slow for large number of records (about 20000). Please, help me.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

长发绾君心2024-12-14 16:04:05

在 SQL 和 PLSQL 之间来回跳转会带来巨大的开销。在您的情况下,您执行一个查询,然后对主查询中找到的每条记录执行新查询。由于 SQL 和 PLSQL 之间的所有上下文切换以及单独的查询更难优化,这会减慢速度。写一个大查询。优化器可以发挥其所有魔力,而您只需要一次上下文切换。

执行下一个查询:它返回的每一行都是一个错误。您只需读取 sourceCount 和 ProductCount 即可了解哪一个有问题(或两者都有)。

插入错误:

insert into tbl_errors (rec_id, e_id, desc) 
select
  s_id, 
  case 
    when sourceCount <> 1 then 1
    when productCount <> 1 then 2
    when ...
  end as e_id,
  case 
    when sourceCount <> 1 then 'source_id'
    when productCount <> 1 then 'product_id'
    when ...
  end as reason
from
(
    SELECT 
      MAX(t.s_id) as s_id,
      t.sdate,t.stype,t.snumber,t.code,
      SUM(t.amount) as amount, 

      (SELECT count(*) 
      FROM mapping m, source d 
      WHERE 
        TO_NUMBER(m.stage)=rec.snumber AND 
        m.month=EXTRACT(MONTH FROM rec.sdate) AND 
        m.year=EXTRACT(YEAR FROM rec.sdate) AND m.desc=d.source_desc AND
        m.month=d.month AND m.year=d.year AND m.name='SOURCE') as sourceCount,

      (SELECT count(*)
      FROM mapping m, product p
      WHERE 
        m.stage=rec.code AND 
        m.month=EXTRACT(MONTH FROM rec.sdate) AND 
        m.year=EXTRACT(YEAR FROM rec.sdate) AND 
        m.desc=p.product_name AND m.month=p.month AND 
        m.year=p.year AND m.name='PRODUCT') as productCount,

      /* other checks */        

    FROM 
      stable t 
    WHERE 
      t.sdate=p_date AND t.stype=p_type 
      AND t.snumber=p_num 
    GROUP BY 
      t.sdate, t.snumber, t.stype
) x
having 
  sourceCount <> 1 or productCount <> 1 or /* other checks */

插入正确的记录。使用相同的查询进行检查,但添加额外的子查询来获取正确的产品 ID 和源 ID。

insert into tbl_destination(sdate,source_id,product_id,amount, ...)
select
  sdate,
  source_id,
  product_id,
  amount,
  ...
from
(
    SELECT 
      MAX(t.s_id) as s_id,
      t.sdate,t.stype,t.snumber,t.code,
      SUM(t.amount) as amount, 

      (SELECT count(*) 
      FROM mapping m, source d 
      WHERE 
        TO_NUMBER(m.stage)=rec.snumber AND 
        m.month=EXTRACT(MONTH FROM rec.sdate) AND 
        m.year=EXTRACT(YEAR FROM rec.sdate) AND m.desc=d.source_desc AND
        m.month=d.month AND m.year=d.year AND m.name='SOURCE') as sourceCount,
      (SELECT min(source_id) 
      FROM mapping m, source d 
      WHERE 
        TO_NUMBER(m.stage)=rec.snumber AND 
        m.month=EXTRACT(MONTH FROM rec.sdate) AND 
        m.year=EXTRACT(YEAR FROM rec.sdate) AND m.desc=d.source_desc AND
        m.month=d.month AND m.year=d.year AND m.name='SOURCE') as source_id,

      (SELECT count(*)
      FROM mapping m, product p
      WHERE 
        m.stage=rec.code AND 
        m.month=EXTRACT(MONTH FROM rec.sdate) AND 
        m.year=EXTRACT(YEAR FROM rec.sdate) AND 
        m.desc=p.product_name AND m.month=p.month AND 
        m.year=p.year AND m.name='PRODUCT') as productCount,
      (SELECT min(product_id)
      FROM mapping m, product p
      WHERE 
        m.stage=rec.code AND 
        m.month=EXTRACT(MONTH FROM rec.sdate) AND 
        m.year=EXTRACT(YEAR FROM rec.sdate) AND 
        m.desc=p.product_name AND m.month=p.month AND 
        m.year=p.year AND m.name='PRODUCT') as product_id,

      /* other checks */        

    FROM 
      stable t 
    WHERE 
      t.sdate=p_date AND t.stype=p_type 
      AND t.snumber=p_num 
    GROUP BY 
      t.sdate, t.snumber, t.stype
) x
having 
  sourceCount = 1 and productCount = 1 and /* other checks */

Jumping back and forth between SQL and PLSQL gives a tremendous amount of overhead. In your case, you execute a query, and then execute new queries for each record found in the main query. This slows the lot down because of all those context switches between SQL and PLSQL and because of separate queries are harder to optimize. Write one big query. The optimizer can do all its magic, and you only got a single context switch.

Execute the next query: every row it returns is an error. You only need to read sourceCount and productCount to see which one is the problem (or both).

To insert the errors:

insert into tbl_errors (rec_id, e_id, desc) 
select
  s_id, 
  case 
    when sourceCount <> 1 then 1
    when productCount <> 1 then 2
    when ...
  end as e_id,
  case 
    when sourceCount <> 1 then 'source_id'
    when productCount <> 1 then 'product_id'
    when ...
  end as reason
from
(
    SELECT 
      MAX(t.s_id) as s_id,
      t.sdate,t.stype,t.snumber,t.code,
      SUM(t.amount) as amount, 

      (SELECT count(*) 
      FROM mapping m, source d 
      WHERE 
        TO_NUMBER(m.stage)=rec.snumber AND 
        m.month=EXTRACT(MONTH FROM rec.sdate) AND 
        m.year=EXTRACT(YEAR FROM rec.sdate) AND m.desc=d.source_desc AND
        m.month=d.month AND m.year=d.year AND m.name='SOURCE') as sourceCount,

      (SELECT count(*)
      FROM mapping m, product p
      WHERE 
        m.stage=rec.code AND 
        m.month=EXTRACT(MONTH FROM rec.sdate) AND 
        m.year=EXTRACT(YEAR FROM rec.sdate) AND 
        m.desc=p.product_name AND m.month=p.month AND 
        m.year=p.year AND m.name='PRODUCT') as productCount,

      /* other checks */        

    FROM 
      stable t 
    WHERE 
      t.sdate=p_date AND t.stype=p_type 
      AND t.snumber=p_num 
    GROUP BY 
      t.sdate, t.snumber, t.stype
) x
having 
  sourceCount <> 1 or productCount <> 1 or /* other checks */

To insert the records that are ok. Use the same query for checks, but add extra subqueries to get the right product id and source id.

insert into tbl_destination(sdate,source_id,product_id,amount, ...)
select
  sdate,
  source_id,
  product_id,
  amount,
  ...
from
(
    SELECT 
      MAX(t.s_id) as s_id,
      t.sdate,t.stype,t.snumber,t.code,
      SUM(t.amount) as amount, 

      (SELECT count(*) 
      FROM mapping m, source d 
      WHERE 
        TO_NUMBER(m.stage)=rec.snumber AND 
        m.month=EXTRACT(MONTH FROM rec.sdate) AND 
        m.year=EXTRACT(YEAR FROM rec.sdate) AND m.desc=d.source_desc AND
        m.month=d.month AND m.year=d.year AND m.name='SOURCE') as sourceCount,
      (SELECT min(source_id) 
      FROM mapping m, source d 
      WHERE 
        TO_NUMBER(m.stage)=rec.snumber AND 
        m.month=EXTRACT(MONTH FROM rec.sdate) AND 
        m.year=EXTRACT(YEAR FROM rec.sdate) AND m.desc=d.source_desc AND
        m.month=d.month AND m.year=d.year AND m.name='SOURCE') as source_id,

      (SELECT count(*)
      FROM mapping m, product p
      WHERE 
        m.stage=rec.code AND 
        m.month=EXTRACT(MONTH FROM rec.sdate) AND 
        m.year=EXTRACT(YEAR FROM rec.sdate) AND 
        m.desc=p.product_name AND m.month=p.month AND 
        m.year=p.year AND m.name='PRODUCT') as productCount,
      (SELECT min(product_id)
      FROM mapping m, product p
      WHERE 
        m.stage=rec.code AND 
        m.month=EXTRACT(MONTH FROM rec.sdate) AND 
        m.year=EXTRACT(YEAR FROM rec.sdate) AND 
        m.desc=p.product_name AND m.month=p.month AND 
        m.year=p.year AND m.name='PRODUCT') as product_id,

      /* other checks */        

    FROM 
      stable t 
    WHERE 
      t.sdate=p_date AND t.stype=p_type 
      AND t.snumber=p_num 
    GROUP BY 
      t.sdate, t.snumber, t.stype
) x
having 
  sourceCount = 1 and productCount = 1 and /* other checks */
悸初2024-12-14 16:04:05

通常,最高效的方法是将 plsql 转换为基于集合的操作,并摆脱循环,我将首先采用驱动查询并将其嵌入到每个查询中(在循环中)。然后将它们变成插入物。注意将 IF 语句中的任何逻辑合并到 WHERE 子句中。

例如:
当您插入没有找到记录的错误时,您可以将第一个 SELECT INTO....EXCEPTION 块更改为直接插入,在映射表中找不到任何行,

    INSERT INTO tbl_errors
    SELECT s_id, 1 as e_id , 'source_id' as reason 
    FROM
    (
        SELECT MAX(t.s_id) as s_id,t.sdate,t.stype,t.snumber,t.code,
                   SUM(t.amount) as amount, t... (other fields)
        FROM stable t 
        WHERE t.sdate=p_date AND t.stype=p_type AND t.snumber=p_num 
        GROUP BY t.sdate,t.snumber,t.stype, t... (other fields)
    ) drv
    LEFT JOIN mapping m ON TO_NUMBER(m.stage) = drv.s_id   --etc 
    LEFT JOIN source d ON m.desc=d.source_desc AND m.month=d.month --etc
    WHERE m.stage IS NULL

最终您将得到多次插入,现在应该可以进一步优化并将所有选择合并到单个语句中并将操作作为单个插入执行。

然后,要插入错误,只需插入驱动查询中没有错误的行,

即:

INSERT INTO tbl_destination
SELECT * from drv
WHERE NOT EXISTS(SELECT * from tbl_errors WHERE s_id=drv.s_id)

Usually the most performant way is to convert the plsql into set based operations, and get rid of the LOOP, I would start by taking the driving query and embed it into each of the queries (in the loop). Then turn these into inserts. taking care to incorporate any logic in the IF statements into the WHERE clause.

e.g:
as you are inserting an error where there are no records found you could change the first SELECT INTO....EXCEPTION block into a direct insert where it can't find any rows in the mapping tables

    INSERT INTO tbl_errors
    SELECT s_id, 1 as e_id , 'source_id' as reason 
    FROM
    (
        SELECT MAX(t.s_id) as s_id,t.sdate,t.stype,t.snumber,t.code,
                   SUM(t.amount) as amount, t... (other fields)
        FROM stable t 
        WHERE t.sdate=p_date AND t.stype=p_type AND t.snumber=p_num 
        GROUP BY t.sdate,t.snumber,t.stype, t... (other fields)
    ) drv
    LEFT JOIN mapping m ON TO_NUMBER(m.stage) = drv.s_id   --etc 
    LEFT JOIN source d ON m.desc=d.source_desc AND m.month=d.month --etc
    WHERE m.stage IS NULL

eventually you will end up with several inserts, it should now be possible to optimise furthur and merge all of the selects into a single statement and do the operation as a single insert.

Then to insert the errors just insert the rows from the driving query that have no errors

i.e:

INSERT INTO tbl_destination
SELECT * from drv
WHERE NOT EXISTS(SELECT * from tbl_errors WHERE s_id=drv.s_id)
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文