Bulk insert from Java into Oracle

Posted 2024-08-30 07:46:16

I need to insert many small rows (5 fields each) rapidly into Oracle.

With MySQL, I break the inserts into groups of 100, then use one insert statement for every group of 100 inserts.

But with Oracle, user feedback is that the mass inserts (anywhere from 1,000 to 30,000 rows) are too slow.

Is there a similar trick I can use to speed up the programmatic inserts from Java into Oracle?

Comments (4)

疑心病 2024-09-06 07:46:17

You can use Spring's DAO module to batch insert many rows.

An example that inserts a collection of Order objects into the database in one update:

public class OrderRepositoryImpl extends SimpleJdbcDaoSupport implements
        OrderRepository {

    private final String saveSql = "INSERT INTO orders(userid, username, coffee, coffeename, amount) "
            + "VALUES(?, ?, ?, ?, ?)";

    public void saveOrders(final Collection<Order> orders) {
        List<Object[]> ordersArgumentList = new ArrayList<Object[]>(orders
                .size());

        Object[] orderArguments;
        for (Order order : orders) {
            orderArguments = new Object[] { order.getUserId(),
                    order.getUserName(), order.getCoffe(),
                    order.getCoffeeName(), order.getAmount() };

            ordersArgumentList.add(orderArguments);
        }

        getSimpleJdbcTemplate().batchUpdate(saveSql, ordersArgumentList);
    }
}
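
For context, getSimpleJdbcTemplate().batchUpdate() is essentially a wrapper over plain JDBC statement batching (PreparedStatement.addBatch() / executeBatch()), which the Oracle driver sends to the server in bulk. Below is a minimal plain-JDBC sketch of the same idea, reusing the Order getters from the example above; the batch size of 1000, the setObject bindings, and the commit handling are illustrative choices, not part of the original answer.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collection;

public class PlainJdbcOrderBatch {

    private static final String SAVE_SQL =
            "INSERT INTO orders(userid, username, coffee, coffeename, amount) "
                    + "VALUES(?, ?, ?, ?, ?)";

    // Sends the whole collection as JDBC batches and commits once at the end.
    public void saveOrders(Connection conn, Collection<Order> orders) throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);                     // one transaction for the whole load
        try (PreparedStatement ps = conn.prepareStatement(SAVE_SQL)) {
            int queued = 0;
            for (Order order : orders) {
                ps.setObject(1, order.getUserId());    // setObject keeps the sketch agnostic about column types
                ps.setObject(2, order.getUserName());
                ps.setObject(3, order.getCoffe());
                ps.setObject(4, order.getCoffeeName());
                ps.setObject(5, order.getAmount());
                ps.addBatch();                         // queue the row in the driver
                if (++queued % 1000 == 0) {
                    ps.executeBatch();                 // one round trip per 1000 rows
                }
            }
            ps.executeBatch();                         // flush any remainder
            conn.commit();
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
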
下壹個目標 2024-09-06 07:46:17

You don't say how you are passing these records to the database. The best approach is to use an array, as this allows the use of Oracle's nifty FORALL bulk operations.

This example package has two routines: a function that populates a collection of T23 records (T23 being a table with five numeric columns) and a procedure that bulk inserts those records into the table using an array.

SQL> create or replace package p23 as
  2      type t23_nt is table of t23%rowtype;
  3      function pop_array ( p_no in number )
  4          return t23_nt;
  5      procedure ins_table ( p_array in t23_nt );
  6  end p23;
  7  /

Package created.

SQL> create or replace package body p23 as
  2
  3      function pop_array ( p_no in number )
  4          return t23_nt
  5      is
  6          return_value t23_nt;
  7      begin
  8          select level,level,level,level,level
  9          bulk collect into return_value
 10          from dual
 11          connect by level <= p_no;
 12          return return_value;
 13      end pop_array;
 14
 15      procedure ins_table
 16              ( p_array in t23_nt )
 17      is
 18          s_time pls_integer;
 19      begin
 20
 21          s_time := dbms_utility.get_time;
 22
 23          forall r in p_array.first()..p_array.last()
 24              insert into t23
 25              values p_array(r);
 26
 27          dbms_output.put_line('loaded '
 28                  ||to_char(p_array.count())||' recs in '
 29                  ||to_char(dbms_utility.get_time - s_time)
 30                  ||' csecs');
 31      end ins_table;
 32  end p23;
 33  /

Package body created.

SQL>

Here is the output from some sample runs:

SQL> declare
  2      l_array p23.t23_nt;
  3  begin
  4      l_array := p23.pop_array(500);
  5      p23.ins_table(l_array);
  6      l_array := p23.pop_array(1000);
  7      p23.ins_table(l_array);
  8      l_array := p23.pop_array(2500);
  9      p23.ins_table(l_array);
 10      l_array := p23.pop_array(5000);
 11      p23.ins_table(l_array);
 12      l_array := p23.pop_array(10000);
 13      p23.ins_table(l_array);
 14      l_array := p23.pop_array(100000);
 15      p23.ins_table(l_array);
 16  end;
 17  /
loaded 500 recs in 0 csecs
loaded 1000 recs in 0 csecs
loaded 2500 recs in 0 csecs
loaded 5000 recs in 1 csecs
loaded 10000 recs in 1 csecs
loaded 100000 recs in 15 csecs

PL/SQL procedure successfully completed.

SQL>
SQL> select count(*) from t23
  2  /

  COUNT(*)
----------
    119000

SQL>

I think inserting 100,000 records in 0.15 seconds should please all but the most demanding of users. So, the question is, how do you approach your inserts?
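
If the data originates in Java, note that JDBC cannot bind the package-declared t23_nt (a nested table of %rowtype) directly, so handing an array to a package like this needs SQL-level types. The sketch below is hedged: it assumes a hypothetical CREATE TYPE num_tab AS TABLE OF NUMBER and a hypothetical procedure ins_t23 that accepts five such collections (one per column) and performs the FORALL insert; neither exists in the answer above.

import java.math.BigDecimal;
import java.sql.Array;
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;

import oracle.jdbc.OracleConnection;

public class ForallCaller {

    // columns[0..4] hold the values for the five numeric columns of t23.
    public void insert(Connection conn, BigDecimal[][] columns) throws SQLException {
        // Unwrap to the Oracle-specific connection so we can create collection binds
        OracleConnection oconn = conn.unwrap(OracleConnection.class);
        try (CallableStatement cs = conn.prepareCall("{call ins_t23(?, ?, ?, ?, ?)}")) {
            for (int i = 0; i < 5; i++) {
                // NUM_TAB is the hypothetical SQL-level "TABLE OF NUMBER" type
                Array arr = oconn.createOracleArray("NUM_TAB", columns[i]);
                cs.setArray(i + 1, arr);   // one collection per column
            }
            cs.execute();                  // single round trip; FORALL runs on the server
        }
    }
}
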

朮生 2024-09-06 07:46:17

Nowadays MySQL is owned by Oracle, so perhaps a simpler solution might be to stay on MySQL...

If not, then you should make sure that a transaction is started before you begin a group of inserts; once the group is finished, commit the transaction and start a new one for the next group of inserts.
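
A minimal JDBC sketch of that per-group commit pattern, combined with statement batching; the table and column names are reused from the Spring example in the first answer purely for illustration, and the group size is just a starting point to tune.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class GroupedInserter {

    // Commits after every 'groupSize' rows instead of once per row or only
    // at the very end of the whole load.
    public void insertInGroups(Connection conn, List<Object[]> rows, int groupSize)
            throws SQLException {
        String sql = "INSERT INTO orders(userid, username, coffee, coffeename, amount) "
                + "VALUES(?, ?, ?, ?, ?)";
        conn.setAutoCommit(false);                   // take control of transaction boundaries
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int inGroup = 0;
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    ps.setObject(i + 1, row[i]);     // bind the five fields of this row
                }
                ps.addBatch();
                if (++inGroup == groupSize) {
                    ps.executeBatch();               // send the group in one round trip
                    conn.commit();                   // end this group's transaction
                    inGroup = 0;                     // the next insert starts a new one
                }
            }
            if (inGroup > 0) {
                ps.executeBatch();                   // flush and commit the final partial group
                conn.commit();
            }
        }
    }
}
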

Also check for unnecessary index definitions that may be slowing the insert time.

Updated...
Bulk inserting is typically the last step of ETL (Extract, Transform, Load), so have you considered using a Java-based ETL tool such as Pentaho Kettle or Talend Studio?

Pentaho describes its Oracle bulk loading facilities here.

A quick Google search also shows some preliminary evidence that Talend has some support for Oracle bulk loading.

掩于岁月 2024-09-06 07:46:17

Try this:

public Boolean inserTable(String fileName) {
    logger.info("Begin - " + this.getClass().getSimpleName() + "." + "inserTable");
    logger.info("File : " + fileName);
    try (Connection conn1 = jdbcTemplate.getDataSource().getConnection()) {
        // Unwrap the vendor connection so the ScriptRunner can execute the generated script
        OracleConnection conn = (OracleConnection) conn1.getMetaData().getConnection();
        ScriptRunner sr = new ScriptRunner(conn);

        String query = "Insert into TABLE_DATA (ID, DATA1, DATECREATED, CREATEDBY) Values ";
        StringBuilder sb = new StringBuilder();

        // Build one INSERT statement per input line, followed by a final COMMIT
        try (BufferedReader br = new BufferedReader(new FileReader(fileName), 1024 * 1024)) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(query)
                  .append("(TABLE_DATA_SQ.NEXTVAL,")
                  .append(line.substring(0, 6))
                  .append(",sysdate,'BDVENLINEA');")
                  .append("\n");
            }
        }
        sb.append("commit;");

        // Run the whole script in one go
        try (Reader reader = new StringReader(sb.toString())) {
            sr.runScript(reader);
        }
        return true;
    } catch (FileNotFoundException e) {
        logger.error(e.getMessage(), e);
        throw new TechnicalException(e, e.getMessage());
    } catch (SQLException e) {
        logger.error(e.getMessage(), e);
        throw new TechnicalException(e, e.getMessage());
    } catch (IOException e) {
        logger.error(e.getMessage(), e);
        throw new TechnicalException(e, e.getMessage());
    } finally {
        logger.info("End   - " + this.getClass().getSimpleName() + "." + "inserTable");
    }
}