JDBC 批量插入 OutOfMemoryError

发布于 2024-08-20 22:35:57 字数 1383 浏览 8 评论 0原文

我编写了一个方法 insert(),其中我尝试使用 JDBC Batch 将 50 万条记录插入 MySQL 数据库:

public void insert(int nameListId, String[] names) {
    String sql = "INSERT INTO name_list_subscribers (name_list_id, name, date_added)" + 
        " VALUES (?, ?, NOW())";
    Connection conn = null;
    PreparedStatement ps = null;

    try {
        conn = getConnection();
        ps = conn.prepareStatement(sql);

        for (String s : names ) {
            ps.setInt(1, nameListId); 
            ps.setString(2, s);
            ps.addBatch();
        }

        ps.executeBatch();

    } catch (SQLException e) {
        throw new RuntimeException(e);
    } finally {
        closeDbResources(ps, null, conn);
    }
}

但是每当我尝试运行此方法时,都会收到以下错误:

java.lang.OutOfMemoryError: Java heap space
    com.mysql.jdbc.ServerPreparedStatement$BatchedBindValues.<init>(ServerPreparedStatement.java:72)
    com.mysql.jdbc.ServerPreparedStatement.addBatch(ServerPreparedStatement.java:330)
    org.apache.commons.dbcp.DelegatingPreparedStatement.addBatch(DelegatingPreparedStatement.java:171)

如果我将 ps.addBatch() 替换为 ps.executeUpdate() 并删除 ps.executeBatch(),它可以正常工作,尽管需要某个时间。如果您知道在这种情况下使用 Batch 是否合适,请告诉我,如果是,那么为什么会给出 OurOfMemoryError

谢谢

I have written a method insert() in which I am trying to use JDBC Batch for inserting half a million records into a MySQL database:

public void insert(int nameListId, String[] names) {
    String sql = "INSERT INTO name_list_subscribers (name_list_id, name, date_added)" + 
        " VALUES (?, ?, NOW())";
    Connection conn = null;
    PreparedStatement ps = null;

    try {
        conn = getConnection();
        ps = conn.prepareStatement(sql);

        for (String s : names ) {
            ps.setInt(1, nameListId); 
            ps.setString(2, s);
            ps.addBatch();
        }

        ps.executeBatch();

    } catch (SQLException e) {
        throw new RuntimeException(e);
    } finally {
        closeDbResources(ps, null, conn);
    }
}

But whenever I try to run this method, I get the following error:

java.lang.OutOfMemoryError: Java heap space
    com.mysql.jdbc.ServerPreparedStatement$BatchedBindValues.<init>(ServerPreparedStatement.java:72)
    com.mysql.jdbc.ServerPreparedStatement.addBatch(ServerPreparedStatement.java:330)
    org.apache.commons.dbcp.DelegatingPreparedStatement.addBatch(DelegatingPreparedStatement.java:171)

If I replace ps.addBatch() with ps.executeUpdate() and remove ps.executeBatch(), it works fine, though it takes some time. Please let me know if you know if using Batch is appropriate in this situation, and if it is, then why does it give OurOfMemoryError?

Thanks

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

你与清晨阳光 2024-08-27 22:35:57

addBatchexecuteBatch 为您提供了执行批量插入的机制,但您仍然需要自己执行批处理算法。

如果您只是将每个语句堆积到同一个批次中,就像您所做的那样,那么您将耗尽内存。您需要每 n 条记录执行/清除批次。 n 的值由您决定,JDBC 无法为您做出决定。批处理大小越大,处理速度就越快,但太大会导致内存不足,处理速度会变慢或失败。这取决于你有多少内存。

例如,从批量大小 1000 开始,然后尝试不同的值。

final int batchSize = 1000;
int count = 0;
for(String s : names ) {
   ps.setInt(1, nameListId); 
   ps.setString(2, s);
   ps.addBatch();

   if (++count % batchSize == 0) {
      ps.executeBatch();
      ps.clearBatch(); //not sure if this is necessary
   }
}
ps.executeBatch();   // flush the last few records.

addBatch and executeBatch give you the mechanism to perform batch inserts, but you still need to do the batching algorithm yourself.

If you simply pile every statement into the same batch, as you are doing, then you'll run out of memory. You need to execute/clear the batch every n records. The value of n is up to you, JDBC can't make that decision for you. The larger the batch size, the faster things will go, but too large and you'll get memory starvation and things will slow down or fail. It depends how much memory you have.

Start off with a batch size of 1000, for example, and experiment with different values from there.

final int batchSize = 1000;
int count = 0;
for(String s : names ) {
   ps.setInt(1, nameListId); 
   ps.setString(2, s);
   ps.addBatch();

   if (++count % batchSize == 0) {
      ps.executeBatch();
      ps.clearBatch(); //not sure if this is necessary
   }
}
ps.executeBatch();   // flush the last few records.
没有你我更好 2024-08-27 22:35:57

它内存不足,因为它将所有事务保存在内存中,并且仅在您调用 executeBatch 时才将其发送到数据库。

如果您不需要它是原子的并且希望获得更好的性能,您可以保留一个计数器并每 n 个数字调用 executeBatch的记录。

It is out of memory because it hold all the transaction in memory and only send it over to the database when you call executeBatch.

If you don't need it to be atomic and would like the get better performance, you can keep a counter and call executeBatch every n number of records.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文