How can I quickly populate a relational database in MySQL using JDBC?
I am trying to implement a simple program in Java that will be used to populate a MySQL database from a CSV source file. For each row in the CSV file, I need to execute the following sequence of SQL statements (example in pseudocode):
execute("INSERT INTO table_1 VALUES(?, ?)");
String id1 = execute("SELECT LAST_INSERT_ID()");
execute("INSERT INTO table_2 VALUES(?, ?)");
String id2 = execute("SELECT LAST_INSERT_ID()");
execute("INSERT INTO table_3 VALUES('some value', id1, id2)");
execute("INSERT INTO table_3 VALUES('some value2', id1, id2)");
...
There are three basic problems:
1. The database is not on localhost, so every single INSERT/SELECT has latency; this is the main problem.
2. The CSV file contains millions of rows (around 15,000,000), so it takes too long.
3. I cannot modify the database structure (add extra tables, disable keys, etc.).
I was wondering how I can speed up the INSERT/SELECT process. Currently 80% of the execution time is consumed by communication.
I already tried to group the above statements and execute them as a batch, but because of LAST_INSERT_ID it does not work. In all other cases it takes too long (see point 1).
2 Answers
Feed the data into a blackhole
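The table definition this step refers to was lost in extraction; a minimal sketch, assuming two columns matching `table_1` (table and column names are hypothetical):

```sql
-- A BLACKHOLE table accepts writes but stores nothing;
-- it exists only so a trigger can fire on each inserted row.
CREATE TABLE bh_table1 (
    field1 VARCHAR(255),
    field2 VARCHAR(255)
) ENGINE = BLACKHOLE;
```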
Note that this is a blackhole table, so the data is going nowhere. However, you can create a trigger on the blackhole table, something like this.
And pass it on using a trigger
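The trigger body from the original answer was also lost; a sketch that reproduces the question's per-row sequence inside the trigger (table and column names are assumptions):

```sql
DELIMITER $$

-- Fires once per row inserted into the blackhole table and runs the
-- whole INSERT/LAST_INSERT_ID() chain server-side, with no client
-- round-trips per statement.
CREATE TRIGGER ai_bh_table1 AFTER INSERT ON bh_table1
FOR EACH ROW
BEGIN
    DECLARE id1 INT;
    DECLARE id2 INT;

    INSERT INTO table_1 VALUES (NEW.field1, NEW.field2);
    SET id1 = LAST_INSERT_ID();

    INSERT INTO table_2 VALUES (NEW.field1, NEW.field2);
    SET id2 = LAST_INSERT_ID();

    INSERT INTO table_3 VALUES ('some value', id1, id2);
    INSERT INTO table_3 VALUES ('some value2', id1, id2);
END $$

DELIMITER ;
```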
Now you can feed the blackhole table with a single insert statement at full speed and even insert multiple rows in one go.
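For example (values are placeholders):

```sql
-- Multi-row insert: one statement, one network round-trip,
-- and the trigger fires once per row.
INSERT INTO bh_table1 VALUES
    ('a1', 'a2'),
    ('b1', 'b2'),
    ('c1', 'c2');
```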
Disable index updates to speed things up
Running ALTER TABLE ... DISABLE KEYS will disable all non-unique key updates and speed up the insert. (An autoincrement key is unique, so it is not affected.)
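A sketch, assuming the target table is `table_1`:

```sql
-- Suspend non-unique index maintenance during the bulk load...
ALTER TABLE table_1 DISABLE KEYS;

-- ... bulk inserts go here ...

-- ... then rebuild the indexes in one pass afterwards.
ALTER TABLE table_1 ENABLE KEYS;
```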
If you have any unique keys and you don't want MySQL to check them during the mass insert, make sure you do an ALTER TABLE to drop the unique key and add it back afterwards. Note that the ALTER TABLE to put the unique key back in will take a long time.
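A sketch, assuming `table_1` has a unique key named `uq_name` on column `name` (both names are hypothetical):

```sql
-- Drop the unique constraint before the mass insert...
ALTER TABLE table_1 DROP INDEX uq_name;

-- ... bulk inserts go here ...

-- ... and recreate it afterwards; this rescans the whole table,
-- so it can take a long time on millions of rows.
ALTER TABLE table_1 ADD UNIQUE INDEX uq_name (name);
```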
The fastest way is to let MySQL parse the CSV and load the records into the table. For that, you can use "LOAD DATA INFILE":
http://dev.mysql.com/doc/refman/5.1/en/load-data.html
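A sketch of loading the CSV into a staging table (the file path, table name, column names, and CSV format options are all assumptions about your data):

```sql
-- Parse the CSV server-side in one statement; far faster than
-- millions of individual INSERTs over the network.
LOAD DATA INFILE '/tmp/source.csv'
INTO TABLE staging
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(field1, field2, field3);
```

If the file has to stay on the client machine, `LOAD DATA LOCAL INFILE` streams it to the server instead.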
It works even better if you can transfer the file to the server or keep it in a shared directory that is accessible to the server.
Once that is done, you can have a column that indicates whether the record has been processed or not. Its value should be false by default.
Once the data is loaded, you can pick up all records where processed=false.
For all such records you can populate tables 2 and 3.
Since all these operations would happen on the server, server <> client latency would not come into the picture.
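The processed-flag workflow described above might be sketched like this (table and column names are assumptions; per-row ID chaining would need a loop or trigger as in the other answer):

```sql
-- Pick up rows loaded by LOAD DATA INFILE that are not yet processed
-- and populate a dependent table, entirely server-side.
INSERT INTO table_2 (col_a, col_b)
SELECT field1, field2
FROM staging
WHERE processed = FALSE;

-- Mark those rows as done.
UPDATE staging
SET processed = TRUE
WHERE processed = FALSE;
```

Because both statements run on the server, no per-row client round-trip is involved.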