In MySQL, how to insert only when the row does not exist and update only when the existing version is lower

Posted on 2024-09-18 15:06:26

I am looking for a way to only insert when the row does not exist in MySQL, and update when the row exists AND the version of the existing row is less than (or equal to) the version of the new row.

For example, the table is defined as:

CREATE TABLE documents (
  id VARCHAR(64) NOT NULL,
  version BIGINT UNSIGNED NOT NULL,
  data BLOB,
  PRIMARY KEY (id)
);

And contains the following data:

id  version  data
----------------------------
1   3        first data set
2   2        second data set
3   5        third data set

And I want to merge the following table (UPDATE: id column is unique):

id  version  data
----------------------------
1   4        updated 1st
3   3        updated 2nd
4   1        new 4th

And it should produce the following (UPDATE: see how only 1 is updated and 4 is inserted):

id  version  data
----------------------------
1   4        updated 1st
2   2        second data set
3   5        third data set
4   1        new 4th

I've looked at the INSERT ... ON DUPLICATE KEY UPDATE statement, but it doesn't allow a WHERE clause. I can't really use REPLACE either, because it doesn't allow WHERE. Is this even possible with a single MySQL statement?

I am using Java and am trying to insert or update many records at once using a PreparedStatement with batching (addBatch). Any help would be appreciated.

UPDATE: Is there any way to use this query with the PreparedStatement in Java? I have a List of Document objects with id, version, and data.

够钟 2024-09-25 15:06:26

EDIT: In my earlier answer I suggested that a unique constraint is needed on (id, version), but actually this is not necessary. Your unique constraint on id only is enough for the solution to work.

You should be able to use the REPLACE command as follows:

REPLACE INTO main 
SELECT  IFNULL(m.id, s.id) id, 
        IFNULL(m.version, s.version) version, 
        IFNULL(m.data, s.data) data
FROM       secondary s
LEFT JOIN  main m ON (m.id = s.id AND m.version > s.version);

Test case:

CREATE TABLE main ( 
   id int, 
   version int, 
   data varchar(50), 
   PRIMARY KEY (id)
);

CREATE TABLE secondary (id int, version int, data varchar(50));

INSERT INTO main VALUES (1, 3, 'first data set');
INSERT INTO main VALUES (2, 2, 'second data set');
INSERT INTO main VALUES (3, 5, 'third data set');

INSERT INTO secondary VALUES (1, 4, 'updated 1st');
INSERT INTO secondary VALUES (3, 3, 'updated 2nd');
INSERT INTO secondary VALUES (4, 1, 'new 4th');

Result:

SELECT * FROM main;
+----+---------+-----------------+
| id | version | data            |
+----+---------+-----------------+
|  1 |       4 | updated 1st     |
|  2 |       2 | second data set |
|  3 |       5 | third data set  |
|  4 |       1 | new 4th         |
+----+---------+-----------------+
4 rows in set (0.00 sec)

As a side-note, to help you understand what's happening in that REPLACE command, note the following:

SELECT     s.id s_id, s.version s_version, s.data s_data,
           m.id m_id, m.version m_version, m.data m_data
FROM       secondary s
LEFT JOIN  main m ON (m.id = s.id AND m.version > s.version);

+------+-----------+-------------+------+-----------+----------------+
| s_id | s_version | s_data      | m_id | m_version | m_data         |
+------+-----------+-------------+------+-----------+----------------+
|    1 |         4 | updated 1st | NULL |      NULL | NULL           |
|    3 |         3 | updated 2nd |    3 |         5 | third data set |
|    4 |         1 | new 4th     | NULL |      NULL | NULL           |
+------+-----------+-------------+------+-----------+----------------+
3 rows in set (0.00 sec)

The IFNULL() calls then substitute the main table's values for the secondary row whenever the main table holds a newer version, as in the case of id=3, version=5. The following query therefore returns:

SELECT  IFNULL(m.id, s.id) id, 
        IFNULL(m.version, s.version) version, 
        IFNULL(m.data, s.data) data
FROM       secondary s
LEFT JOIN  main m ON (m.id = s.id AND m.version > s.version);

+------+---------+----------------+
| id   | version | data           |
+------+---------+----------------+
|    1 |       4 | updated 1st    |
|    3 |       5 | third data set |
|    4 |       1 | new 4th        |
+------+---------+----------------+
3 rows in set (0.00 sec)

The above result set contains only records from the secondary table, but if any of these records happen to have a newer version in the main table, then the row is overwritten by the data from the main table. This is the input that we are feeding the REPLACE command.
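
Regarding the question's update about Java, here is a minimal sketch of how this REPLACE approach might be driven from JDBC. It assumes the incoming rows are first loaded into `secondary` with a batched PreparedStatement, and that `secondary` is created as a TEMPORARY table per connection; that choice, along with the `StagedReplace` class, the `Doc` holder and the `merge` method, is hypothetical and not part of the original answer. The column types follow the test case above.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

class StagedReplace {
    // Hypothetical holder for one incoming row (types follow the test case above).
    static final class Doc {
        final int id;
        final int version;
        final String data;

        Doc(int id, int version, String data) {
            this.id = id;
            this.version = version;
            this.data = data;
        }
    }

    // Assumption: the staging table is a per-connection TEMPORARY table.
    static final String CREATE_STAGING =
        "CREATE TEMPORARY TABLE IF NOT EXISTS secondary "
      + "(id INT, version INT, data VARCHAR(50))";
    static final String CLEAR_STAGING = "DELETE FROM secondary";
    static final String LOAD_STAGING =
        "INSERT INTO secondary (id, version, data) VALUES (?, ?, ?)";
    // Same statement as in the answer, written on one line.
    static final String MERGE =
        "REPLACE INTO main "
      + "SELECT IFNULL(m.id, s.id), IFNULL(m.version, s.version), IFNULL(m.data, s.data) "
      + "FROM secondary s "
      + "LEFT JOIN main m ON (m.id = s.id AND m.version > s.version)";

    static void merge(Connection conn, List<Doc> docs) throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (Statement st = conn.createStatement()) {
            // Create and empty the staging table before preparing the insert.
            st.executeUpdate(CREATE_STAGING);
            st.executeUpdate(CLEAR_STAGING);
            try (PreparedStatement load = conn.prepareStatement(LOAD_STAGING)) {
                for (Doc d : docs) {
                    load.setInt(1, d.id);
                    load.setInt(2, d.version);
                    load.setString(3, d.data);
                    load.addBatch();
                }
                load.executeBatch();
            }
            // Merge the staged rows into main in a single statement.
            st.executeUpdate(MERGE);
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

A call such as `StagedReplace.merge(conn, docs)` would then stage the whole list and merge it in one transaction. Note that REPLACE deletes and re-inserts rows that collide on the key, so this sketch is only appropriate where that is acceptable.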

春花秋月 2024-09-25 15:06:26

I think INSERT ON DUPLICATE KEY UPDATE is your best bet. You can use it like this:

INSERT INTO table1
SELECT * FROM table2
ON DUPLICATE KEY UPDATE
  table1.data    = IF(table1.version > table2.version, table1.data, table2.data),
  table1.version = IF(table1.version > table2.version, table1.version, table2.version);

The syntax is untested, but I believe the idea should work.
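
For the Java side of the question, the same idea can probably be used without a second table by binding each Document directly. The following is a rough sketch, not a tested solution: it assumes the `documents` table from the question, uses MySQL's `VALUES()` function to refer to the incoming row (newer MySQL versions deprecate it in favor of a row alias), and the `DocumentUpsert` class, `Document` holder and `upsertAll` method are hypothetical names. `data` is assigned before `version` on purpose, because MySQL evaluates the assignments left to right and both IF() tests need the previously stored version.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class DocumentUpsert {
    // Hypothetical value class standing in for the asker's Document objects.
    static final class Document {
        final String id;
        final long version;
        final byte[] data;

        Document(String id, long version, byte[] data) {
            this.id = id;
            this.version = version;
            this.data = data;
        }
    }

    // Keeps the stored row only when its version is strictly greater than
    // the incoming one; otherwise the incoming data and version win.
    // data is assigned first so both IF() tests see the old stored version.
    static final String UPSERT_SQL =
        "INSERT INTO documents (id, version, data) VALUES (?, ?, ?) "
      + "ON DUPLICATE KEY UPDATE "
      + "data = IF(version > VALUES(version), data, VALUES(data)), "
      + "version = IF(version > VALUES(version), version, VALUES(version))";

    static int[] upsertAll(Connection conn, List<Document> docs) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(UPSERT_SQL)) {
            for (Document d : docs) {
                ps.setString(1, d.id);
                // BIGINT UNSIGNED values above Long.MAX_VALUE would not fit in a long.
                ps.setLong(2, d.version);
                ps.setBytes(3, d.data);
                ps.addBatch();
            }
            // One update count per batched row.
            return ps.executeBatch();
        }
    }
}
```

Calling `DocumentUpsert.upsertAll(conn, docs)` with the question's three incoming rows should, under these assumptions, update id 1, leave id 3 untouched and insert id 4.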
