How to automatically move rows from one table to another?
I am collecting readings from several thousand sensors and storing them in a MySQL database. There are several hundred inserts per second. To improve the insert performance I am storing the values initially into a MEMORY buffer table. Once a minute I run a stored procedure which moves the inserted rows from the memory buffer to a permanent table.
Basically I would like to do the following in my stored procedure to move the rows from the temporary buffer:
INSERT INTO data SELECT * FROM data_buffer;
DELETE FROM data_buffer;
Unfortunately the previous is not usable, because the data collection processes insert additional rows into "data_buffer" between the INSERT and DELETE above. Thus those rows get deleted without ever being inserted into the "data" table.
How can I make the operation atomic, or make the DELETE statement delete only the rows that were SELECTed and INSERTed by the preceding statement?
I would prefer doing this in a standard way which works on different database engines if possible.
I would prefer not adding any additional "id" columns because of performance overhead and storage requirements.
I wish standard SQL had a SELECT_AND_DELETE or MOVE statement, or something similar...
Comments (6)
I believe this will work, but it will block until the insert is done.
A possible way to avoid all those problems, and also to stay fast, would be to use two data_buffer tables (let's call them data_buffer1 and data_buffer2); while the collection processes insert into data_buffer1, you can do the INSERT and DELETE on data_buffer2; then you switch, so collected data goes into data_buffer2, while data is inserted+deleted from data_buffer1 into data.
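A sketch of what one drain cycle could look like under this scheme (the flag telling the collectors which buffer to write to has to live in the collector application; table names follow the answer):

```sql
-- Minute N: collectors are writing to data_buffer1, so data_buffer2
-- is quiet and can be drained safely.
INSERT INTO data SELECT * FROM data_buffer2;
DELETE FROM data_buffer2;

-- Now flip the application-level flag so collectors write to
-- data_buffer2; on minute N+1, drain data_buffer1 the same way.
```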
How about having a row id: get the max value before the insert, do the insert, and then delete records where id <= max(id)?
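As a sketch (this does require adding an AUTO_INCREMENT id column to the buffer, which the question hoped to avoid; column names are illustrative):

```sql
-- Remember the highest id present before we start.
SET @cutoff = (SELECT COALESCE(MAX(id), 0) FROM data_buffer);

-- Move only rows up to that id; later arrivals are left alone.
INSERT INTO data (sensor_id, ts, value)
  SELECT sensor_id, ts, value FROM data_buffer WHERE id <= @cutoff;
DELETE FROM data_buffer WHERE id <= @cutoff;
```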
This is a similar solution to @ammoQ's answer. The difference is that instead of having the INSERTing process figure out which table to write to, you can transparently swap the tables in the scheduled procedure.
Use RENAME in the scheduled procedure to swap tables:
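The original code block did not survive here; a sketch of the idea, assuming a spare empty table data_buffer_new with the same definition was created beforehand (names are illustrative):

```sql
-- Atomically swap the live buffer for the empty spare; collectors
-- keep inserting into `data_buffer` without interruption.
RENAME TABLE data_buffer TO data_buffer_old,
             data_buffer_new TO data_buffer;

-- Only this procedure touches data_buffer_old now, so no rows can
-- slip in between the INSERT and the DELETE.
INSERT INTO data SELECT * FROM data_buffer_old;
DELETE FROM data_buffer_old;

-- Put the emptied table back in reserve for the next run.
RENAME TABLE data_buffer_old TO data_buffer_new;
```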
This works because the RENAME TABLE statement swaps the tables atomically, so the INSERTing processes will not fail with "table not found". This is MySQL specific though.
I assume the tables are identical, with the same columns and primary key(s)? If that is the case, you could nest the SELECT inside a WHERE clause... something like this:
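The code block is missing here; presumably something along these lines, assuming (sensor_id, ts) is the shared primary key (names are illustrative):

```sql
INSERT INTO data SELECT * FROM data_buffer;

-- Delete only buffer rows whose key is now present in `data`;
-- rows that arrived after the INSERT above are not matched and survive.
DELETE FROM data_buffer
 WHERE (sensor_id, ts) IN (SELECT sensor_id, ts FROM data);
```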
This is a MySQL specific solution. You can use locking to prevent the INSERTing processes from adding new rows while you are moving rows.
The procedure which moves the rows should be as follows:
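The code block is missing here; presumably something like the following (note that MySQL does not allow LOCK TABLES inside a stored program, so these statements would be issued by the scheduled job directly):

```sql
-- Take write locks so no collector can insert while rows move.
LOCK TABLES data WRITE, data_buffer WRITE;
INSERT INTO data SELECT * FROM data_buffer;
DELETE FROM data_buffer;
UNLOCK TABLES;
```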
The code which INSERTs new rows in the buffer should be changed as follows:
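Again the code block did not survive; presumably the collectors would wrap their inserts like this (column names and values are illustrative; a plain INSERT would also simply block on the mover's write lock):

```sql
LOCK TABLES data_buffer WRITE;
INSERT INTO data_buffer (sensor_id, ts, value)
  VALUES (@sensor_id, @ts, @value);
UNLOCK TABLES;
```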
The INSERT process will obviously block while the lock is in place.