I have a MySQL database on a Windows server (the "Master"), which should store tens of GB (with InnoDB compression), with new records added on a daily basis.
For speed purposes, I would like to replicate the tables on remote computers (the "Slaves"), running Windows or Linux, which are doing data analysis (hence no concurrency issue locally).
I thought of using a SQLite database for this purpose, that would contain a synchronized snapshot of the Master.
So far, I have been using Dropbox (for teams) to sync CSV files, but delta sync would probably not work with huge database files.
I would therefore appreciate your input to determine the best way to perform the replication between these two different engines. In particular, it should be able to detect changes at a field level to limit the amount of data that needs to be transferred!
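To make the transfer requirement concrete, here is a rough sketch of the kind of incremental pull I have in mind (Python, using mysql-connector-python and the standard sqlite3 module; the table, columns and credentials are placeholders, and it assumes every table carries an updated_at timestamp maintained by the Master, so it only detects changes at row level rather than field level):

```python
# Rough sketch only -- table, columns and credentials below are placeholders.
import sqlite3
import mysql.connector  # pip install mysql-connector-python

MASTER = dict(host="master.example.com", user="reader",
              password="changeme", database="prod")

# Local SQLite snapshot on the analysis machine.
slave = sqlite3.connect("snapshot.sqlite")
slave.execute(
    "CREATE TABLE IF NOT EXISTS measurements "
    "(id INTEGER PRIMARY KEY, sensor TEXT, value REAL, updated_at TEXT)")

# High-water mark: only rows modified since the last pull are transferred.
last_sync = slave.execute(
    "SELECT COALESCE(MAX(updated_at), '1970-01-01') FROM measurements"
).fetchone()[0]

master = mysql.connector.connect(**MASTER)
cur = master.cursor()
cur.execute(
    "SELECT id, sensor, value, updated_at FROM measurements "
    "WHERE updated_at > %s", (last_sync,))

# Upsert the changed rows into the local snapshot.
slave.executemany(
    "INSERT OR REPLACE INTO measurements (id, sensor, value, updated_at) "
    "VALUES (?, ?, ?, ?)",
    ((i, s, v, str(ts)) for i, s, v, ts in cur))
slave.commit()
master.close()
```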
So far, I am aware of the following possibilities:
Tap - Tap - Tap -- Taps!
The guys at Heroku wrote a nice little Ruby script to help with just this scenario. I think you'll find it fairly well battle-tested and in general just a nice little database agnostic 'sync' tool.
https://github.com/ricardochimal/taps
http://adam.heroku.com/past/2009/2/11/taps_for_easy_database_transfers/
Beware the caveats
As with any good piece of magical software, there are caveats; it's best to be upfront about them:
Foreign key constraints get lost in the schema transfer
Tables without primary keys will be incredibly slow to transfer. This is because paging with large OFFSET values in the queries is inefficient (see the sketch after this list).
Multiple schemas are currently not supported
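To make the primary-key caveat concrete, here is a toy illustration (my own sketch, not taps' actual code) of the two paging strategies: with a primary key a tool can resume from the last key it saw, while without one it is stuck with ever-growing OFFSETs that force the server to re-scan and discard all earlier rows on every page.

```python
import sqlite3

# Toy table with an integer primary key (sketch, not taps' actual code).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO events (payload) VALUES (?)",
                 [("row %d" % i,) for i in range(10000)])

PAGE = 1000

# Keyset paging: uses the primary-key index, constant cost per page.
last_id = 0
while True:
    rows = conn.execute(
        "SELECT id, payload FROM events WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, PAGE)).fetchall()
    if not rows:
        break
    last_id = rows[-1][0]  # resume after the last key seen

# OFFSET paging: the only generic fallback without a primary key;
# each page scans and throws away all earlier rows, so deep pages get slow.
offset = 0
while True:
    rows = conn.execute(
        "SELECT id, payload FROM events ORDER BY id LIMIT ? OFFSET ?",
        (PAGE, offset)).fetchall()
    if not rows:
        break
    offset += PAGE
```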
Thanks,
Anuj
You can use the SQLyog Data Synchronization tool, which lets you specify a valid SQL WHERE clause for each table so that only the rows that fulfill the WHERE clause are synced.
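For what it's worth, the same WHERE-clause idea can also be scripted by hand, independent of the tool. Here is a rough sketch of pulling just a filtered subset into a local SQLite file (this only illustrates the principle; the connection details, table and filter below are made up and have nothing to do with SQLyog's own job format):

```python
import sqlite3
import mysql.connector  # pip install mysql-connector-python

# Placeholder filter -- plays the same role as SQLyog's per-table WHERE clause.
FILTER = "updated_at >= NOW() - INTERVAL 7 DAY"

master = mysql.connector.connect(host="master.example.com", user="reader",
                                 password="changeme", database="prod")
cur = master.cursor()
cur.execute("SELECT id, sensor, value FROM measurements WHERE " + FILTER)

# Refresh the local subset from scratch with only the matching rows.
slave = sqlite3.connect("subset.sqlite")
slave.execute("DROP TABLE IF EXISTS measurements")
slave.execute(
    "CREATE TABLE measurements (id INTEGER PRIMARY KEY, sensor TEXT, value REAL)")
slave.executemany("INSERT INTO measurements VALUES (?, ?, ?)", cur)
slave.commit()
master.close()
```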