A cheap and cheerful homebrew cluster with MySQL + EC2?
I've got a Java web service backed by MySQL + EC2 + EBS. For data integrity I've looked into DRBD, MySQL Cluster, etc., but wonder if there isn't a simpler solution. I don't need high availability (I can handle downtime).
There are only a few operations whose data I need to preserve -- creating an account, changing password, purchase receipt. The majority of the data I can afford to recover from a stale backup.
What I am thinking is that I could pipe selected INSERT/UPDATE commands to storage (S3 or SimpleDB, for instance) and, when required (when the db blows up), replay these commands from the point of the last backup. And wouldn't it be neat if this functionality were implemented in the JDBC driver itself?
Is this too silly to work, or am I missing another obvious and robust solution?
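To make the idea concrete, here is a minimal sketch of such a journal, assuming the critical statements are logged as literal SQL text together with the time they ran. `StatementJournal`, `record` and `statementsAfter` are hypothetical names, not part of JDBC or any MySQL driver, and a local file stands in for S3 or SimpleDB.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

// Hypothetical helper for illustration only; not a JDBC or MySQL API.
class StatementJournal {
    // Assumption: a local file stands in for off-box storage such as S3.
    private final Path file;

    StatementJournal(Path file) {
        this.file = file;
    }

    // Append one critical statement together with the time it was executed.
    // The SQL is Base64-encoded so embedded newlines cannot break the
    // one-entry-per-line format.
    synchronized void record(long epochMillis, String sql) throws IOException {
        String encoded = Base64.getEncoder()
                .encodeToString(sql.getBytes(StandardCharsets.UTF_8));
        String line = epochMillis + " " + encoded + System.lineSeparator();
        Files.write(file, line.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    // Return, in recorded order, every statement logged strictly after
    // the time of the last backup.
    List<String> statementsAfter(long backupEpochMillis) throws IOException {
        List<String> result = new ArrayList<>();
        if (!Files.exists(file)) {
            return result;
        }
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            if (line.isEmpty()) {
                continue;
            }
            int space = line.indexOf(' ');
            long timestamp = Long.parseLong(line.substring(0, space));
            if (timestamp > backupEpochMillis) {
                byte[] raw = Base64.getDecoder().decode(line.substring(space + 1));
                result.add(new String(raw, StandardCharsets.UTF_8));
            }
        }
        return result;
    }
}
```

After restoring the stale backup, the statements returned by `statementsAfter(backupTime)` would be executed in order over a normal JDBC connection. Note that this only works as written for literal SQL; with `PreparedStatement` the bound parameter values would have to be journaled as well.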
Comments (2)
Have you looked into moving your MySQL into Amazon Web Services as well? You can use Amazon Relational Database Service (RDS). Also see MySQL Enterprise Support.
You always have a window where total loss of a server and associated file storage will result in some amount of lost data.
When I ran a modestly busy SaaS solution in AWS, I had a MySQL Master running on a large instance and a MySQL Slave running on a small instance in a different availability zone. The replication lag was typically no more than 2 seconds, though a surge in traffic could take that up to a minute or two.
If you can't afford losing 5 minutes of data, I would suggest running a Master/Slave setup over rolling your own recovery mechanism. If you do roll your own, ensure the "stale" backups and the logged/journaled critical data are in a different availability zone. AWS has lost entire zones before.
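With a Master/Slave setup, the size of that loss window can be watched from the Java side. The following is only a rough sketch: `ReplicationLagCheck` and its methods are hypothetical names, and it assumes a MySQL version that still accepts `SHOW SLAVE STATUS` (newer releases use `SHOW REPLICA STATUS` with renamed columns).

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper for illustration only.
class ReplicationLagCheck {

    // Read the replica's lag in seconds from an open connection to the slave.
    // Returns -1 when no status row exists or the value is NULL, which MySQL
    // reports when replication is not running.
    static long readLagSeconds(Connection replica) throws SQLException {
        try (Statement st = replica.createStatement();
             ResultSet rs = st.executeQuery("SHOW SLAVE STATUS")) {
            if (!rs.next()) {
                return -1;
            }
            long lag = rs.getLong("Seconds_Behind_Master");
            return rs.wasNull() ? -1 : lag;
        }
    }

    // True when replication is running and the lag is within the amount of
    // data loss that can be tolerated, for example 300 seconds.
    static boolean withinTolerance(long lagSeconds, long maxLagSeconds) {
        return lagSeconds >= 0 && lagSeconds <= maxLagSeconds;
    }
}
```

A periodic job could call `readLagSeconds` against the slave and raise an alert whenever `withinTolerance(lag, 300)` is false, so a traffic surge that pushes the lag past the acceptable window does not go unnoticed.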