MySQL replication - one website, multiple servers, different continents
Consider a reasonably large website (2M+ pageviews per month, lots of users) with 2 frontend servers: one in the US and one in Europe. Two dedicated URLs bring visitors to one of the servers, one in French, the other in English. Both sites share exactly the same data.
What would be the most cost-effective solution? (DB used at my company: MySQL)
1/ A single master server on Amazon EC2 (US), and slaves on the frontend servers?
Advantages: no master-master replication, meaning no risk of data conflicts with autoincrement values, duplicates on unique columns, etc.
Drawbacks: The lag! Won't writes to a master in the US lag too much when you are in Europe?
Another drawback could be the lack of a quick-and-dirty solution in case the master dies. And what about having the slaves on the same servers as the frontends?
2/ Two Amazon EC2 instances, one in the US, one in Europe, acting as master-master replication servers. Plus a slave on each of the two frontends?
Advantages: Speed, and security of data. Of course there is no load balancer, but hacking something together to switch to the other master seems pretty trivial.
Drawbacks: Price. And the risk of corruption in the DB.
3/ Any other solution?
As it is my first time working with servers on 2 continents, I would really appreciate learning from your experience in that area, including MySQL or not, including EC2 or not.
Thanks
Marshall
Comments (4)
As usual, what I'm about to say depends on your app, how it uses the database, etc. You need to ask yourself:
I'm assuming the French server is in Europe, while the English server is in the US? If you can partition your data so that the French site uses one DB and the English site uses the other, you're better off. That holds even if both sites access both DBs, since you don't have to worry about collisions. You can even run two MySQL instances on each master server and do multi-master replication for both.
If you can't partition, I'd probably go with #2, but I'd designate one of the machines as the 'true' master and send all the writes to it to help avoid data clobber. This way it's easy to switch in a pinch.
If you're cost sensitive and you're going to run replicas on your frontend servers anyway, just run the master databases on the frontend servers. You can always move them off later. Replicas can often have higher CPU/IO costs than masters taking the same read load: they have to execute their writes serially, which can really screw things up.
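A minimal sketch of that "one true master" routing, assuming the application picks a host per statement. The `Router` class and the host names are hypothetical, not something from the original setup, and the verb check is deliberately naive.

```python
from dataclasses import dataclass


@dataclass
class Router:
    """Hypothetical helper: choose a database host for each SQL statement."""
    primary: str        # the one 'true' master that receives every write
    standby: str        # the other master, not written to in normal operation
    local_replica: str  # read replica close to this frontend

    def host_for(self, sql: str) -> str:
        # Reads go to the nearby replica; anything else goes to the primary.
        parts = sql.split(None, 1)
        verb = parts[0].upper() if parts else ""
        return self.local_replica if verb in ("SELECT", "SHOW") else self.primary

    def fail_over(self) -> None:
        # Switching in a pinch: the standby becomes the write target.
        self.primary, self.standby = self.standby, self.primary
```

For example, `Router("us-master", "eu-master", "eu-replica").host_for("SELECT 1")` returns `"eu-replica"`, while any `INSERT` or `UPDATE` is sent to `"us-master"` until `fail_over()` is called.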
Also, don't use m1.small instances for your DB. Or at least keep an eye on your performance. m1.smalls are significantly underpowered, and if you watch top, you'll notice a significant percentage of your CPU time being stolen by the hypervisor. I recommend c1.mediums.
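If you would rather log the stolen CPU time than watch top by hand, here is a rough sketch, assuming a Linux guest where the aggregate `cpu` line of `/proc/stat` carries the steal counter as its eighth numeric field. The function names are made up for illustration.

```python
def read_cpu_line(path: str = "/proc/stat") -> str:
    """Return the aggregate 'cpu' line, which is the first line of /proc/stat."""
    with open(path) as f:
        return f.readline()


def steal_percent(before: str, after: str) -> float:
    """Share of CPU time stolen by the hypervisor between two samples.

    The fields after the 'cpu' label are:
    user, nice, system, idle, iowait, irq, softirq, steal, ...
    """
    a = [int(x) for x in before.split()[1:9]]
    b = [int(x) for x in after.split()[1:9]]
    deltas = [y - x for x, y in zip(a, b)]
    total = sum(deltas)
    return 100.0 * deltas[7] / total if total else 0.0
```

Take one sample with `read_cpu_line()`, sleep for a few seconds, take another, and pass both to `steal_percent`.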
Don't use master-master replication, ever. There is no mechanism for resolving conflicts. If you try to write to both masters at the same time (or write to one master before it has caught up with changes you previously wrote to the other one), then you will end up with a broken replication scenario. The service won't stop, they'll just drift further and further apart making reconciliation impossible.
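To make the drift concrete, here is a toy simulation, not real MySQL: two dictionaries stand in for the two masters. Each applies its own write first and then replays the other's write when it arrives. The key and values are invented for illustration.

```python
def simulate_conflict():
    """Toy model of two masters updating the same row at the same time."""
    us = {"user:42:email": "old@example.com"}
    eu = dict(us)  # both masters start out identical

    us_write = ("user:42:email", "from-us@example.com")
    eu_write = ("user:42:email", "from-eu@example.com")

    # Each master applies its local write first...
    us[us_write[0]] = us_write[1]
    eu[eu_write[0]] = eu_write[1]
    # ...then replays the other master's write once it arrives.
    us[eu_write[0]] = eu_write[1]
    eu[us_write[0]] = us_write[1]
    return us, eu


us, eu = simulate_conflict()
diverged = us != eu  # no error was raised, yet the two copies now disagree
```

Nothing fails here. Each side simply ends up holding the other side's value, which is why this kind of drift goes unnoticed without monitoring.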
Don't use MySQL replication without some well-designed monitoring to check that it's working OK. Don't assume that because you've configured it correctly initially, it'll keep working or stay in sync.
DO have a well-documented, well-tested procedure for recovering slaves from being out of sync or stopped. Have a similarly documented procedure for installing a new slave from scratch.
Your application may need sufficient intelligence to know that a slave is out of sync or stopped, and that it should not be used, if you care about correct or up-to-date data. You'll need some kind of feedback from your monitoring to do this.
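One possible shape for that feedback, as a sketch: assume the monitoring hands the application one row of `SHOW SLAVE STATUS` as a dictionary. The function name and the 30-second threshold are assumptions for illustration. Note that this only catches stopped or lagging replication, not silent data drift.

```python
from typing import Mapping, Optional


def replica_usable(status: Optional[Mapping[str, object]], max_lag_s: int = 30) -> bool:
    """Decide whether reads may be sent to a replica.

    `status` is one row of SHOW SLAVE STATUS as a dict, or None when the
    server is not configured as a replica. `max_lag_s` is an assumed threshold.
    """
    if not status:
        return False
    # Both replication threads must be running.
    if status.get("Slave_IO_Running") != "Yes":
        return False
    if status.get("Slave_SQL_Running") != "Yes":
        return False
    # Seconds_Behind_Master is NULL (None) when replication is broken.
    lag = status.get("Seconds_Behind_Master")
    return lag is not None and int(lag) <= max_lag_s
```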
If you have a slave in, say, the US while your master is in Europe, that would normally give you the amount of latency you'd expect, i.e. something on the order of 150ms more than if they were co-located.
In MySQL, the slave does not start a query until the master finishes it, so it will always be behind by the length of time an update takes.
Also, the slave is single-threaded, so a single "hard" update query will delay all subsequent ones.
If you're pushing your master hard on multithreaded write-load, assuming your slaves have identical hardware, it is very unlikely that they'll be able to keep up.
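A back-of-the-envelope model of why the slave falls behind, assuming a fully saturated master whose writes parallelize perfectly and a slave that replays them one at a time on identical hardware. The function and its parameters are illustrative only.

```python
def replica_backlog_seconds(writers: int, write_cost_s: float, duration_s: float) -> float:
    """Rough estimate of how far a serial replica trails after `duration_s`.

    Assumes the master runs `writers` threads in parallel, each write costs
    `write_cost_s` seconds, and the replica replays writes one at a time.
    """
    writes_done = writers * duration_s / write_cost_s  # writes the master completed
    replay_time = writes_done * write_cost_s           # serial time to replay them
    return max(0.0, replay_time - duration_s)
```

Under these assumptions, 4 writer threads running flat out for a minute leave the slave about three minutes of replay behind, and the gap keeps growing for as long as the load lasts.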
We are looking at a similar scenario, after Amazon's East Coast region was completely cut off from the net twice this week, meaning that not even replicating across multiple regions and using RDS instances kept us available.
But RDS does not allow crossing from East to West or even into Europe.
We are now reviewing a master-master approach across East and West, or even Europe, with one master acting as a failover only, and failover via dnsmadeeasy, which responds extremely fast.
Advantage: quick and reliable failover, short downtime, no complex management of the failover function.
Disadvantage: one extra system running without being used, but compared to using RDS that's not more expensive.
RDS is nicely managed by Amazon, including point-in-time recovery and so on, and all of that is lost by switching away from it. But the fact that it is limited to replication within only one area, and that area can be completely cut off, makes it problematic. As an alternative to RDS backup we are looking at the Zmanda open source tools to take care of backup management. Not yet tested, but based on all our messing around with failover, databases, hardware and so on, this looks like the simplest and therefore most promising approach for high availability.
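The decision logic behind such a failover can be sketched as follows. This is a hypothetical monitor that switches to the standby master after a few consecutive failed health probes. The DNS update itself is left out because the provider's API is not described here, and the probe function, host names and threshold of 3 are assumptions.

```python
from typing import Callable


class FailoverMonitor:
    """Hypothetical monitor: promote the standby after repeated probe failures."""

    def __init__(self, probe: Callable[[str], bool], active: str, standby: str,
                 max_failures: int = 3):
        self.probe = probe              # returns True when the host is healthy
        self.active = active
        self.standby = standby
        self.max_failures = max_failures
        self.failures = 0

    def tick(self) -> str:
        """Probe the active master once and return the host that should be used."""
        if self.probe(self.active):
            self.failures = 0
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                # Swap roles; a real setup would update the DNS record here.
                self.active, self.standby = self.standby, self.active
                self.failures = 0
        return self.active
```

Requiring several consecutive failures is a design choice that stops a single dropped probe from triggering a switch.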
This question is old, but the solution exists now: Galera. It does MySQL (InnoDB) replication, and works well with WANs, too. http://codership.com/