How to achieve high availability?

Posted 2024-08-25 14:27:48 · 386 words · 5 views · 0 comments

My boss wants a system that can survive a continent-wide catastrophic event. He wants two servers in the US and two servers in Asia (one login server and one worker server on each continent).

  1. If an earthquake breaks the connection between the two continents, each side should keep working on its own. When the connection is restored, the two sides should sync with each other and return to normal.
  2. External cloud systems are not allowed, as he has no confidence in them.
  3. The system should be scalable, meaning that adding new servers should be easy to configure.
  4. The servers should be load balanced.
  5. The connection between the servers should be very secure (encrypted and sent over SSL; SSL itself takes care of the encryption).
  6. The system should let one and only one user be logged in with one account. (Beware of the latency between continents: two users sharing an account may reach both login servers at the same time.)

Please help. I'm at my wits' end. Thank you in advance.


Comments (4)

内心荒芜 2024-09-01 14:27:48

I imagine that these requirements (if properly analysed) are essentially incompatible, in that they cannot all be satisfied under the CAP theorem.

If you have several datacentres, even if they are close by, partitions WILL happen. If a partition happens, either availability OR consistency MUST be lost, because either:

  • you have a pre-determined "master", which keeps working, and other "slave" DCs, which fail (or go read-only). This keeps consistency at the expense of availability.
  • OR you lose consistency for the duration of the partition (which means that operations that depend on immediate consistency are also unavailable).

This is incompatible with your requirements, as far as I can see. What your boss wants is clearly impossible. He needs to understand the CAP theorem.

Now, in YOUR application's case, you may decide that you can bend the rules and redefine what consistency or availability means, for convenience, and have a system which degrades into an inconsistent but temporarily acceptable state.

You probably want to get product management to have a look at the business case for these requirements. Dropping some of them is probably OK. Consistency is a good requirement to keep, as it makes things behave as people expect; keeping it means dropping availability or partition tolerance. Keeping consistency is definitely easier from an engineering perspective.
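To make the trade-off concrete for requirement 6 (one login per account), here is a minimal toy sketch, not a real design. Everything in it (`LoginServer`, `make_pair`, `heal`, the "CP"/"AP" mode flag, the earliest-session-wins rule) is a hypothetical name or assumption made up for illustration; a real system would need durable storage, trustworthy clocks or version vectors, and an actual network.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LoginServer:
    name: str
    mode: str  # "CP": refuse logins while partitioned; "AP": accept them locally
    sessions: dict = field(default_factory=dict)  # account -> (timestamp, user)
    peer: Optional["LoginServer"] = None
    partitioned: bool = False

    def login(self, account, user, timestamp):
        if self.partitioned:
            if self.mode == "CP":
                return False  # stays consistent, but this site is unavailable
            # AP: only local state can be checked, so duplicates may slip through
            if account in self.sessions:
                return False
            self.sessions[account] = (timestamp, user)
            return True
        # Link is up: check both sites, then record the session on both
        if account in self.sessions or account in self.peer.sessions:
            return False
        self.sessions[account] = self.peer.sessions[account] = (timestamp, user)
        return True


def make_pair(mode):
    us, asia = LoginServer("us", mode), LoginServer("asia", mode)
    us.peer, asia.peer = asia, us
    return us, asia


def heal(a, b):
    """Reconnect both sites; keep the earliest session per account (an assumed
    policy) and return the users whose sessions were evicted."""
    a.partitioned = b.partitioned = False
    evicted = []
    for account in set(a.sessions) | set(b.sessions):
        entries = [s.sessions[account] for s in (a, b) if account in s.sessions]
        winner = min(entries)  # smallest timestamp wins
        evicted += [entry[1] for entry in entries if entry != winner]
        a.sessions[account] = b.sessions[account] = winner
    return evicted


# CP: during the partition nobody can log in anywhere
us, asia = make_pair("CP")
us.partitioned = asia.partitioned = True
cp_result = us.login("acct", "alice", 1)

# AP: both continents accept a login for the same account, fixed up on heal
us, asia = make_pair("AP")
us.partitioned = asia.partitioned = True
ap_results = (us.login("acct", "alice", 1), asia.login("acct", "bob", 2))
evicted = heal(us, asia)
```

In "CP" mode the single-login rule always holds but neither continent "works alone" during the outage; in "AP" mode both continents keep working, but the same account ends up logged in twice until `heal` runs and kicks one user out. That is the choice the boss has to make.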

究竟谁懂我的在乎 2024-09-01 14:27:48

This is another one of those things where employers tend not to understand the benefits of using an off-the-shelf solution. If you, as a programmer, don't really even know where to start with this, then rolling your own is probably going to be a huge money and time sink. There's nothing wrong with not knowing this stuff, either; high-availability, failsafe networking that takes into consideration catastrophic failure of critical components is a large problem domain that many people pour a lot of effort and money into. Why not take advantage of what providers have to offer?

Give talking to your boss about using existing cloud providers one more try.

倦话 2024-09-01 14:27:48

You could contact one of the solid and experienced hosting providers (we use Rackspace) that have data centers in different regions worldwide and get their recommendations based on your requirements.

青衫负雪 2024-09-01 14:27:48

This will require expert assistance, a large budget, and serious planning.

A better option would be to contact a reputable provider with a global footprint, select a premium solution with a solid SLA backing up their service, and let them tailor a solution that comes close to your needs.

Just realize that even the likes of Google, Yahoo, Microsoft and Amazon (to name a few) have at one time or another had some issue that rendered segments of their systems offline to certain users.
