Scaling up from 1 Web server and 1 DB server
We are a Web 2.0 company that built a hosted Content Management solution from the ground up using LAMP. In short, people log into our backend to manage their website content and then use our API to extract that content. This API gets plugged into templates that can be hosted anywhere on the interwebs.
Scaling for us has progressed as follows:
- Shared hosting (1and1)
- Dedicated single server hosting (Rackspace)
- 1 Web Server, 1 DB Server (Rackspace)
- 1 Backend Web Server, 1 API Web Server, 1 DB Server
- Memcache, caching, caching, caching.
The question is, what's next for us? Every time one of our sites is dugg or mentioned on a popular website, our API server gets crushed with too many connections. And every time our DB server gets overrun with queries, our Web server requests back up.
This is obviously the 'next problem' for any company like ours and I was wondering if you could point me in some directions.
I am currently attracted to the virtualization solutions (like EC2) but need some pointers on what to consider.
Comments (3)
What/where/how to scale is dependent on what your issues are. Since you've been hit a few times, and you know it's the API server, you need to identify what's actually causing the issue.
Is it DB lookup times?
Is it a volume of requests that the web server just can't handle, even though they're short-lived?
Do API requests take too long to process (independent of DB lookups, e.g., does the code itself take a while to run)?
Once you identify WHAT the problem is, you should have a pretty clear picture of what you need to do. If it's just volume of requests, and it's the API server, you just need more web servers (and code changes to allow horizontal scaling) or a beefier web server. If it's API requests taking too long, you're looking at code optimizations. There's never a 1-shot fix when it comes to scalability.
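One way to tell these cases apart is to time the database phase and the code phase of each request separately. The following is only a minimal sketch in Python (the thread's stack is LAMP, so this shows the idea rather than drop-in code); `RequestTimer`, `handle_api_request`, `run_query` and `render` are hypothetical names invented for the illustration.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class RequestTimer:
    """Accumulates wall-clock time per named phase of one request."""

    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record elapsed time even if the phase raised an exception.
            self.totals[name] += time.perf_counter() - start

    def slowest(self):
        # Name of the phase that consumed the most time, or None if empty.
        if not self.totals:
            return None
        return max(self.totals, key=self.totals.get)


def handle_api_request(timer, run_query, render):
    # run_query and render are hypothetical stand-ins for the real work.
    with timer.phase("db"):
        rows = run_query()
    with timer.phase("code"):
        return render(rows)
```

Logging `timer.totals` for each request during a traffic spike should show whether time is going to DB lookups or to the code itself. If both phases are fast and requests still queue up, that points to sheer request volume.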
The most common scaling issues have to do with slow (2-3 seconds) execution of the actual code for each request, which in turn leads to more web servers, which leads to more database interactions (for cross-server sessions, etc.), which leads to database performance issues. What you want is high-performance, server-independent code backed by memcache. I actually prefer a wrapper around memcache, so the application doesn't know or care where it gets the data from, just that it gets it; the translation layer handles the DB/memcache lookups as well as populating memcache.
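The wrapper idea above might look roughly like the sketch below. It is written in Python for illustration only. `DictCache` is an in-memory stand-in invented here so the example runs on its own; a real memcache client would take its place. `ContentStore` and `load_from_db` are likewise hypothetical names.

```python
class DictCache:
    """In-memory stand-in for a memcache client (hypothetical; get/set only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class ContentStore:
    """Read-through wrapper: callers never see whether data came from cache or DB."""

    def __init__(self, cache, load_from_db):
        self.cache = cache
        self.load_from_db = load_from_db  # hypothetical callable: key -> value or None

    def get(self, key):
        value = self.cache.get(key)
        if value is None:
            # Cache miss: fall back to the database and populate the cache.
            value = self.load_from_db(key)
            if value is not None:
                self.cache.set(key, value)
        return value
```

The application only ever calls `store.get(key)`. Because the cache is external to the web server process, any number of web servers can share it, which is what keeps the code server-independent. Cache invalidation on writes is deliberately left out of this sketch.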
It really depends on whether your bottleneck is reads or writes. Scaling writes is much harder than scaling reads.
It also depends on how much data you have in the database.
If your database is small but cannot cope with the read load, you can deploy enough RAM that the whole database fits in memory. If it still cannot cope, you can add read replicas, possibly on the same boxes as your web servers. This will give you good read scalability - the number of slaves one MySQL master can support is quite high and depends chiefly on the write workload.
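With read replicas in place, the application has to decide where each statement goes. Here is a deliberately naive sketch in Python, assuming reads can be recognized by a leading `SELECT`; `ReadWriteRouter` and the server names are hypothetical, and plain strings stand in for real connections.

```python
import itertools


class ReadWriteRouter:
    """Sends reads to replicas in round-robin order and all writes to the master."""

    def __init__(self, master, replicas):
        self.master = master
        # Fall back to the master if no replicas are configured.
        self._replicas = itertools.cycle(replicas or [master])

    def pick(self, sql):
        # Naive classification: statements starting with SELECT are reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.master


router = ReadWriteRouter("master", ["replica1", "replica2"])
target = router.pick("SELECT title FROM pages WHERE id = 1")  # goes to a replica
```

Standard MySQL replication is asynchronous, so a replica can lag behind the master. A read that must see a write the same user just made may need to go to the master instead, which this sketch does not handle.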
If you need to scale writes, that's a totally different game. To do that you'll need to split your data out, either horizontally (partitioning / sharding) or vertically (functional partitioning etc) so that you can spread the workload over several write servers which do not need to do each others' work.
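As an illustration of the horizontal option, the sketch below maps each customer to one write server by hashing a key. It assumes the data can be partitioned cleanly by a customer ID, which the question does not confirm; `shard_for` and the shard names are hypothetical.

```python
import hashlib


def shard_for(customer_id, shards):
    """Pick the shard that owns a customer's data.

    Uses a stable hash so every web server computes the same answer.
    """
    digest = hashlib.md5(str(customer_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2"]  # hypothetical server names
owner = shard_for(42, SHARDS)
```

Plain modulo hashing is the simplest scheme, but changing the number of shards remaps most keys. A lookup table from customer to shard, or consistent hashing, is a common alternative when shards are expected to be added later.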
I'm not sure what EC2 can do for you. It essentially offers slow, high-latency machines with nonpersistent discs and low IO performance, backed by a more-or-less nonexistent SLA. I guess it might be useful in your case, as you can provision them relatively quickly - provided you're just using them as read replicas and you don't have too much data (remember they have nonpersistent discs and sucky IO).
What level of scaling are you looking for? Is it a stop-gap solution, e.g. scaling vertically? If it is a more strategic scaling project, does your current architecture support scaling horizontally?