Slow SSL in EC2
We've deployed our Rails app to EC2. In our setup, we have two proxies on small instances behind round-robin DNS. These run nginx as a load balancer for a dynamically growing and shrinking farm of web servers. Each web server also runs nginx, in front of a cluster of Mongrels; this nginx takes care of static content and load-balances the Mongrels.
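(For concreteness, here is a minimal sketch of what the two nginx tiers in a setup like this might look like; all IPs, ports, and paths below are made up for illustration.)

    # Proxy tier (illustrative): terminates SSL, balances across the web farm
    upstream web_farm {
        server 10.0.0.11:80;
        server 10.0.0.12:80;
    }

    server {
        listen 443 ssl;
        ssl_certificate     /etc/nginx/certs/site.crt;
        ssl_certificate_key /etc/nginx/certs/site.key;

        location / {
            proxy_pass http://web_farm;
        }
    }

    # Web tier (on each web server): serves static files, balances the Mongrels
    upstream mongrels {
        server 127.0.0.1:8000;
        server 127.0.0.1:8001;
        server 127.0.0.1:8002;
    }

    server {
        listen 80;
        root /var/www/app/public;

        location / {
            try_files $uri @mongrels;   # serve static content directly when it exists
        }

        location @mongrels {
            proxy_pass http://mongrels;   # otherwise hand off to a Mongrel
        }
    }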
Anyway, our traffic is by and large HTTPS, and the two proxies take care of SSL. I've noticed that network throughput on those instances caps out at only 60 Mbps or so. By contrast, in testing I am consistently able to get 700+ Mbps on a small instance over regular HTTP. In fact, that's the same as what I can get on a large instance, and similar to what the Right Scale guys got in their testing. (Amazon says a small gets "moderate" network I/O while a large gets "high". If I had to speculate, I'd say this is just their way of saying that more small instances share one network card per physical box. I'm not sure whether it means a large gets a dedicated network interface, but I doubt it.)
In testing, I was able to get about 250 Mbps of SSL throughput out of a large instance. That says to me that the CPU, or some other resource, is the bottleneck. However, our monitoring graphs don't show the CPU on our proxies being particularly busy.
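(One thing worth ruling out while chasing the CPU theory is which cipher suite nginx is actually negotiating, since some ciphers cost several times more CPU per byte than others. A hedged sketch of steering that in the proxy's nginx config; the cipher string is purely illustrative, not a security recommendation:)

    # Prefer computationally cheaper ciphers (illustrative; weigh against your security needs)
    ssl_ciphers                RC4-SHA:AES128-SHA:HIGH:!aNULL:!MD5;
    ssl_prefer_server_ciphers  on;

(You can confirm what actually gets negotiated with openssl s_client -connect yourhost:443.)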
My questions are:
- Is my instinct right that SSL is slower because of CPU, and our monitoring graphs are just wrong? Or could some other resource be the limiting factor?
- Should we just take the extra cost and put the proxies on high-CPU instances? Or would it be better to just add more small instances?
- Should we offload SSL termination to the web servers? This introduces one more problem, though: how do we get the client IP address in our application? Right now our proxy sets it in the X-FORWARDED-FOR header (the nginx side of that is sketched just below this list), but obviously that wouldn't be possible if the proxy weren't decrypting the SSL.
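(For reference, the header-forwarding part on the SSL-terminating proxy is just a couple of proxy_set_header lines; the upstream name here is made up:)

    # On the SSL-terminating proxy (illustrative): forward the real client IP
    location / {
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Real-IP       $remote_addr;
        proxy_pass       http://web_farm;
    }

(On the Rails side, request.remote_ip already consults X-Forwarded-For, so the app shouldn't need to change as long as whatever terminates SSL sets the header.)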
I'd love to hear about any similar setups. We tinkered a bit with Amazon's Elastic Load Balancer, but I think that basically puts us in the same situation as #3 above. Has anyone else made the switch to ELB and found it to be worth it?
Comments (3)
Are you using the SSL session cache that nginx provides? It can help nginx save the cycles spent constantly re-negotiating the encryption. See http://wiki.nginx.org/NginxHttpSslModule#ssl_session_cache
What monitoring are you using to determine your CPU usage? SSL is typically very CPU-intensive.
I would keep the SSL proxies as a dedicated layer, so that you can scale the cost of negotiating SSL separately from other concerns.
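(Enabling the session cache is only a couple of directives in the proxy's nginx config; the sizes below are illustrative rather than tuned. Per the nginx docs, 1 MB of shared cache holds roughly 4000 sessions.)

    # Shared SSL session cache (illustrative values)
    ssl_session_cache    shared:SSL:10m;
    ssl_session_timeout  10m;

(Session reuse lets returning clients skip the full handshake, including the expensive RSA operation, which is where most of the SSL CPU cost goes.)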
I'm using SSL on Apache, which handles access to our Subversion repository on a Small Windows EC2 instance. In testing, I found that HTTPS access was fractionally slower than HTTP, but that's for the obvious reason that encryption/decryption is not an instantaneous process, as you'd expect.
If your CPU metrics are correct and you're not seeing excessive load, then the implication is that bandwidth is the limiting factor; however, I really can't see why you'd be able to get 700+ Mbps on an HTTP instance compared to only 60 Mbps on an HTTPS instance. Unless the test conditions were not actually identical, of course, and there's something else going on inside the HTTPS instance that you haven't factored in...
The larger instances do of course get a better share of the host bandwidth than Smalls; there are fewer of them competing for the resource. Since the internal EC2 network is Gigabit Ethernet, seeing 700 Mbps on a Large instance is feasible, assuming no other Large instances on the same node were making similar bandwidth demands. To get that out of a Small instance, you'd have to be really fortunate to be running inside a very lightly loaded host. And in that case, there'd be no guarantee that you'd keep that performance level: as soon as other Smalls came online, your share of the available bandwidth would start dropping.
I think this is essentially a Small instance bandwidth issue. Adding more Smalls won't necessarily help much, because you have no control over which host they spin up on; Large instances, however, get a bigger slice of the bandwidth pie and are therefore likely to have more consistent capacity availability.
SSL being slower: true. Any request over HTTPS will be slower than the same request over plain HTTP.
Try creating a similar setup on a local LAN, where you have 3 Mongrel clusters and 2 web servers,
and check it using curl-loader, sending about 5k requests.
If everything is fine there, that's great:
it would mean the bottleneck is on the EC2 side, and maybe you'll have to take it up with the EC2 guys.