RabbitMQ: Dealing with Performance Challenges on EC2
What performance can be expected from RabbitMQ on EC2? I would appreciate it if you could share your experience here.
I am trying to run some performance tests of RabbitMQ on AWS EC2. I have three separate EC2 instances: one each for RabbitMQ, the publisher, and the consumer/worker.
In my scenario, the publisher pushes JSON strings (approx. 165-200 bytes) to a direct exchange declared with durable set to true, bound to a queue that also has durable set to true (i.e. both in persistent mode). The consumer/worker runs on a separate box and keeps pulling messages. (Going forward, the worker is expected to persist these messages in MongoDB, and the publisher will be replaced by a RESTful service built with RESTEasy.)
To keep things simple, I simulated this scenario using the Multicast sample code. I split the multicast code into two separate Java files, "Producer" and "Worker", so that each can run on a separate box. I used c1.medium instances with Ubuntu Server 11.04 32-bit for the producer and the consumer, and an m1.large with Ubuntu Server 11.04 64-bit for RabbitMQ.
I am able to achieve a throughput of 3-5k messages per second, i.e. while keeping a steady message push rate of 5k. (This concurs with http://www.rabbitmq.com/faq.html#performance-latency)
However, when I increase the push rate to 10-12k messages per second, the consumer's ability to consume messages drops to 1-2k messages per second and a backlog builds up (many times it goes below 800 messages per second too).
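For a sense of scale, here is a minimal sketch of how fast such a backlog grows. It assumes constant publish and consume rates (real runs fluctuate) and counts payload bytes only, ignoring protocol framing and broker bookkeeping. The class and method names are illustrative and not part of any RabbitMQ API.

```java
// Rough backlog estimate under constant publish/consume rates (an idealization).
class BacklogEstimate {

    // Messages left in the queue after `seconds`; never negative.
    static long backlogAfter(long publishPerSec, long consumePerSec, long seconds) {
        long growthPerSec = Math.max(0, publishPerSec - consumePerSec);
        return growthPerSec * seconds;
    }

    // Approximate payload bytes held by that backlog (no framing or index overhead).
    static long backlogBytes(long messages, int payloadBytes) {
        return messages * payloadBytes;
    }

    public static void main(String[] args) {
        // Upper end of the rates in the question: 12k/s in, 2k/s out, 200-byte payloads.
        long backlog = backlogAfter(12_000, 2_000, 60);
        System.out.println("backlog after 60 s: " + backlog + " messages");
        System.out.println("payload held: " + backlogBytes(backlog, 200) + " bytes");
    }
}
```

With those numbers the queue gains 10,000 messages every second, so one minute of sustained load already leaves 600,000 messages (about 120 MB of payload) waiting on the broker.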
Given the above scenario, I have the following questions and would appreciate thoughts/suggestions on improving consumer throughput as well. (NOTE: all the messages in my scenario are expected to be of a similar type, giving no opportunity to group them for routing, so some kind of load-balancer approach may be needed.)
1) This performance is observed with one RabbitMQ server, one exchange and one queue. Is there anything further that can be configured or fine-tuned to improve throughput beyond 5k in persistent mode?
2) I do understand that clustering could be another option. However, I need to size the cluster based on incoming load, and I may not have any message grouping/identity on which to define routing (since the messages are expected to be just log descriptions). Can I use clustering with a load-balancing option for the workers/consumers?
3) I am expected to process several hundred thousand requests per second. I would appreciate it if you could share some experience and approaches for achieving this.
Comments (3)
What type of storage are you using for the EC2 instances?
EBS storage is more reliable, but sometimes it has very low throughput (especially with a small EBS volume, i.e. <100GB). Instance store, on the other hand, has much better IO performance (in our experience, at least), but only "lives" as long as the instance is running.
The instance type you're using also makes quite a difference. m1.small and c1.medium both have moderate IO performance (http://aws.amazon.com/ec2/instance-types/).
We're running RabbitMQ on EC2 with persistence for all messages. We use only m1.large instances (64-bit, with high IO performance). We started with EBS storage, then switched to instance store to see whether there was any improvement, and the instance-store instances are indeed faster in terms of IO throughput. The drawback is that all persisted messages are lost when the instance terminates or fails (although we have never experienced a failure so far).
In our scenario, we don't need such high throughput, but we do care a lot about not losing messages :-)
In conclusion, you could try switching to an instance-store setup to see whether it brings any improvement. If that works much better, then I think http://www.rabbitmq.com/pacemaker.html is a way to deal with failures. At least that's the direction we're moving in.
Cheers
Have you considered adding multiple consumers? This is one of the core benefits of a loosely coupled bus/message architecture compared to a tightly coupled one. It would also help to understand where the need for this message volume comes from. Is this a benchmark just to see what you can do, or is it tied to an actual application need?
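As a rough illustration of the multiple-consumers idea, the sketch below lets several worker threads drain one shared queue. It uses an in-process `BlockingQueue` as a stand-in for the RabbitMQ queue, purely so the example runs without a broker; the class name, the message format and the worker count are assumptions. With a real broker, each worker would instead consume from the same queue, and the broker would spread deliveries across the attached consumers.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// In-process stand-in for several workers consuming from one queue.
class CompetingConsumers {

    // Drains `messageCount` messages using `workers` threads; returns how many were processed.
    static int run(int workers, int messageCount) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < messageCount; i++) {
            queue.add("{\"log\":\"entry " + i + "\"}");
        }

        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int w = 0; w < workers; w++) {
            pool.execute(() -> {
                // poll() returns null once the queue is empty, which ends this worker.
                while (queue.poll() != null) {
                    processed.incrementAndGet(); // placeholder for real work, e.g. a database insert
                }
            });
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println("processed: " + run(4, 10_000));
    }
}
```

Each message is taken by exactly one worker, so adding workers raises total consume capacity only as long as the per-message work, and not the queue itself, is the bottleneck.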
Hundreds of thousands of messages per second (hundreds of kHz) is very, very high: if RabbitMQ can do that at all, you're looking at partitioning across clustered nodes. These writers found that their EC2 instances could process at most 100K packets/second, so obviously you won't get message throughput higher than that through a single instance.
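Since the messages in the question carry no natural grouping key, one way to partition is plain round-robin across several queues. The sketch below is a minimal, broker-independent version of that idea: it only picks the routing key a publisher would use for each message. The `logs.N` key names, the class name, and the 300,000/s target are assumptions for illustration; the 5,000/s per-queue figure is the rate observed in the question.

```java
import java.util.concurrent.atomic.AtomicLong;

// Picks a routing key per message in round-robin order (thread-safe).
class ShardPicker {
    private final String[] routingKeys;
    private final AtomicLong counter = new AtomicLong();

    ShardPicker(int shards) {
        if (shards <= 0) {
            throw new IllegalArgumentException("shards must be positive");
        }
        routingKeys = new String[shards];
        for (int i = 0; i < shards; i++) {
            routingKeys[i] = "logs." + i; // hypothetical key, one per queue
        }
    }

    // Next routing key; floorMod keeps the index valid even if the counter overflows.
    String next() {
        long n = counter.getAndIncrement();
        return routingKeys[(int) Math.floorMod(n, (long) routingKeys.length)];
    }

    // Ceiling division: how many queues are needed if each sustains `perShardPerSec`.
    static long shardsNeeded(long targetPerSec, long perShardPerSec) {
        return (targetPerSec + perShardPerSec - 1) / perShardPerSec;
    }

    public static void main(String[] args) {
        ShardPicker picker = new ShardPicker(3);
        for (int i = 0; i < 4; i++) {
            System.out.println(picker.next());
        }
        System.out.println("queues needed: " + shardsNeeded(300_000, 5_000));
    }
}
```

At 5,000 persistent messages per second per queue, a 300,000/s target works out to 60 queues, which gives a feel for how much partitioning this workload would demand before any per-instance network limit is considered.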
You might investigate Kafka, written by LinkedIn for a similar sort of vast-firehose model. It pushes some complexity out to consumers in order to allow for genuine distributed-ness and lower message overhead.