I need something to coordinate a system with several consumers and producers, each running on a different machine with a different operating system. I have been researching how to do this with MySQL, but it seems ridiculously difficult.
My requirements are simple: I want to be able to add or remove consumers and producers at any time, so they should not depend on each other at all. Naturally, a database would separate the two nicely.
I have been looking at the Q4M message queuing plugin for MySQL, but it seems complicated to use.
I would really appreciate some input on how best to construct my system.
Comments (4)
That's a message queue. Do not pursue other alternatives. Everything else (i.e., using a database with inserts and deletes) is dreadfully slow and cumbersome.
Building a message queue on top of a database often turns out badly in practice because (1) databases are slow, (2) databases are huge and complex, (3) you have locking and contention issues that can make each transaction slow, and (4) it's a lot more overhead than the problem deserves.
There are numerous message queue solutions.
If you can't make Q4M work, you should move on to another one.
http://en.wikipedia.org/wiki/Message_queue
http://linux.die.net/man/7/mq_overview
http://qpid.apache.org/
http://code.google.com/p/httpsqs/
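To illustrate the decoupling a message queue gives you, here is a minimal in-process sketch using Python's standard queue.Queue. It is only an illustration of the pattern, not one of the products linked above: a real deployment across machines would use a networked broker, and the function names (producer, consumer, run) are made up for this example.

```python
import queue
import threading

def producer(q, name, count):
    # Producers only know about the queue, never about the consumers.
    for i in range(count):
        q.put(f"{name}-{i}")

def consumer(q, results, lock):
    # q.get() hands each message to exactly one consumer.
    while True:
        msg = q.get()
        if msg is None:  # sentinel: shut this consumer down
            return
        with lock:
            results.append(msg)

def run(n_producers=3, n_consumers=2, per_producer=5):
    q = queue.Queue()
    results, lock = [], threading.Lock()
    consumers = [threading.Thread(target=consumer, args=(q, results, lock))
                 for _ in range(n_consumers)]
    producers = [threading.Thread(target=producer, args=(q, f"p{i}", per_producer))
                 for i in range(n_producers)]
    for t in consumers + producers:
        t.start()
    for t in producers:
        t.join()
    for _ in consumers:
        q.put(None)  # one sentinel per consumer, queued after all messages
    for t in consumers:
        t.join()
    return results
```

Changing n_producers or n_consumers requires no change to either side, which is the property the question asks for.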
It's actually (fairly) complicated to build such a system. (I say fairly because it is, of course, doable.)
If you have multiple producers and one consumer, it's easy. All producers write concurrently, and the single consumer reads rows as soon as they are visible (committed).
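A rough sketch of the single-consumer case, assuming a hypothetical messages table. It uses Python's built-in sqlite3 module purely as a stand-in so that it runs without a server; with MySQL the SQL would be much the same, but the connection code would differ.

```python
import sqlite3

def setup_messages(conn):
    # Hypothetical schema: one row per message.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
    )
    conn.commit()

def produce(conn, body):
    # Any number of producers can insert; the row is visible once committed.
    conn.execute("INSERT INTO messages (body) VALUES (?)", (body,))
    conn.commit()

def consume_all(conn):
    # The single consumer reads every committed row, then deletes
    # exactly the rows it read. No locking scheme is needed because
    # no other consumer competes for the same rows.
    rows = conn.execute("SELECT id, body FROM messages ORDER BY id").fetchall()
    conn.executemany("DELETE FROM messages WHERE id = ?", [(r[0],) for r in rows])
    conn.commit()
    return [r[1] for r in rows]
```

Deleting by id, rather than deleting everything, keeps rows that were committed between the SELECT and the DELETE.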
But if you want scalability with several consumers, you will need a locking scheme, and that is not trivial. (You must ensure that no row gets dispatched to two consumers. This is not easy to achieve with database transactions and locks. Naive solutions serialize all message delivery, as if you had only one consumer, which is exactly what we don't want.)
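One common way to avoid dispatching a row twice is a conditional UPDATE that only succeeds if the row is still unclaimed. The sketch below is an assumption-laden illustration, not a tuned design: the jobs table, the claimed_by column, and the function names are hypothetical, and sqlite3 stands in for MySQL so the example runs on its own.

```python
import sqlite3

def setup_jobs(conn):
    # Hypothetical schema: claimed_by stays NULL until a consumer takes the row.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "body TEXT NOT NULL, "
        "claimed_by TEXT)"
    )
    conn.commit()

def enqueue(conn, body):
    conn.execute("INSERT INTO jobs (body) VALUES (?)", (body,))
    conn.commit()

def claim_one(conn, consumer_id):
    """Try to claim one unclaimed row; return (id, body) or None."""
    while True:
        row = conn.execute(
            "SELECT id, body FROM jobs "
            "WHERE claimed_by IS NULL ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None  # nothing left to claim
        # The claim only succeeds if nobody else took the row meanwhile.
        cur = conn.execute(
            "UPDATE jobs SET claimed_by = ? "
            "WHERE id = ? AND claimed_by IS NULL",
            (consumer_id, row[0]),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row
        # Another consumer won the race; look for the next candidate.
```

Consumers that lose the race simply retry, so delivery is not serialized behind one long transaction. On MySQL 8.0 and later, SELECT ... FOR UPDATE SKIP LOCKED is another option for the same problem.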
I would suggest using an existing, ready-made solution. You can also read this question about a similar problem.
I think it's feasible without third-party software.
My first design would look like this:
Because transactions are required, InnoDB is the logical choice of storage engine. You also have to choose the isolation level carefully. My first guess is "serializable" to avoid phantom reads, but perhaps a weaker level is also possible.
If performance and scalability are an issue, you should consider using a "real" messaging solution. Rolling your own will most likely lead to performance and/or scalability problems.
It depends on the situation.
In my case, a single producer generates thousands of messages per day, and several consumers process them over the following 24 hours, with each message taking several minutes to finish. So I think MySQL would meet my requirements, and I can use transactions to ensure consistency between consumers.
Hope this helps.