Workload distribution / parallel execution in Java
I need to distribute work across multiple Java processes running in different JVMs, probably on different machines.
Let's say I have a table with records 1 to 1000. I want the work to be collected and distributed in sets of 10: say records 1-10 go to workerOne, then records 11-20 go to workerThree, and so on. Needless to say, workerOne never does the work of workerTwo unless workerTwo couldn't do it.
This example is purely database-based, but I believe it could be extended to any system, be it file processing, email processing and so forth.
I have a feeling that the immediate response will be to go for a master/worker approach. However, we are talking about different JVMs here: even if one JVM goes down, the others should just keep doing their work.
Now the million-dollar question: are there any good, production-ready frameworks that would let me do this? Concrete implementations for specific needs such as database records, file processing, or email processing would also be welcome.
I have seen the Java Parallel Execution Framework, but I am not sure whether it can be used across different JVMs, or whether the others would keep going if one went down. I believe workers could be on multiple JVMs, but what about the master?
More Info 1: Hadoop would be a problem because of the JDK 1.6 requirement. That's a bit too much.
Thanks,
Franklin
Comments (7)
I would consider using JGroups for that. You can cluster your JVMs, and one of the nodes can be selected as master and distribute the work to the other nodes by sending messages over the network. Alternatively, you can partition your work items up front and have the master node manage the distribution of the partitions: partition-1 goes to JVM-4, partition-2 goes to JVM-3, partition-3 goes to JVM-2, and so on. If JVM-4 goes down, the master node will notice and tell one of the other nodes to pick up partition-1 as well.
Another alternative, which is easier to use, is Redis pub/sub support: http://redis.io/topics/pubsub. But then you have to maintain Redis servers, which I don't like.
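The partition bookkeeping on the master node could be sketched roughly as follows. This is plain Java and deliberately uses no JGroups API; the class name `PartitionTable` and the `nodeFailed` method are made up for illustration, and it assumes the cluster layer tells the master when a node leaves. It is written against a current JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical master-side bookkeeping: which node currently owns which partition. */
class PartitionTable {
    // partition id -> owning node name; TreeMap keeps iteration order stable
    private final Map<Integer, String> ownerByPartition = new TreeMap<>();
    private final List<String> liveNodes = new ArrayList<>();

    PartitionTable(List<String> nodes, int partitionCount) {
        liveNodes.addAll(nodes);
        // Initial round-robin assignment of partitions to nodes.
        for (int p = 1; p <= partitionCount; p++) {
            ownerByPartition.put(p, liveNodes.get((p - 1) % liveNodes.size()));
        }
    }

    /** Assumed to be called when the cluster layer reports that a node has left. */
    void nodeFailed(String node) {
        liveNodes.remove(node);
        if (liveNodes.isEmpty()) {
            throw new IllegalStateException("no live nodes left");
        }
        int next = 0;
        for (Map.Entry<Integer, String> e : ownerByPartition.entrySet()) {
            if (e.getValue().equals(node)) {
                // Hand each orphaned partition to a survivor, round-robin.
                e.setValue(liveNodes.get(next++ % liveNodes.size()));
            }
        }
    }

    String ownerOf(int partition) {
        return ownerByPartition.get(partition);
    }

    public static void main(String[] args) {
        PartitionTable table =
                new PartitionTable(Arrays.asList("JVM-2", "JVM-3", "JVM-4"), 6);
        System.out.println("partition 3 -> " + table.ownerOf(3)); // JVM-4
        table.nodeFailed("JVM-4");
        System.out.println("partition 3 -> " + table.ownerOf(3)); // JVM-2
    }
}
```

Note that this only covers a worker going down. If the master itself fails, another node has to take over this table, which the sketch does not address.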
You might want to look into MapReduce and Hadoop.
You could also use message queues. Have one process that generates the list of work and packages it in nice little chunks. It then plops those chunks on a queue. Each of the workers just keeps waiting on the queue for something to show up. When it does, the worker pulls a chunk off the queue and processes it. If one process goes down, some other process will pick up the slack. It's simple, and people have been doing it that way for a long time, so there's a lot of information about it on the net.
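To make the shape of this concrete, here is a minimal single-JVM sketch in which a `java.util.concurrent.BlockingQueue` stands in for the message queue and threads stand in for the worker processes. In a real setup the queue would live in a message broker and the workers would be separate JVMs. The class name `ChunkQueueDemo`, the chunk size of 10, and the empty-list stop marker are all assumptions of the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

class ChunkQueueDemo {
    // An empty chunk is used as a "no more work" marker (an assumption of this sketch).
    private static final List<Integer> STOP = new ArrayList<>();

    /** Splits record ids first..last into chunks of at most chunkSize. */
    static List<List<Integer>> chunk(int first, int last, int chunkSize) {
        List<List<Integer>> chunks = new ArrayList<>();
        for (int start = first; start <= last; start += chunkSize) {
            List<Integer> c = new ArrayList<>();
            for (int id = start; id <= Math.min(last, start + chunkSize - 1); id++) {
                c.add(id);
            }
            chunks.add(c);
        }
        return chunks;
    }

    /** Runs workerCount workers against one shared queue; returns records processed. */
    static int run(int workerCount) {
        BlockingQueue<List<Integer>> queue = new LinkedBlockingQueue<>();
        AtomicInteger processed = new AtomicInteger();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < workerCount; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        List<Integer> c = queue.take();  // blocks until a chunk shows up
                        if (c == STOP) return;           // identity check on the marker
                        processed.addAndGet(c.size());   // placeholder for real processing
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            workers.add(t);
        }
        try {
            for (List<Integer> c : chunk(1, 1000, 10)) queue.put(c); // the producer
            for (int i = 0; i < workerCount; i++) queue.put(STOP);   // one marker per worker
            for (Thread t : workers) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while running workers", e);
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println("records processed: " + run(3)); // records processed: 1000
    }
}
```

Because workers pull chunks rather than having them assigned, no worker is tied to a particular range of records. What this sketch leaves out is a chunk that was taken by a worker that then died; with a real broker that is usually handled by acknowledging a message only after processing succeeds.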
Check out Hadoop
I believe Terracotta can do this. If you are dealing with web pages, JBoss can be clustered.
If you want to do this yourself, you will need a work manager that keeps track of jobs to do, jobs in progress, and jobs that were never finished and need to be rescheduled. The workers then ask for something to do, do it, and send the result back, asking for more.
You may want to elaborate on what kind of work you want to do.
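A do-it-yourself work manager along the lines described above might look roughly like this. It is only a sketch: the class name `WorkManager`, the lease-based timeout used to decide that a job was never finished, and the use of plain strings for jobs and results are all assumptions. Time is passed in explicitly to keep the example deterministic.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class WorkManager {
    private final Deque<String> pending = new ArrayDeque<>();     // jobs to do
    private final Map<String, Long> inProgress = new HashMap<>(); // job -> lease deadline (ms)
    private final Map<String, String> results = new HashMap<>();  // job -> result
    private final long leaseMillis;

    WorkManager(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    synchronized void submit(String job) {
        pending.addLast(job);
    }

    /** A worker asks for something to do; returns null when nothing is pending. */
    synchronized String requestJob(long now) {
        String job = pending.pollFirst();
        if (job != null) inProgress.put(job, now + leaseMillis);
        return job;
    }

    /** A worker sends a result back; results for jobs not in progress are ignored. */
    synchronized void complete(String job, String result) {
        if (inProgress.remove(job) != null) results.put(job, result);
    }

    /** Moves jobs whose lease has expired back to the pending queue; returns how many. */
    synchronized int rescheduleExpired(long now) {
        int count = 0;
        Iterator<Map.Entry<String, Long>> it = inProgress.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> e = it.next();
            if (e.getValue() <= now) {
                String job = e.getKey(); // read the key before removing the entry
                it.remove();
                pending.addLast(job);
                count++;
            }
        }
        return count;
    }

    synchronized int doneCount() {
        return results.size();
    }

    public static void main(String[] args) {
        WorkManager m = new WorkManager(1000);
        m.submit("records 1-10");
        m.submit("records 11-20");
        String a = m.requestJob(0);  // workerOne takes the first job
        String b = m.requestJob(0);  // workerTwo takes the second job, then dies
        m.complete(a, "ok");
        int rescheduled = m.rescheduleExpired(2000); // workerTwo's lease has run out
        String retry = m.requestJob(2000);           // workerOne picks up workerTwo's job
        m.complete(retry, "ok");
        System.out.println(rescheduled + " rescheduled, " + m.doneCount() + " done");
        // prints "1 rescheduled, 2 done"
    }
}
```

Across JVMs, the workers would reach these methods over some remote call or via a shared database table, and the manager itself remains a single point of failure unless its state is persisted.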
The problem you've described is definitely best solved using the master/worker pattern.
You should have a look at JavaSpaces (part of the Jini framework); it's really well suited to this kind of thing. Basically, you just want to encapsulate each task to be carried out inside a Command object, subclassing as necessary. Dump these into the JavaSpace, let your workers grab and process one at a time, then reassemble the results when done.
Of course your performance gains will totally depend on how long it takes you to process each set of records, but JavaSpaces won't cause any problems if distributed across several machines.
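The Command idea can be illustrated without any Jini dependency. In the sketch below a `ConcurrentLinkedQueue` stands in for the JavaSpace, a single loop stands in for the workers, and `SumCommand` is a made-up task. With real JavaSpaces the commands would be written to and taken from the space as entries, which this example does not attempt to show.

```java
import java.util.Map;
import java.util.Queue;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentLinkedQueue;

class SpaceDemo {
    /** Fills the stand-in space with commands, processes them, and reassembles the results. */
    static long runAll(int records, int batchSize) {
        Queue<Command> space = new ConcurrentLinkedQueue<>(); // stand-in for the JavaSpace
        for (int start = 1; start <= records; start += batchSize) {
            space.add(new SumCommand(start, Math.min(records, start + batchSize - 1)));
        }
        // Partial results keyed by first record id, so they can be reassembled in order.
        Map<Integer, Long> partials = new TreeMap<>();
        Command c;
        while ((c = space.poll()) != null) { // a worker grabs one command at a time
            partials.put(c.firstRecord, c.execute());
        }
        long total = 0;
        for (long p : partials.values()) total += p;
        return total;
    }

    public static void main(String[] args) {
        System.out.println("total: " + runAll(1000, 10)); // total: 500500
    }
}

/** One unit of work over a range of records; subclasses define the processing. */
abstract class Command {
    final int firstRecord;
    final int lastRecord;

    Command(int firstRecord, int lastRecord) {
        this.firstRecord = firstRecord;
        this.lastRecord = lastRecord;
    }

    abstract long execute();
}

/** Hypothetical task: sum the record ids in the range. */
class SumCommand extends Command {
    SumCommand(int firstRecord, int lastRecord) {
        super(firstRecord, lastRecord);
    }

    @Override
    long execute() {
        long sum = 0;
        for (int id = firstRecord; id <= lastRecord; id++) sum += id;
        return sum;
    }
}
```

The workers only ever see the `Command` type, so new kinds of work can be added by writing another subclass.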
If you work on records in a single database, consider performing the work within the database itself using stored procedures. The gain from processing the records on different machines might be negated by the cost of retrieving and transmitting the work between the database and the computing nodes.
For file processing it could be a similar case. Working on files in a (shared) filesystem might put heavy I/O pressure on the OS.
And maintaining multiple JVMs on multiple machines might be overkill too.
As for the question itself: I once used JADE (Java Agent Development Environment) for some distributed simulation. Its multi-machine support and message-passing nature might help you.