How to achieve maximum concurrency in a distributed application that uses a database as the communication medium

Posted on 2024-10-13 06:23:33

I have an application that is similar to the classic producer-consumer problem. I would like to survey the possible ways to implement it. The problem is:

Process A: inserts a row into a table in the database (producer).

Process B: reads M rows from the table, then deletes those M rows after processing them.

Tasks in process B:
1. Read M rows
2. Process these rows
3. Delete these rows

N1 instances of process A and N2 instances of process B run concurrently.

Each instance runs on a different box.

Some requirements:
If process p1 is reading rows (0, M-1), process p2 should not wait for p1 to release the lock on those rows; instead, it should read rows (M, 2M-1).

Comments (4)

高速公鹿 2024-10-20 06:23:33

I bet there are better ways to do parallel processing than using a DB as the exchanger between producer and consumer. Why not queues? Have you checked the tools and frameworks designed for Map/Reduce? Hadoop, GridGain, and JPPF can all do this.

妳是的陽光 2024-10-20 06:23:33

A similar concept is used in the ConcurrentHashMap of Java 1.5.
A list of the rows that are currently being processed should be maintained separately. Whenever a process needs to interact with the DB, it should check whether those rows are being processed by another process. If so, it should wait on that condition; otherwise it can proceed. Maintaining indexes might help in such a case.

暮光沉寂 2024-10-20 06:23:33

I think that if this application is implemented this way, it effectively uses a hand-made queue. I believe JMS is a much better fit in this case. Many JMS implementations are available, and most of them are open source.

In your case, process A should insert tasks into the queue. Process B should block on receive(), get N messages, and then process them. You probably have reasons to fetch tasks from your queue in bulk, but if you switch to a JMS-based implementation you probably do not need this at all, so you can just listen to the queue and process each message immediately. The implementation becomes almost trivial, very flexible, and scalable. You can run as many instances of processes A and B as you want and distribute them across separate boxes.
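The shape of that design can be sketched in a few lines. This is not JMS: Python's standard `queue.Queue` and threads stand in for a message broker and for processes on separate boxes, and the names `producer`, `consumer`, and `run` are hypothetical. It only illustrates producers putting tasks on a queue while consumers block on a receive call and handle each message as it arrives.

```python
import queue
import threading


def producer(q, producer_id, count):
    # Process A: put tasks on the queue.
    for i in range(count):
        q.put(f"p{producer_id}-task{i}")


def consumer(q, results, lock):
    # Process B: block until a message arrives, then process it immediately.
    while True:
        task = q.get()
        if task is None:  # sentinel: no more work for this consumer
            break
        with lock:
            results.append(task.upper())  # placeholder for real processing


def run(n_producers, n_consumers, tasks_each):
    q = queue.Queue()
    results = []
    lock = threading.Lock()
    consumers = [
        threading.Thread(target=consumer, args=(q, results, lock))
        for _ in range(n_consumers)
    ]
    producers = [
        threading.Thread(target=producer, args=(q, pid, tasks_each))
        for pid in range(n_producers)
    ]
    for t in consumers + producers:
        t.start()
    for t in producers:
        t.join()
    for _ in consumers:
        q.put(None)  # one sentinel per consumer, queued after all tasks
    for t in consumers:
        t.join()
    return results


if __name__ == "__main__":
    print(len(run(n_producers=3, n_consumers=2, tasks_each=4)))
```

Because each message is delivered to exactly one consumer, no consumer waits on rows held by another. With a real broker, the same roles would be played by message producers and consumers connected over the network.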

别在捏我脸啦 2024-10-20 06:23:33

You may also want to take a look at Amazon Elastic MapReduce:

http://aws.amazon.com/elasticmapreduce/
