Two threads reading from the same table: how do I keep two threads from reading the same set of data from the TASKS table?

Posted 2024-12-16 13:31:31

I have a task thread running in each of two separate Tomcat instances. The task threads concurrently read the TASKS table (using a SELECT with a certain WHERE condition) and then do some processing.

The issue is that sometimes both threads pick up the same task, so the task is executed twice. My question is: how do I keep the two threads from reading the same set of data from the TASKS table?


Comments (6)

转身以后 2024-12-23 13:31:32

If the TASKS table you mention is a database table, then I would use transaction isolation.

As a suggestion: within a transaction, set an attribute of the task in the TASKS table to some uniquely identifiable value if it is not already set, then commit the transaction. If the commit succeeds, the task has been selected by that thread.

I haven't come across this use case myself, so treat my suggestion with caution.
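
A minimal JDBC sketch of that idea, assuming a hypothetical nullable `claimed_by` column and an `id` key on TASKS (neither is named in the question). A single conditional UPDATE stands in for "set if not set": when two workers race for the same row, the database serializes the two updates, so under typical isolation levels only one of them should see an update count of 1.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class TaskClaimer {
    // Hypothetical schema: TASKS(id, claimed_by, ...); claimed_by stays NULL until a worker takes the task.
    static final String CLAIM_SQL =
        "UPDATE TASKS SET claimed_by = ? WHERE id = ? AND claimed_by IS NULL";

    /** Returns true only if this worker's UPDATE changed the row, i.e. it won the claim. */
    static boolean tryClaim(Connection conn, long taskId, String workerId) throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(CLAIM_SQL)) {
            ps.setString(1, workerId);
            ps.setLong(2, taskId);
            int updated = ps.executeUpdate();
            conn.commit();
            return updated == 1;
        } catch (SQLException e) {
            conn.rollback(); // leave the row untouched if anything failed
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```

A worker would call `tryClaim` for each candidate row returned by its SELECT and process only the tasks for which it returns true.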

ゃ人海孤独症 2024-12-23 13:31:32

I think you should look at how enterprise job schedulers handle this, for example Quartz.

春庭雪 2024-12-23 13:31:32

For your use case there is a better tool for the job, and that's messaging. You are persisting items that need to be worked on, and then attempting to synchronize access between workers. There are a number of issues you would need to resolve to make this work. In general, updating a table and selecting from it should not be mixed (it locks), so storing state there doesn't work; neither would synchronization in your Java code, as that wouldn't survive a server restart.

Using the JMS API with a message broker like ActiveMQ, you would publish a message to a queue. This message would contain the details of the task to be executed. The message broker would persist it somewhere (either in its own message store or in a database). Worker threads would then subscribe to the queue on the message broker, and each message would be handed off to only one of them. This is quite a powerful model: you can have hundreds of message consumers all acting on tasks, so it scales nicely. You can also make it as resilient as it needs to be, so that tasks survive both Tomcat and broker restarts.
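
The "each message goes to only one consumer" behavior can be illustrated with a small in-process sketch. This is not JMS code: it uses a `java.util.concurrent.BlockingQueue` as a stand-in for the broker queue, and the class and method names are made up for illustration. A real setup would put the queue on a broker such as ActiveMQ so that consumers in separate Tomcat instances can share it.

```java
import java.util.HashSet;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

class CompetingConsumersDemo {
    /** Starts `workers` threads on one shared queue and returns every task id they handled. */
    static List<Integer> drain(int taskCount, int workers) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        for (int id = 1; id <= taskCount; id++) {
            queue.add(id); // "publish" one message per task
        }
        List<Integer> handled = new CopyOnWriteArrayList<>();
        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            threads[w] = new Thread(() -> {
                Integer taskId;
                // poll() removes the head atomically, so an id reaches exactly one worker.
                while ((taskId = queue.poll()) != null) {
                    handled.add(taskId);
                }
            });
            threads[w].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return handled;
    }

    public static void main(String[] args) {
        List<Integer> handled = drain(1000, 4);
        // Equal counts mean every task was handled once and none twice.
        System.out.println(handled.size() + " handled, " + new HashSet<>(handled).size() + " distinct");
    }
}
```

The point of the design is that workers never select from shared state and then try to claim it; removal from the queue is the claim.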

惜醉颜 2024-12-23 13:31:32

Whether the database can manage this gracefully depends largely on whether it uses strict two-phase locking (S2PL) or multi-version concurrency control (MVCC) to manage concurrency. Under MVCC, reads don't block writes and vice versa, so it is quite possible to handle this with relatively simple logic. Under S2PL you would spend too much time blocking for the database to be a good mechanism for this, so you would probably want to look at external mechanisms. Of course, an external mechanism works regardless of the database; it just isn't really necessary with MVCC.

Databases using MVCC include PostgreSQL, Oracle, MS SQL Server (in certain configurations), InnoDB (except at the SERIALIZABLE isolation level), and probably many others. (These are the ones I know of offhand.)

I didn't pick up any clues in the question as to which database product you are using, but if it is PostgreSQL you might want to consider using advisory locks. http://www.postgresql.org/docs/current/interactive/explicit-locking.html#ADVISORY-LOCKS I suspect many of the other products have a similar mechanism.
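
For the PostgreSQL case, a hedged JDBC sketch of the advisory-lock approach might look like the following. It uses PostgreSQL's `pg_try_advisory_lock` and `pg_advisory_unlock` functions; using the task id as the lock key, and the class and method names, are assumptions made for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class AdvisoryLockRunner {
    static final String TRY_LOCK_SQL = "SELECT pg_try_advisory_lock(?)";
    static final String UNLOCK_SQL = "SELECT pg_advisory_unlock(?)";

    /** Runs `work` only if this session obtains the advisory lock keyed by taskId. */
    static boolean runIfLockFree(Connection conn, long taskId, Runnable work) throws SQLException {
        if (!callBoolean(conn, TRY_LOCK_SQL, taskId)) {
            return false; // another session currently holds the lock for this task
        }
        try {
            // `work` should also record the task as done before the lock is released,
            // otherwise a later poll could pick the same task up again.
            work.run();
            return true;
        } finally {
            // Session-level locks must be released on the connection that took them.
            callBoolean(conn, UNLOCK_SQL, taskId);
        }
    }

    private static boolean callBoolean(Connection conn, String sql, long key) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, key);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() && rs.getBoolean(1);
            }
        }
    }
}
```

Because the lock is non-blocking, a worker that loses the race simply skips the task and moves on to the next row.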

远昼 2024-12-23 13:31:32

I think you need a variable (column) that holds the last-modified date of each row. Your threads can then read the same set of data using the same last-modified-date restriction.

Edit:
I missed the "not to read" part.

In that case you need another table, TaskExecutor (taskId, executorId). When a thread runs a task, it puts a row into TaskExecutor; when another thread starts, it just checks whether the task is already executing (Select ... from TaskExecutor where taskId = ...).
You also need to take care of the isolation level of your transactions.
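
A rough JDBC sketch of the TaskExecutor idea follows. Note that it swaps the SELECT check for an INSERT that relies on a unique key, because a separate check-then-insert can itself race between two workers. It assumes `taskId` is the primary key of TaskExecutor and that the connection is in auto-commit mode; both are assumptions, as are the class and method names.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class TaskExecutorClaim {
    // Assumes taskId is the primary key (or carries a unique constraint) on TaskExecutor.
    static final String INSERT_SQL =
        "INSERT INTO TaskExecutor (taskId, executorId) VALUES (?, ?)";

    /** Returns true if this executor registered the task, false if the insert hit a constraint. */
    static boolean tryRegister(Connection conn, long taskId, String executorId) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setLong(1, taskId);
            ps.setString(2, executorId);
            ps.executeUpdate();
            return true;
        } catch (SQLException e) {
            if (isConstraintViolation(e)) {
                return false; // most likely another executor already inserted this taskId
            }
            throw e;
        }
    }

    /** SQLSTATE class 23 is the standard "integrity constraint violation" class. */
    static boolean isConstraintViolation(SQLException e) {
        String state = e.getSQLState();
        return state != null && state.startsWith("23");
    }
}
```

Checking the SQLSTATE class rather than a specific exception subclass keeps the sketch independent of which JDBC driver is in use.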

窗影残 2024-12-23 13:31:31

This happens simply because the DAO function in your code (the one that accesses the database) is not synchronized. Make it synchronized and I think your problem will be solved.
