Race conditions in Django

Posted 2024-10-22 13:36:52

In Django, I have run into a serious race condition. The trouble starts when two runners try to execute some_method() at the same time. The resulting log looks like this:

Job 3: Candidate
Job 3: Already taken
Job 3: Candidate
Job 3: Already taken
Job 3: Candidate
Job 3: Already taken
(et cetera for 18 MB)

The following method is giving me trouble. Note that the method is re-run until it returns False:

def some_method():
    conditions = #(amongst others, excludes jobs with status EXECUTING)

    try:
        cjob = Job.objects.filter(conditions).order_by(some_fields)[0]
    except IndexError:
        return False

    print 'Job %s: Candidate' % cjob.id

    job = cjob.for_update()

    if cjob.status != job.status:
        print 'Job %s: Already taken' % cjob.id
        return True

    print 'Job %s: Starting...' % job.id

    job.status = Job.EXECUTING
    job.save()
    # Critical section

# In models.py:
class Job(models.Model):
    # ...

    def for_update(self):
        return Job.objects.raw('SELECT * FROM `backend_job` WHERE `id` = %s FOR UPDATE', (self.id, ))[0]

Currently, Django doesn't have a dedicated for_update method. To avoid writing out by hand the query with all the conditions we use to determine whether a job must be run, we do the difficult query first, followed by the simple FOR UPDATE query.

I don't really see how this could cause the trouble we see. We do the query, followed by a statement that blocks while another runner holds the lock on the job. The lock is only released after the job's status has been changed. The second runner then gets the lock, but the job's status has changed, so it returns from the method, only to re-enter it later. At that point, however, the cjob query should not return the same job again, as its status is now excluded by the filter.

Am I misinterpreting the FOR UPDATE clause, or am I missing something else?

Note that I use MySQL with InnoDB, and that Celery is not a fit for this solution.

Comments (1)

戏剧牡丹亭 2024-10-29 13:36:52

The problem has been fixed by managing the transaction manually. It seems that the QuerySet was not refreshed after the start of the transaction. When two runners started their queries at about the same time and the same job occurred in both QuerySets, it would break the runners.

After reading this answer, I came up with a solution: the transaction is committed just before the return True.
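
A likely explanation for the stale QuerySet: under InnoDB's default REPEATABLE READ isolation level, a plain SELECT inside a transaction reads from the snapshot established by that transaction's first read, while SELECT ... FOR UPDATE reads the latest committed row. The candidate query therefore keeps returning job 3 with its old status, and the locked read keeps reporting it as taken. The fix can be sketched roughly as below. This is only an illustration, not the code actually used: find_candidate, lock_row and commit are hypothetical hooks standing in for the filter query, the for_update() call and a transaction commit, and EXECUTING stands in for Job.EXECUTING.

```python
EXECUTING = "EXECUTING"  # assumption: stands in for Job.EXECUTING


def claim_next_job(find_candidate, lock_row, commit, log=print):
    """Claim at most one job; return False once no candidate is left.

    All three callables are hypothetical hooks, not Django APIs:
      find_candidate() -> job or None  (the expensive, non-locking query)
      lock_row(job)    -> fresh copy   (SELECT ... FOR UPDATE by id)
      commit()         -> ends the current transaction
    """
    candidate = find_candidate()
    if candidate is None:
        return False
    log("Job %s: Candidate" % candidate.id)

    # Blocks while another runner holds the lock on this row.
    fresh = lock_row(candidate)

    if fresh.status != candidate.status:
        log("Job %s: Already taken" % candidate.id)
        # Commit before returning, so the next call opens a new transaction
        # and the candidate query no longer reads from the old snapshot.
        commit()
        return True

    log("Job %s: Starting..." % fresh.id)
    fresh.status = EXECUTING
    fresh.save()
    commit()  # releases the row lock and makes the new status visible
    return True
```

Without the commit in the "Already taken" branch, the next call would run the candidate query against the same snapshot and pick job 3 again, which matches the endless log in the question. Newer Django versions also provide QuerySet.select_for_update(), which may remove the need for the raw FOR UPDATE query, though the snapshot behaviour of the preceding non-locking query stays the same.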
