Race conditions in Django

Posted on 2024-10-22 13:36:52


In Django, I have run into a serious race condition. The trouble starts when two runners try to execute some_method() at the same time. The log output looks like this:

Job 3: Candidate
Job 3: Already taken
Job 3: Candidate
Job 3: Already taken
Job 3: Candidate
Job 3: Already taken
(et cetera for 18 MB)

The following method is giving me trouble. Note that it is re-run until it returns False:

def some_method():
    conditions = ...  # elided (amongst others, excludes jobs with status EXECUTING)

    try:
        # Pick the first candidate job that matches the conditions.
        cjob = Job.objects.filter(conditions).order_by(some_fields)[0]
    except IndexError:
        return False

    print('Job %s: Candidate' % cjob.id)

    # Re-read the same row with a lock (SELECT ... FOR UPDATE).
    job = cjob.for_update()

    if cjob.status != job.status:
        print('Job %s: Already taken' % cjob.id)
        return True

    print('Job %s: Starting...' % job.id)

    job.status = Job.EXECUTING
    job.save()
    # Critical section

# In models.py:
class Job(models.Model):
    # ...

    def for_update(self):
        return Job.objects.raw('SELECT * FROM `backend_job` WHERE `id` = %s FOR UPDATE', (self.id, ))[0]

Currently, Django doesn't have a dedicated for_update method. To avoid rebuilding the query with all the conditions we use to decide whether a job must be run, we run the complex query first and then the simple FOR UPDATE query.
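
For comparison, the same two-step idea (find a candidate first, then touch only that row by id) can be written as a single conditional UPDATE instead of a FOR UPDATE re-read. This is a different technique from the one in the post, shown only as a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the column layout, the PENDING status value, and the claim_job helper are hypothetical.

```python
import sqlite3

PENDING, EXECUTING = "PENDING", "EXECUTING"  # hypothetical status values


def claim_job(conn, job_id):
    """Try to claim one job; return True only if this caller changed its status."""
    with conn:  # commits on success, rolls back on an exception
        cur = conn.execute(
            "UPDATE backend_job SET status = ? WHERE id = ? AND status = ?",
            (EXECUTING, job_id, PENDING),
        )
    # rowcount is 1 if the row was still PENDING, 0 if someone else got it first.
    return cur.rowcount == 1


# Hypothetical in-memory table with a single pending job.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE backend_job (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO backend_job (id, status) VALUES (3, ?)", (PENDING,))
conn.commit()

first = claim_job(conn, 3)    # first claim on a PENDING job
second = claim_job(conn, 3)   # second claim finds it already EXECUTING
print(first, second)
```

Because the status check and the status change happen in one statement, the caller learns whether it won the job from the affected row count rather than from comparing two separate reads.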

I don't really see how this could cause the trouble we see. We run the query, followed by a statement that blocks while another runner holds the lock on the job. The lock is only released after the job's status has been changed. The second runner then gets the lock, but the job's status has changed, so it returns from the method, only to re-enter it later. At that point the cjob query should not return the same job again, because its status is now excluded by the filter.
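
One possibility worth checking, stated here as an assumption rather than a diagnosis: if a runner issues all of its queries inside one long-lived transaction under InnoDB's default REPEATABLE READ isolation level, the plain cjob query is a consistent read from the snapshot taken at that transaction's first read, while SELECT ... FOR UPDATE is a locking read that returns the latest committed row. The two reads could then keep disagreeing. The toy model below is not real InnoDB; ToyTransaction, attempt, and the PENDING status are hypothetical names used only to illustrate that behaviour.

```python
PENDING, EXECUTING = "PENDING", "EXECUTING"  # hypothetical status values


class ToyTransaction:
    """Hypothetical stand-in for one long-lived REPEATABLE READ transaction."""

    def __init__(self, committed):
        self.committed = committed  # shared dict: job id -> latest committed status
        self.snapshot = None        # frozen on the first plain read

    def plain_select(self, job_id):
        # Consistent read: freeze the snapshot once, then keep reusing it.
        if self.snapshot is None:
            self.snapshot = dict(self.committed)
        return self.snapshot[job_id]

    def select_for_update(self, job_id):
        # Locking read: always sees the latest committed value.
        return self.committed[job_id]

    def commit(self):
        # Ending the transaction discards the snapshot.
        self.snapshot = None


def attempt(txn, job_id):
    """One pass of the candidate / re-check logic from some_method()."""
    seen = txn.plain_select(job_id)
    if seen == EXECUTING:
        return "no candidate"
    latest = txn.select_for_update(job_id)
    return "already taken" if seen != latest else "starting"


committed = {3: PENDING}
runner_b = ToyTransaction(committed)
runner_b.plain_select(3)          # runner B's snapshot is taken here
committed[3] = EXECUTING          # runner A claims job 3 and commits

stuck = [attempt(runner_b, 3) for _ in range(3)]
runner_b.commit()                 # start a fresh transaction
after_commit = attempt(runner_b, 3)

print(stuck)
print(after_commit)
```

In this model the runner reports "already taken" on every pass until it ends its transaction, which mirrors the repeating Candidate / Already taken log lines. Whether the real runners behave this way depends on how their transactions are managed, which the post does not show.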

Am I misinterpreting the FOR UPDATE clause, or am I missing something else?

Note that I use MySQL with InnoDB, and that Celery is not a good fit for this solution.
