Atomic update of the first instance in a QuerySet

Posted 2024-08-16 17:26:25

I'm working on a system that has to handle a number of race conditions when serving jobs to a number of worker machines.

The clients query the system for jobs with status='0' (ToDo), then atomically update the 'oldest' row to status='1' (Locked) and retrieve the id of that row (so the job can be updated with worker information, such as which machine is working on it).

The main issue is that any number of clients may be updating at the same time. One solution would be to lock around 20 of the rows with status='0', update the oldest one, and release all the locks afterwards. I've been looking into the TransactionMiddleware, but I don't see how it would prevent the oldest row from being updated from under me after I query it.

I've also looked into QuerySet.update(), and it looks promising, but if two clients get hold of the same record, the status would simply be updated twice and we would have two workers working on the same job. I'm really at a loss here.

I also found ticket #2705, which seems to handle the case nicely, but I have no idea how to get the code from there because of my limited SVN experience (the latest attachments are simply diffs, and I don't know how to merge them into trunk).

Code (Result = Job):

from django.db import models

class Result(models.Model):
    """
    Result: completed- and pending runs

    'ToDo': job hasn't been acquired by a client
    'Locked': job has been acquired
    'Paused'
    """
    # relations
    run = models.ForeignKey(Run)
    input = models.ForeignKey(Input)

    PROOF_CHOICES = (
        (1, 'Maybe'),
        (2, 'No'),
        (3, 'Yes'),
        (4, 'Killed'),
        (5, 'Error'),
        (6, 'NA'),
    )
    proof_status = models.IntegerField(
        choices=PROOF_CHOICES,
        default=6,
        editable=False)

    STATUS_CHOICES = (
        (0, 'ToDo'),
        (1, 'Locked'),
        (2, 'Done'),
    )
    result_status = models.IntegerField(choices=STATUS_CHOICES, editable=False, default=0)

    # != 'None' => status = 'Done'
    proof_data = models.FileField(upload_to='results/',
        null=True, blank=True)
    # part of the proof_data
    stderr = models.TextField(editable=False,
        null=True, blank=True)

    realtime = models.TimeField(editable=False,
        null=True, blank=True)
    usertime = models.TimeField(editable=False,
        null=True, blank=True)
    systemtime = models.TimeField(editable=False,
        null=True, blank=True)

    # updated when client sets status to locked
    start_time = models.DateTimeField(editable=False)

    worker = models.ForeignKey('Worker', related_name='solved',
        null=True, blank=True)

北斗星光 2024-08-23 17:26:25

To merge #2705 into your Django checkout, first download the patch:

cd <django-dir>
wget -O for_update_11366_cdestigter.diff 'http://code.djangoproject.com/attachment/ticket/2705/for_update_11366_cdestigter.diff?format=raw'

Then rewind svn to the Django revision the patch was made against:

svn update -r11366

Then apply it:

patch -p1 < for_update_11366_cdestigter.diff

patch will report which files were patched successfully and which were not. In the unlikely case of conflicts, you can fix them manually by looking at http://code.djangoproject.com/attachment/ticket/2705/for_update_11366_cdestigter.diff

To unapply the patch, run:

svn revert --recursive .

滥情稳全场 2024-08-23 17:26:25

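If your Django is running on one machine, there is a much simpler way to do it. Excuse the pseudo-code, as the details of your implementation aren't clear. Note that a threading.Lock only serializes threads within a single process, so this assumes Django runs as one process.

from threading import Lock

# Module-level lock shared by all request threads in this process.
workers_lock = Lock()

def get_work(request):
    # Hold the lock while selecting and marking the job, so that no other
    # thread can pick the same one. The lock is released even on error.
    with workers_lock:
        # Imagine this method exists for brevity
        work_item = WorkItem.get_oldest()
        work_item.result_status = 1  # 1 = Locked
        work_item.save()

    return work_item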
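
The locking idea above can be tried without Django at all. The following is only a sketch under stated assumptions: the `jobs` list is a hypothetical in-memory stand-in for the job table, and `claim_oldest` is an illustrative name, not part of any library. It shows eight threads competing for five jobs, with the lock ensuring that each job is handed out once.

```python
import threading

# Hypothetical in-memory stand-in for the job table, ordered oldest first.
jobs = [{"id": i, "result_status": 0} for i in range(1, 6)]
workers_lock = threading.Lock()

def claim_oldest():
    """Mark the oldest ToDo job as Locked and return its id, or None."""
    with workers_lock:  # only one thread at a time may select and update
        for job in jobs:
            if job["result_status"] == 0:  # 0 = ToDo
                job["result_status"] = 1   # 1 = Locked
                return job["id"]
    return None

# Eight workers compete for five jobs; each writes to its own result slot.
results = [None] * 8

def worker(slot):
    results[slot] = claim_oldest()

threads = [threading.Thread(target=worker, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After the threads finish, `results` holds each job id exactly once, and the workers that arrived too late hold `None`.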

苦妄 2024-08-23 17:26:25

Two options come to mind. One is to lock rows immediately upon retrieval and release the lock only once the chosen row has been marked as in use. The drawback is that, while the lock is held, no other client process can even look at the jobs that don't get selected. If you're always just automatically selecting the last one, the window may be brief enough to be acceptable.

The other option is to bring back the rows that are open at the time of the query, but then check again whenever a client tries to grab a job. When a client attempts to claim a job, a check is first done to see whether it is still available. If someone else has already grabbed it, a notification is sent back to the client. This lets all clients see all jobs as snapshots, but if they are constantly grabbing the latest one, clients may constantly receive notifications that a job is already in use. Maybe this is the race condition you're referring to?
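
The check-again approach can be folded into the update itself: make the UPDATE conditional on the job still being ToDo, and look at how many rows it changed. Below is a minimal sketch using the standard-library `sqlite3` module rather than Django. The table layout, the function names `try_claim` and `claim_oldest`, and the use of the lowest id as the "oldest" job are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE result ("
    "id INTEGER PRIMARY KEY, result_status INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO result (id) VALUES (?)", [(1,), (2,), (3,)])
conn.commit()

def try_claim(conn, job_id):
    """Lock the job only if it is still ToDo; True means this caller won."""
    cur = conn.execute(
        "UPDATE result SET result_status = 1 "
        "WHERE id = ? AND result_status = 0",
        (job_id,),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows changed: someone else got there first

def claim_oldest(conn, max_attempts=5):
    """Pick the oldest ToDo job, retrying if another client claims it first."""
    for _ in range(max_attempts):
        row = conn.execute(
            "SELECT id FROM result WHERE result_status = 0 "
            "ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None  # nothing left to do
        if try_claim(conn, row[0]):
            return row[0]
    return None
```

In current Django versions the same idea would presumably be written as `Result.objects.filter(pk=job_id, result_status=0).update(result_status=1)`, since `update()` returns the number of rows matched. A return value of 0 then plays the role of the "already in use" notification.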

One way around that is to return the jobs to the clients in specific groups, so that they do not always get the same lists. For example, break them down by geographic area or even just randomly. Each client could have an ID from 0 to 9: take the job ID modulo 10 and send the client the jobs whose last digit matches its ID. Don't limit it to just those jobs, though, as you don't want jobs that nobody can reach. For example, if you had clients 1, 2, and 3 and a job 104, no one would be able to get to it. So once there aren't enough jobs with the matching last digit, jobs with other digits are returned to fill the list. You might need to play around with the exact algorithm, but hopefully this gives you the idea.
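
The grouping scheme might look roughly like the sketch below. The function name `jobs_for_client` and the `num_clients` and `limit` parameters are hypothetical, chosen only to illustrate the modulo-with-fallback idea.

```python
def jobs_for_client(job_ids, client_id, num_clients=10, limit=5):
    """Return up to `limit` job ids, preferring the client's own group.

    Jobs whose id modulo `num_clients` equals `client_id` come first.
    Other jobs fill the remaining slots so that no job is unreachable.
    """
    preferred = [j for j in job_ids if j % num_clients == client_id]
    others = [j for j in job_ids if j % num_clients != client_id]
    return (preferred + others)[:limit]

# Client 1 sees its own jobs (101, 111) first, then a fallback job.
first_batch = jobs_for_client([101, 102, 104, 111, 123], client_id=1, limit=3)

# Job 104 matches none of clients 1, 2 or 3, but the fallback still offers it.
orphan_batch = jobs_for_client([104], client_id=3)
```

Grouping only reduces how often clients collide. Claiming a job still needs an atomic check, because the fallback jobs are visible to several clients at once.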

How you lock the rows in your database in order to update them and/or send back the notifications will largely depend on your RDBMS. In MS SQL Server you could wrap all of that work nicely in a stored procedure as long as user intervention isn't needed in the middle of it.
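
As one concrete illustration of the lock-on-retrieval option, here is a sketch using the standard-library `sqlite3` module. SQLite is an assumption made for the sake of a self-contained example, and it locks the whole database rather than individual rows. On databases with row-level locking the usual counterpart is `SELECT ... FOR UPDATE`, which Django exposes as `QuerySet.select_for_update()`; the `for_update` patch from ticket #2705 appears to target the same feature. The table layout and the name `claim_oldest_locked` are hypothetical.

```python
import sqlite3

# isolation_level=None selects autocommit mode, so BEGIN and COMMIT
# can be issued explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE result ("
    "id INTEGER PRIMARY KEY, result_status INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO result (id) VALUES (?)", [(1,), (2,)])

def claim_oldest_locked(conn):
    """Select and lock the oldest ToDo job while holding the write lock."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT id FROM result WHERE result_status = 0 "
            "ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE result SET result_status = 1 WHERE id = ?", (row[0],)
            )
        conn.execute("COMMIT")  # releases the lock
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return row[0] if row is not None else None
```

Because the lock is taken before the SELECT, no other writer can change the chosen row between the read and the update. The cost is that other writers wait for the duration of the transaction.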

I hope this helps.
