Semaphores and locks in MATLAB

Posted on 2024-11-16 02:35:06

I am working on a MATLAB project where I would like to have two instances of MATLAB running in parallel and sharing data. I will call these instances MAT_1 and MAT_2. More specifically, the architecture of the system is:

  1. MAT_1 processes images sequentially, reading them one by one using imread, and outputs the result for each image using imwrite.
  2. MAT_2 reads the images output by MAT_1 using imread and outputs its result somewhere else.

One of the problems I think I need to address is guaranteeing that MAT_2 reads an image output by MAT_1 only once MAT_1 has fully finished writing it.

My questions are:

  1. How would you approach this problem? Do I need to use semaphores or locks to prevent race conditions?
  2. Does MATLAB provide any mechanism to lock files? (i.e. something similar to flock, but provided by MATLAB directly, and that works on multiple platforms, e.g. Windows & Linux). If not, do you know of any third-party library that I can use to build this mechanism in MATLAB?

EDIT:

  • As @yoda points out below, the Parallel Computing Toolbox (PCT) allows for blocking calls between MATLAB workers, which is great. That said, I am particularly interested in solutions that do not require the PCT.
  • Why do I require MAT_1 and MAT_2 to run in parallel?

    The processing done in MAT_2 is slower on average (and more prone to crashing) than MAT_1, and the output of MAT_1 feeds other programs and processes (including human inspection) that do not need to wait for MAT_2 to do its job.

Answers:

  • For a solution that allows for the implementation of semaphores but does not rely on the PCT, see Jonas' answer below
  • For other good approaches to the problem, see Yoda's answer below

Comments (6)

水水月牙 2024-11-23 02:35:06

I would approach this using semaphores; in my experience the PCT is unreasonably slow at synchronization.

dfacto (another answer) has a great implementation of semaphores for MATLAB, however it will not work on MS Windows; I improved on that work so that it would. The improved work is here: http://www.mathworks.com/matlabcentral/fileexchange/45504-semaphoreposixandwindows

This will perform better than interfacing with Java, .NET, the PCT, or file locks. It does not use the Parallel Computing Toolbox (PCT), and AFAIK semaphore functionality isn't in the PCT anyway (puzzling that they left it out!). It is possible to use the PCT for synchronization, but everything I tried with it was unreasonably slow.

To install this high-performance semaphore library into MATLAB, run this within the MATLAB interpreter:
mex -O -v semaphore.c

You'll need a C/C++ compiler that mex supports installed to compile semaphore.c into a binary MEX-file. That MEX-file is then callable from your MATLAB code as shown in the example below.

Usage example:

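function Example()
    semkey = 1234;                      % key shared by every process using this semaphore
    semaphore('create', semkey, 1);     % initial value 1, so it behaves like a mutex
    funList = {@fun, @fun, @fun};
    parfor i = 1:length(funList)        % runs in parallel only if the PCT is available
        funList{i}(semkey);
    end
end

function fun(semkey)
    semaphore('wait', semkey);          % block until the semaphore is available
    disp('hey');                        % critical section
    semaphore('post', semkey);          % release the semaphore
end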

黎歌 2024-11-23 02:35:06

Personally, I'd use the Parallel Computing Toolbox for this.

As far as I know, there is no straightforward way in MATLAB to have system-wide file locks. However, in order to ensure that MATLAB #2 only reads the output of MATLAB #1 when the file has finished writing, I suggest that after writing, e.g., the file results_1.mat, MATLAB #1 writes a second file, results_1.finished, which is an empty text file. Since the second file is written after the first, its existence signals that the results file has been written. You can thus search for files with the extension finished, i.e. dir('*.finished'), and use fileparts to get the name of the .mat file you'd like to load with MATLAB #2.
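
The marker-file idea could look roughly like the sketch below. It is only an illustration: the folder name, the variable result, and the choice to delete each marker after loading are assumptions, not part of the suggestion above. Both halves run in one script here; in practice the first half would live in MATLAB #1 and the second in MATLAB #2.

```matlab
% Hypothetical shared folder (an assumption for this sketch).
outDir = fullfile(tempdir, 'mat1_output_demo');
if ~exist(outDir, 'dir'), mkdir(outDir); end

% --- MATLAB #1 (writer): save the result, then create the empty marker ---
result = magic(4);                                   % stand-in for the real output
save(fullfile(outDir, 'results_1.mat'), 'result');
fid = fopen(fullfile(outDir, 'results_1.finished'), 'w');
fclose(fid);                                         % marker is created only after save returned

% --- MATLAB #2 (reader): load only results that have a marker ---
markers = dir(fullfile(outDir, '*.finished'));
loaded = {};
for k = 1:numel(markers)
    [~, baseName] = fileparts(markers(k).name);      % e.g. 'results_1'
    data = load(fullfile(outDir, [baseName '.mat']));
    loaded{end+1} = data.result;                     %#ok<AGROW>
    delete(fullfile(outDir, markers(k).name));       % avoid processing the same result twice
end
```

A real reader would probably wrap the second half in a loop with a short pause between scans, so that it keeps picking up new markers as MATLAB #1 produces them.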

翻身的咸鱼 2024-11-23 02:35:06

I am not sure if you are looking for a MATLAB-only solution, but I have just submitted a semaphore wrapper for use in MATLAB. It works as a generic semaphore, but it was mainly designed with sharedmatrix in mind.

As soon as MathWorks accepts the submission, I will update the link on my research group's blog.

Please note that this MEX file is a wrapper for the POSIX semaphore functionality. As such, it will work on Linux, Unix, and macOS but will not work out of the box on Windows. It may work when compiled against the Cygwin libraries.

花辞树 2024-11-23 02:35:06

I don't think there is a foolproof way other than using OS-specific locks. One approach might be to have MAT_1 do the following (here img is the processed image data):

imwrite(img, fileName);
movefile(fileName, completedFileName);

and have MAT_2 only process completedFileName.
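
A slightly fuller sketch of this write-then-rename pattern is shown below. The folder and the .partial/.done naming scheme are hypothetical. It also assumes both names sit on the same file system, where a rename is normally a single quick operation; whether movefile is strictly atomic on a given platform is not something this sketch guarantees.

```matlab
% Hypothetical output folder and stand-in image data (assumptions for this sketch).
outDir = fullfile(tempdir, 'mat1_rename_demo');
if ~exist(outDir, 'dir'), mkdir(outDir); end
img = uint8(magic(8));

% --- MAT_1: write under a temporary name, then rename when done ---
tmpName   = fullfile(outDir, 'image_001.partial.png');
finalName = fullfile(outDir, 'image_001.done.png');
imwrite(img, tmpName);                 % MAT_2 never looks at *.partial.png
movefile(tmpName, finalName, 'f');     % the file appears under its final name only now

% --- MAT_2: only ever list completed names ---
ready = dir(fullfile(outDir, '*.done.png'));
for k = 1:numel(ready)
    readBack = imread(fullfile(outDir, ready(k).name));
    % ... slower processing of readBack would go here ...
end
```

Because MAT_2 filters on the final name, a crash in MAT_1 halfway through imwrite leaves only a .partial file behind, which MAT_2 ignores.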

初雪 2024-11-23 02:35:06

EDIT:

After seeing your edit, a simple solution not involving the use of any toolboxes is the following:

Since MAT_2 is much slower than MAT_1, start MAT_2 with a delay. That is, start it once MAT_1 has finished processing, say, 5 images or so. If you do this, MAT_2 will never catch up with MAT_1 and hence will never be in a situation where it has to "wait" for images from MAT_1.


I'm still not clear on a few things from your question:

  1. You say MAT_1 processes images sequentially, but does it have to? In other words, does the order in which they are processed matter?
  2. You say MAT_2 reads the output from MAT_1... Does it have to be in the order that MAT_1 finishes or can that be any order?
  3. You say MAT_2 reads the image using imread and outputs it somewhere else. Is there any reason that task cannot be combined into MAT_1?

In any case, you can implement some form of execution blocking using the Parallel Computing Toolbox; but instead of using parfor loops (which is what most people use), you'll have to create a distributed job (example).

The important thing to note is that each worker (lab) has a labindex, and you can use labSend to send data from worker 1 (equivalent of MAT_1) to worker 2 (equivalent of MAT_2), which then receives it using labReceive. From the documentation on labReceive:

This function blocks execution in the lab until the corresponding call to labSend occurs in the sending lab.

which is pretty much what you wanted to do with MAT_1 and MAT_2.
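
As a rough illustration of that blocking hand-off, the sketch below uses an spmd block rather than a full distributed job, which is a simplification of what is described above. It assumes the Parallel Computing Toolbox is installed and a pool with at least two workers can be opened; the integer payload and the empty-array "done" signal are made up for the example. Newer MATLAB releases offer spmdIndex, spmdSend and spmdReceive as replacements for the names used here.

```matlab
% Requires the Parallel Computing Toolbox and a pool with at least 2 workers.
spmd (2)
    if labindex == 1
        % Worker 1 plays MAT_1: produce items and hand them over one by one.
        for k = 1:3
            labSend(k, 2);             % send item k to worker 2
        end
        labSend([], 2);                % hypothetical convention: empty means "no more items"
    else
        % Worker 2 plays MAT_2: block until the next item arrives.
        received = [];
        while true
            item = labReceive(1);      % blocks until worker 1 sends something
            if isempty(item)
                break;
            end
            received(end+1) = item;    %#ok<AGROW>
        end
    end
end
```

After the block, received{2} on the client holds what worker 2 collected. In the image pipeline the payload would be a file name or the image itself rather than an integer.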

Another way to do this would be to spawn one additional worker in your current session, but only assign tasks performed by MAT_1 to it. You then set the FinishedFcn property for the tasks to execute the set of functions performed by MAT_2, but I wouldn't recommend it as I don't think this was the intent for FinishedFcn, and I don't know if it will break in certain cases.

天涯离梦残月幽梦 2024-11-23 02:35:06

I would also recommend looking at the Parallel Computing Toolbox for such a thing; the functionality you want should be in there somewhere. I think it's cleaner that way than trying to synchronize two instances of MATLAB (unless you are forced to use two instances).

In the odd case that there is no such thing, you might also look at different environments to implement what you want. It might be a bit of a workaround, but you can always interface your MATLAB code with other languages (e.g. Java, .NET, C, ...) and use the functionality you are accustomed to there. With Java you can be fairly confident that your solution is platform independent, whereas .NET only works on Windows (at least in combination with MATLAB).
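
To make the Java route concrete, here is a minimal sketch that takes a file lock through Java's java.nio FileChannel from MATLAB code. The lock file path is hypothetical, MATLAB must be running with its JVM enabled, and how strictly such locks are enforced (advisory or mandatory) depends on the operating system, so treat this as a starting point rather than a guaranteed cross-platform lock.

```matlab
% Hypothetical lock file shared by both MATLAB instances (an assumption).
lockPath = fullfile(tempdir, 'mat_pipeline_demo.lock');

raf     = java.io.RandomAccessFile(lockPath, 'rw');  % creates the file if it is missing
channel = raf.getChannel();

lock = channel.tryLock();          % non-blocking; empty if another process holds the lock
acquired = ~isempty(lock);
if acquired
    % ... critical section: e.g. write or read the shared image here ...
    lock.release();
end

channel.close();
raf.close();
```

Replacing tryLock with channel.lock() makes the call block until the lock is free. These locks are held per Java virtual machine, so they are suited to coordinating separate MATLAB instances, not threads inside one instance.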
