How do I send a signal (or notification) from one MPI process to another in C?

Posted on 2024-09-02 22:17:25

How can I make an MPI process notify the others about an error, for example, especially in an MPI program where all the MPI processes are independent of each other (there is no synchronisation between the different MPI processes)?

Thanks


Comments (4)

绮筵 2024-09-09 22:17:25

I find your idea of an MPI program in which all the processes are independent very strange. By definition, the processes in an MPI program are not independent: after you have called MPI_INIT they are all, for example, in the same communicator, so they all 'know' of each other's existence. You may have written your code so that the processes do not synchronise after that, but the means still exist for processes to communicate with each other.

One mechanism to look into (which does require synchronisation) is MPI_BCAST (broadcast). Another approach would be to use MPI_ISEND, the non-blocking send operation, but sooner or later one process or another will have to receive, and your sending process ought to test whether the send has completed.

妳是的陽光 2024-09-09 22:17:25

The disparity you point out makes me wonder: why are you using MPI? It doesn't seem to fit your problem, and there's not much worse than trying to shove a square peg into MPI's round hole(s). "No synchronization between MPI processes" makes it sound like you've taken a workload that is inherently serial farming and are trying to turn it into MPI.

That said, you can probably do what you want simply by polling periodically with MPI_Irecv and MPI_Test.

成熟稳重的好男人 2024-09-09 22:17:25

Being independent and having no synchronisation are two entirely different scenarios when dealing with MPI, thanks to non-blocking communication.

It seems to me that what you want can be implemented this way: when an error occurs, a process broadcasts a message with a designated "error" tag, and each process periodically posts non-blocking receives for a message with this tag. If a process receives such a message, it means that an error occurred recently and it can react accordingly; otherwise it continues its normal execution.

(Note that "broadcasting" in this case doesn't refer to MPI_Bcast, since that's a collective communication operation and as such blocks. Instead, it simply means sending the same message to everyone it may concern. If you want to maintain no synchronisation between the processes, then this sending will have to be non-blocking as well.)

影子是时光的心 2024-09-09 22:17:25

There is nothing in the MPI Standard that allows an "interrupt" to be sent from one rank to another rank (or ranks). In general, progress requires that user code enter the MPI library from time to time. Absent such progress, there is no standard way to communicate between the ranks.

Synchronization likewise requires some entry into the MPI library from time to time. MPI_Barrier is the "big hammer" approach to synchronization. Combined with MPI_Reduce_Scatter, it would make it possible to know that there is some error on at least one rank.
