How can I make my Perl script use multiple cores for child processes?
I'm working on a mathematical model that uses data generated from XFOIL, a popular aerospace tool used to find the lift and drag coefficients on airfoils.
I have a Perl script that calls XFOIL repeatedly with different input parameters to generate the data I need. I need XFOIL to run 5,600 times, at around 100 seconds per run, so it will take about 6.5 days to complete.
I have a quad-core machine, but my experience as a programmer is limited, and I really only know how to use basic Perl.
I would like to run four instances of XFOIL at a time, all on their own core. Something like this:
while ( 1 ) {
    for ( i = 1..4 ) {
        if ( ! exists XFOIL_instance(i) ) {
            start_new_XFOIL_instance(i, input_parameter_list);
        }
    }
}
So the program is checking (or preferably sleeping) until an XFOIL instance is free, when we can start a new instance with the new input parameter list.
Comments (5)
Try Parallel::ForkManager. It's a module that provides a simple interface for forking off processes like this.
Here's some example code:
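The sketch below is one way this could look, not a definitive implementation. The param_N labels, the $out_dir location, and the body of start_new_XFOIL_instance are placeholder assumptions; only the Parallel::ForkManager calls (new, start, finish, wait_all_children) come from the module itself.

```perl
use strict;
use warnings;
use Parallel::ForkManager;

# Assumed setup: outputs land in the current directory and each
# parameter set is just a label. Replace both with your real values.
my $out_dir        = '.';
my @parameter_sets = map { "param_$_" } 1 .. 12;

my $pm = Parallel::ForkManager->new(4);    # at most 4 children at a time

for my $params (@parameter_sets) {
    next if output_exists($params);        # skip runs that already finished
    $pm->start and next;                   # parent: go on to the next set
    start_new_XFOIL_instance($params);     # child: do one run
    $pm->finish;                           # child: exit
}
$pm->wait_all_children;                    # block until every child is done

sub output_exists {
    my ($params) = @_;
    return -f "$out_dir/$params.out";
}

# Placeholder for the real XFOIL call (for example via system()).
sub start_new_XFOIL_instance {
    my ($params) = @_;
    open my $fh, '>', "$out_dir/$params.out" or die "Cannot write: $!";
    print {$fh} "results for $params\n";
    close $fh or die "Cannot close: $!";
}
```

When four children are already running, start blocks in the parent until one of them exits, so the loop sleeps rather than spins while it waits for a free slot.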
You'll need to supply your own implementations for the start_new_XFOIL_instance and the output_exists functions, and you'll also want to define your own sets of parameters to pass to XFOIL.
Perl threads will take advantage of multiple cores and processors. The main advantage of threads is that it's fairly easy to share data between them and coordinate their activities. A forked process cannot easily return data to the parent, nor can forked processes easily coordinate among themselves.
The main cons of Perl threads are that they are relatively expensive to create compared to a fork, since they must copy the entire program and all its data; that you must have them compiled into your Perl; and that they can be buggy: the older the Perl, the buggier the threads. If your work is expensive, the creation time should not matter.
Here's an example of how you might do it with threads. There are many ways to do it; this one uses Thread::Queue to create a big list of work your worker threads can share. When the queue is empty, the threads exit. The main advantages are that it's easier to control how many threads are active, and you don't have to create a new, expensive thread for each bit of work.
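A minimal sketch of that queue-based approach follows. It is an illustration rather than a definitive implementation: run_xfoil and the param_N labels are hypothetical placeholders, and it assumes a Perl built with thread support.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical stand-in for one XFOIL run; replace with the real call.
sub run_xfoil {
    my ($params) = @_;
    return "done: $params";
}

# Placeholder inputs, all pushed onto the queue up front.
my @parameter_sets = map { "param_$_" } 1 .. 12;
my $queue          = Thread::Queue->new(@parameter_sets);

# Start four workers; each pulls work until the queue is empty.
my @workers;
for ( 1 .. 4 ) {
    push @workers, threads->create(
        { 'context' => 'list' },    # let the thread return a list
        sub {
            my @done;
            # dequeue_nb returns undef at once when the queue is empty
            while ( defined( my $params = $queue->dequeue_nb ) ) {
                push @done, run_xfoil($params);
            }
            return @done;
        }
    );
}

# Wait for every worker and collect what each one returned.
my @results = map { $_->join } @workers;
print scalar(@results), " runs finished\n";
```

Only four threads are ever created, regardless of how many items are queued, which keeps the thread creation cost fixed.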
This example shoves all the work into the queue at once, but there's no reason you can't add to the queue as you go. If you were to do that, you'd use dequeue instead of dequeue_nb; dequeue blocks and waits around for more input.
It looks like you could use Gearman for this project.
www.gearman.org
Gearman is a job queue. You can split your work flow into a lot of mini parts.
I would recommend using amazon.com, or even their auctionable servers, to complete this project.
Spending 10 cents per computing hour or less can significantly speed up your project.
I would use Gearman locally and make sure you have a "perfect" run for 5-10 of your subjobs before handing it off to an Amazon compute farm.
Did you consider GNU parallel?
It will allow you to run several instances of your program with different inputs and fill your CPU cores as they become available. It's often a very simple and efficient way to achieve parallelization of simple tasks.
This is quite old, but if someone is still looking for suitable answers to this question, you might want to consider the Perl Many-Core Engine (MCE).