How to use threading in place of looping a subroutine in Perl/PDL

Posted on 2024-09-17 22:48:30

I have a perfectly good Perl subroutine written as part of a Perl module. Without going into too many details, it takes a string and a short list as arguments (often taken from the terminal) and spits out a value (right now, always a floating-point number, but this may not always be the case).

Right now, the list portion of my argument takes two values, say (val1,val2). I save the output of my subroutine for hundreds of different values for val1 and val2 using for loops. Each iteration takes almost a second to complete--so completing this entire process takes hours.

I recently read of a mystical (to me) computational tool called "threading" that apparently can replace for loops with blazing fast execution time. I have been having trouble understanding what these are and do, but I imagine they have something to do with parallel computing (and I would like to have my module as optimized as possible for parallel processors.)

If I save all the values I would like to pass to val1 as a list, say @val1 and the same for val2, how can I use these "threads" to execute my subroutine for every combination of the elements of val1 and val2?
Also, it would be helpful to know how to generalize this procedure to a subroutine that also takes val3, val4, etc.


Comments (2)

煮酒 2024-09-24 22:48:31

Update:

I do not use PDL, so I did not know that a thread in PDL does not correspond exactly to the notion of threading I have been talking about. See PDL threading and signatures:

First we have to explain what we mean by threading in the context of PDL, especially since the term threading already has a distinct meaning in computer science that only partly agrees with its usage within PDL.

However, I think the explanation below is still useful to you as one would need to know what threading in the regular sense is to understand how PDL threads are different.

Here is the Threads entry on Wikipedia for background.

Using threads cannot make your program magically faster. If you have multiple CPUs/cores and if the computations you are carrying out can be divided into independent chunks, using threads can allow your program to carry out more than one computation at a time and cut down on the total execution time.

The easiest case is when the subtasks are embarrassingly parallel, requiring no communication/coordination between threads.
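
The grid of (val1, val2) calls in the question looks like this easy case, since each call is independent of the others. Below is a minimal sketch of splitting such a grid across worker threads. It assumes a perl built with thread support; `compute` is a hypothetical stand-in for the module's subroutine, and the worker count of 2 is an assumption to be matched to the number of cores.

```perl
#!/usr/bin/perl

use strict; use warnings;
use threads;

# Hypothetical stand-in for the real subroutine: a string plus (val1, val2).
sub compute {
    my ($label, $v1, $v2) = @_;
    return $v1 * 10 + $v2;
}

my @val1 = (1 .. 4);
my @val2 = (1 .. 3);

# Every (val1, val2) combination, in a fixed order.
my @pairs = map { my $v1 = $_; map { [$v1, $_] } @val2 } @val1;

# Assumed worker count; worker $w handles pairs $w, $w + $workers, ...
my $workers = 2;

my @threads = map {
    my $w = $_;
    threads->create({ context => 'list' }, sub {
        my @out;
        for (my $i = $w; $i < @pairs; $i += $workers) {
            push @out, $i, compute('run', @{ $pairs[$i] });
        }
        return @out;    # flat list of (index, value) pairs
    });
} 0 .. $workers - 1;

# join hands back each worker's list; the indices restore the original order.
my @result;
for my $thr (@threads) {
    my %part = $thr->join;
    @result[ keys %part ] = values %part;
}

printf "f(%s, %s) = %s\n", @{ $pairs[$_] }, $result[$_] for 0 .. $#pairs;
```

Extending this to val3, val4, and so on should only require another nested `map` when building `@pairs`; the workers themselves do not change. With calls that take about a second each, the cost of starting a few threads is small by comparison.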

Regarding possible performance gains, consider the following program:

#!/usr/bin/perl

use strict; use warnings;
use threads;

my ($n) = @ARGV;

my @threads = map { threads->create(\&act_busy) } 1 .. $n;

$_->join for @threads;

sub act_busy {
    for (1 .. 10_000_000) {
        my $x = 2 * 2;
    }
}

On my dual core laptop running Windows XP:

C:\> timethis t.pl 1
TimeThis :  Elapsed Time :  00:00:02.375
C:\> timethis t.pl 2
TimeThis :  Elapsed Time :  00:00:02.515
C:\> timethis t.pl 3
TimeThis :  Elapsed Time :  00:00:03.734
C:\> timethis t.pl 4
TimeThis :  Elapsed Time :  00:00:04.703
...
C:\> timethis t.pl 10
TimeThis :  Elapsed Time :  00:00:11.703

Now, compare that to:

#!/usr/bin/perl

use strict; use warnings;

my ($n) = @ARGV;

act_busy() for 1 .. $n;

sub act_busy {
    for (1 .. 10_000_000) {
        my $x = 2 * 2;
    }
}

C:\> timethis s.pl 10
TimeThis :  Elapsed Time :  00:00:22.312

牵你的手,一向走下去 2024-09-24 22:48:31

As Sinan says, the "threading" you were probably thinking of is "PDL threading", now renamed (as of 2.075) to "broadcasting" to match the general terminology (see docs). It allows you to replace something like this:

$x = sequence(5);
$x->set($_, $x->at($_)+2) for 0..$x->dim(0)-1;

with just this, since "+=" fundamentally operates on one thing (a zero-dimensional scalar), so with more dimensions than a scalar (such as this 1-dimensional sequence) it can "broadcast":

$x += 2; # does whole ndarray at once

This is also faster because, unlike the for loop, it doesn't have to keep leaving and re-entering the Perl environment (aka "Perl-land") for every element, but can stay in extremely fast "C-land" to do the calculations without that per-element overhead.

The motivation behind its original name was that these "broadcasted" calculations are all independent, and therefore "embarrassingly parallel", so they can be automatically parallelised. See the docs: as of 2.059, PDL by default sets parallel processing to happen automatically, across the number of CPU cores available.
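
For the val1/val2 grid in the question, broadcasting can cover every combination at once, but only if the subroutine can be rewritten purely in terms of elementwise PDL operations, which may not be possible for a routine that takes a string argument. A minimal sketch under that assumption, where `compute_grid` and its formula are hypothetical stand-ins:

```perl
use strict; use warnings;
use PDL;

my @val1 = (1, 2, 3);
my @val2 = (10, 20);

my $v1 = pdl(\@val1);             # dims: (3)
my $v2 = pdl(\@val2)->dummy(0);   # dims: (1, 2), so it varies along dim 1

# Hypothetical stand-in written only with elementwise PDL operations.
sub compute_grid {
    my ($x, $y) = @_;
    return $x * $x + $y;
}

# Broadcasting pairs every val1 with every val2: dims (3, 2).
my $grid = compute_grid($v1, $v2);

print $grid;
print $grid->at(2, 1), "\n";      # the entry for val1 = 3, val2 = 20
```

The `dummy(0)` call inserts a size-1 dimension in front so that val2 runs along a different axis from val1. A val3 could be placed on a third axis in the same way by applying `dummy(0)` twice.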
