Condition, Block, Module - which way is the most memory and computationally efficient?

Posted on 2024-12-07 08:08:08

There are always several ways to do the same thing in Mathematica. For example, when adapting WReach's solution for my recent problem I used Condition:

ClearAll[ff];
SetAttributes[ff, HoldAllComplete];
ff[expr_] /; (Unset[done]; True) := 
 Internal`WithLocalSettings[Null, done = f[expr], 
  AbortProtect[If[! ValueQ[done], Print["Interrupt!"]]; Unset[done]]]

However, we can do the same thing with Block:

ClearAll[ff];
SetAttributes[ff, HoldAllComplete];
ff[expr_] := 
 Block[{done}, 
  Internal`WithLocalSettings[Null, done = f[expr], 
   AbortProtect[If[! ValueQ[done], Print["Interrupt!"]]]]]

Or with Module:

ClearAll[ff];
SetAttributes[ff, HoldAllComplete];
ff[expr_] := 
 Module[{done}, 
  Internal`WithLocalSettings[Null, done = f[expr], 
   AbortProtect[If[! ValueQ[done], Print["Interrupt!"]]]]]

There are probably several other ways to do the same thing. Which one is the most efficient in terms of memory and CPU use? (f may return very large arrays of data, but it may also return very small ones.)

Comments (1)

耳根太软 2024-12-14 08:08:08

Both Module and Block are quite efficient, so the overhead they induce is only noticeable when the body of the function whose variables you localize does very little. There are two major sources of overhead: the scoping construct itself (scoping constructs must analyze the code they enclose to resolve possible name conflicts and bind variables; this applies to both Module and Block), and the creation and destruction of new symbols in the symbol table (Module only). For this reason, Block is somewhat faster. To see how much faster, you can do a simple experiment:

In[14]:= 
Clear[f,fm,fb,fmp]; 
f[x_]:=x;
fm[x_]:=Module[{xl = x},xl];
fb[x_]:=Block[{xl = x},xl];
Module[{xl},fmp[x_]:= xl=x]

Here we defined four functions with the simplest possible bodies: each just returns its argument, possibly after assigning it to a local variable. We can expect the effect to be most pronounced here, since the body does very little.

In[19]:= f/@Range[100000];//Timing
Out[19]= {0.063,Null}

In[20]:= fm/@Range[100000];//Timing
Out[20]= {0.343,Null}

In[21]:= fb/@Range[100000];//Timing
Out[21]= {0.172,Null}

In[22]:= fmp/@Range[100000];//Timing
Out[22]= {0.109,Null} 

From these timings, we see that Block is about twice as fast as Module. The last version, which uses a persistent variable created by Module only once (when the function is defined), is in turn about 1.6 times faster than Block (0.109 s versus 0.172 s) and not far from a plain function invocation, because the persistent variable is created only once and there is no scoping overhead when the function is applied.
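To make the "created only once" point concrete, here is a minimal sketch. The name syms is introduced only for illustration. It shows that every evaluation of Module produces a fresh symbol, whereas the definition of fmp captures a single symbol at definition time.

```mathematica
ClearAll[fmp];

(* Every evaluation of Module creates a fresh symbol of the form xl$nnn *)
syms = Table[Module[{xl}, xl], {3}];
Length[DeleteDuplicates[syms]]   (* 3: three distinct symbols *)

(* Here Module is evaluated once, when fmp is defined *)
Module[{xl}, fmp[x_] := xl = x];

(* The stored rule refers to one fixed xl$nnn symbol, reused on every call *)
DownValues[fmp]
```

Inspecting DownValues[fmp] should show a rule whose right-hand side assigns to a single renamed symbol, which is why calling fmp involves no per-call symbol creation.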

For real functions, most of the time the overhead of either Module or Block should not matter, so I'd use whichever is safer (usually Module). If it does matter, one option is to use persistent local variables created by Module only once. If even this overhead is significant, I'd reconsider the design, since your function then obviously does too little.

There are cases when Block is more beneficial, for example when you want to be sure that all the memory used by local variables will be released automatically (this is particularly relevant for local variables with DownValues, since they are not always garbage-collected when created by Module). Another reason to use Block is when you expect interrupts such as exceptions or aborts and want the local variables to be reset automatically (which Block does). By using Block, however, you risk name collisions, since it binds variables dynamically rather than lexically.
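The difference between dynamic and lexical binding, and the automatic reset on abort, can be illustrated with a small sketch. The names g, hBlock, hModule, y and z are hypothetical and chosen only for this example.

```mathematica
ClearAll[g, hBlock, hModule, y, z];

g[] := y + 1;                        (* g refers to the global symbol y *)
hBlock[] := Block[{y = 10}, g[]];    (* dynamic binding: g sees y = 10 *)
hModule[] := Module[{y = 10}, g[]];  (* lexical binding: g still sees the global y *)

hBlock[]    (* 11 *)
hModule[]   (* 1 + y *)

(* Block restores the previous value even when the body is aborted *)
z = 1;
CheckAbort[Block[{z = 2}, Abort[]], "aborted"];
z           (* 1 *)
```

The hBlock result is exactly the kind of name collision to watch for: any function called inside the Block that happens to use a symbol named y is affected by the temporary value.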

So, to summarize, my suggestion in most cases is this: if you feel that your function has a serious memory or run-time inefficiency, look elsewhere, because it is very rare for scoping constructs to be the major bottleneck. Exceptions include Module variables that hold accumulated data and are not garbage-collected, very lightweight functions that are called very frequently, and functions that operate on very efficient low-level structures such as packed arrays and sparse arrays. In the last case the symbolic scoping overhead may be comparable to the time the function needs to process its data, since the body is very efficient and uses fast functions that bypass the main evaluator.

EDIT

By combining Block and Module in the fashion suggested here:

Module[{xl}, fmbp[x_] := Block[{xl = x}, xl]]

you can have the best of both worlds: a function as fast as a Block-scoped one and as safe as one that uses Module.
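A rough way to check this claim on your own machine is to time all five variants side by side. The sketch below reuses the definitions from the experiment above; the name data and the use of AbsoluteTiming with TableForm are choices made for this illustration, and no particular timings are implied since they depend on version and hardware.

```mathematica
ClearAll[f, fm, fb, fmp, fmbp];
f[x_] := x;
fm[x_] := Module[{xl = x}, xl];
fb[x_] := Block[{xl = x}, xl];
Module[{xl}, fmp[x_] := xl = x];
Module[{xl}, fmbp[x_] := Block[{xl = x}, xl]];

data = Range[100000];

(* One row per function: its name and the wall-clock seconds for mapping it over data *)
TableForm[
 Table[{fn, First@AbsoluteTiming[fn /@ data;]}, {fn, {f, fm, fb, fmp, fmbp}}],
 TableHeadings -> {None, {"function", "seconds"}}]
```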
