Closures over values vs. context

Posted on 2024-12-25 12:38:22

I'm thinking through various implementations of closures and am wondering about the merits of different styles. It seems there are two choices, closing over the execution context or the values. For instance, over the context we have:

a = 1
def f():
  return a
f() # returns 1
a = 2
f() # returns 2

Alternatively, we can close over values and have:

a = 1
def f():
  return a
f() # returns 1
a = 2
f() # returns 1

Are there languages that implement the second? Are there advantages vs. disadvantages?

生生漫 2025-01-01 12:38:22

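I don't think this is a question of context vs. values; it is a question of whether you close over the variable as a reference cell or over the value the variable contains.

If you really do mean context, then you are talking about dynamic scope vs. lexical scope. Wikipedia has an in-depth comparison of the two.

Most languages implement lexical scoping (or try to). Some languages do implement dynamic scoping, notably older Lisps such as Emacs's ELisp. Most languages with closures (e.g. Scheme, Haskell, ML) close over the values in lexical scope. Dynamic scoping is generally considered a bad idea because it is harder to reason about (it is "spooky action at a distance").

Note that even in a lexically scoped language you can get behavior like your first example if you close over reference cells. This is why Scheme and JavaScript closures behave the way they do (their variables are reference cells).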
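
To make the cell-versus-value distinction concrete, here is a minimal Python sketch. Python is used only because the question's pseudocode resembles it, and the names make_closures, by_cell and by_value are made up for illustration. A plain nested function closes over the variable itself, while a default argument is evaluated once, when the function is defined, so it behaves like a captured value.

```python
def make_closures():
    a = 1

    def by_cell():
        # Looks up the enclosing variable `a` each time it is called.
        return a

    def by_value(a=a):
        # The default was evaluated once, when `def` ran: a snapshot of `a`.
        return a

    before = (by_cell(), by_value())
    a = 2  # rebind the enclosing variable after both functions exist
    after = (by_cell(), by_value())
    return before, after


before, after = make_closures()
print(before)  # (1, 1)
print(after)   # (2, 1)
```

Both functions are lexically scoped; the only difference is whether the closure holds the variable's cell or a copy of its contents.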

星軌x 2025-01-01 12:38:22

C++ lambdas can capture explicitly by value:

int a = 1;
auto f1 = [a]() -> int { return a; };
f1() == 1;
a = 2;
f1() == 1;

Or by reference:

a = 1;
auto f2 = [&a]() -> int { return a; };
f2() == 1;
a = 2;
f2() == 2;

You can also implicitly capture either way:

auto f1 = [=]() -> int { return a; };  // captures a by value
auto f2 = [&]() -> int { return a; };  // captures a by reference

The advantage is that you control which variables are captured and whether each one is copied or referenced. A potential disadvantage is that you must beware of lifetime issues, because C++ references are non-owning: if a goes out of scope, then calling f1 is still valid, but calling f2 is undefined behavior. If it's natural and you don't mind the overhead, you could always capture a shared_ptr<T> (pointer with shared ownership).

So for immutable values:

Capturing by value forces a copy. Capturing by reference does not.
Capturing by value has no ownership issues. Capturing by reference does.

For values that the lambda must mutate in the enclosing scope, you must of course capture by reference. Here's a contrived example similar to std::partial_sum():

int sum = 0;
auto f = [&sum](int i) -> int { sum += i; return sum; };

vector<int> input{1, 2, 3, 4, 5};
vector<int> output;
transform(begin(input), end(input), back_inserter(output), f);

sum == 15;
output == vector{1, 3, 6, 10, 15};
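
For comparison, a rough Python analogue of the running-sum example might look like the sketch below. Python has no capture list: a nested function always refers to the enclosing variable itself, and `nonlocal` is needed only to assign to it. The names make_running_sum, add and current are hypothetical, and `itertools.accumulate` is used just as a cross-check.

```python
from itertools import accumulate


def make_running_sum():
    total = 0

    def add(i):
        nonlocal total  # assign to the enclosing variable, not a new local
        total += i
        return total

    def current():
        # Reads the same variable that add() updates.
        return total

    return add, current


add, current = make_running_sum()
output = [add(i) for i in [1, 2, 3, 4, 5]]
print(output)     # [1, 3, 6, 10, 15]
print(current())  # 15
print(output == list(accumulate([1, 2, 3, 4, 5])))  # True
```

Unlike the C++ reference capture, there is no lifetime hazard here: the captured variable stays alive for as long as either closure does.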

蓝眸 2025-01-01 12:38:22

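In most languages that have both closures and mutable variables, closures capture locations, not values (i.e. the first behavior). Examples include Scheme, Python, and JavaScript.

To do this safely, the language must in many cases heap-allocate the mutable variables that closures capture. This is usually implemented as a compiler pass that converts variables that are actually mutated into explicitly allocated mutable cells, after which the rest of the compiler can forget about the issue.

To avoid implicit heap allocation, Java requires (required?) variables captured by inner classes to be declared final (i.e. immutable). Other languages, such as ML and Haskell, avoid the problem entirely because variables are always immutable. As Jon points out in his answer, capturing by reference can be unsafe in C++.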
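
In CPython these cells are visible from the language itself, which makes for a small illustration of the "explicitly allocated mutable cell" idea. The sketch below assumes CPython's documented `__closure__` and `cell_contents` attributes; make_counter and increment are made-up names.

```python
def make_counter():
    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    return increment


counter = make_counter()
cell = counter.__closure__[0]        # the cell object that holds `count`
print(counter.__code__.co_freevars)  # ('count',)
print(cell.cell_contents)            # 0
counter()
counter()
print(cell.cell_contents)            # 2
```

The closure holds a reference to the cell rather than a copy of the integer, which is why the update survives after make_counter has returned.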

来日方长 2025-01-01 12:38:22

Closures should behave as in the first case, but some languages provide the 2nd case.

Smalltalk works according to the first case. Let's assume a class defines methods m and test:

m
| counter c |  "temporary vars"
counter := 0.
c := [ counter := counter + 1. counter ].
^ c. "returns the closure"

test
| c | "temporary vars"
c := self m. "obtain a closure that increments a counter"
c value. "returns 1"
c value. "returns 2"

To think about closures, you must think about the stack. If closure c is defined in method m and closes over the temporary variable counter, the stack frame of m cannot be removed until the closure is garbage collected. Closures are first-class, so you don't know when the last reference to one will go away.

But many closures do not close over any temporary variable, or close over temporary variables that are not modified after the closure is defined. In the latter case, the value of the temporary variable at the moment the closure is defined can be copied into the closure, so that the closure doesn't need a reference to the stack frame of m.

In the case of the closure c above, the closure can copy the value of counter. This is what Java mandates by forcing temporary variables that are closed over to be final.

If method m was

m
| counter c |  "temporary vars"
counter := 0.
c := [ counter := counter + 1. counter ].
counter := 1.
^ c. "returns the closure"

I guess it would defeat the optimization, because counter is mutated after the creation of the closure.

That's how I understand closures at least.
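
The effect of that later assignment can be sketched in Python, whose closures also share the variable. This is an illustration under assumptions, not Smalltalk semantics: m_shared mirrors the second version of m, while m_copied simulates by hand what a copy-at-creation implementation would do, using a hypothetical `snapshot` variable.

```python
def m_shared():
    counter = 0

    def c():
        nonlocal counter
        counter += 1
        return counter

    counter = 1  # assignment after the closure was created
    return c


def m_copied():
    counter = 0
    snapshot = counter  # what a copying implementation would have captured

    def c():
        nonlocal snapshot
        snapshot += 1
        return snapshot

    counter = 1  # no longer visible to the closure
    return c


shared = m_shared()
print(shared(), shared())  # 2 3
copied = m_copied()
print(copied(), copied())  # 1 2
```

The two versions give different results, so copying is only a safe optimization when nothing assigns to the variable after the closure is created.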

So要识趣 2025-01-01 12:38:22

Felix actually provides quite complex semantics, which are sometimes counter-intuitive. Closures capture the context via a pointer to the context's frame, at the point the closures are formed. Therefore you would expect the captured variable always to reflect the current value of the variable at the time the closure is executed.

This is not the case, because the optimiser may replace the variable with its value, in particular, if the "variable" is declared like:

val x = 1;

it is taken as an immutable value, and such a substitution is deemed safe. This is true even if the value is passed as an argument! For example:

fun f(x:int) () => x;
val y = 1;
val fy = f y;  // closure formed
println$ fy();

It's likely we have fy defined as if:

val fy = fun () => 1;

had been written. In this case it may be the same for a variable:

var z = 1;
val fz = f z;
z = 2;
println$ fz (); // prints 1 .. maybe

by replacing the x with the value of z at the time of closure formation BUT it could also print 2, by replacing the x with the variable name z instead.

In Felix, it is not determinate which optimisation is applied and that is deliberate: it allows the compiler the freedom to choose (what it thinks is) the best optimisation.

If you want to force an interpretation you can: for the parameter argument:

fun f(var x:int) () => x; // forces eager evaluation, copies argument to parameter
fun f( x: unit -> int ) => x(); // forces lazy evaluation

And for the original question: you can force the lazy interpretation by simply using a pointer:

var x = 1;
fun f()=> *&x;

It is nonsense to force the eager interpretation. If you want that you do this:

var x = 1;
val y = x;
var x = 2;
fun f() => y; // prints 1

I must say I am NOT HAPPY with these semantics, but that's what happens at the moment, and it seems quite logical. What is more troubling is this:

var g : unit -> int;

for var i = 0 upto 10 do
   val x = i;
   fun f()() => x;
   if i == 3 do
     g = f();
   done
done

The for loop is flat, no stack frame. Here 'x' is a value, but it isn't immutable!
If you can predict the value printed by g() you're doing better than me (and I designed the language :)

Unfortunately the optimisations obtained by these semantics are mandatory: we do not want to end up with the performance of, er, well, Haskell (no offense intended).

The moral of the story is: if your code depends on the answer to the OP's question, on your head be it! Write code where the semantics are determinate if you require that.
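
The loop case has a well-known analogue in Python, shown here only for comparison; it says nothing about what Felix prints. Python's semantics are determinate: a `for` loop does not create a new scope per iteration, so every closure sees the one loop variable, and a default argument is the usual way to force the eager reading. The function name build is made up for this sketch.

```python
def build():
    late, eager = [], []
    for i in range(3):
        late.append(lambda: i)       # closes over the single variable `i`
        eager.append(lambda i=i: i)  # default evaluated now: snapshot of `i`
    return late, eager


late, eager = build()
print([f() for f in late])   # [2, 2, 2]
print([f() for f in eager])  # [0, 1, 2]
```

Writing the snapshot explicitly is the Python counterpart of the advice above: make the intended interpretation part of the code.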

笑，眼淚并存 2025-01-01 12:38:22

Various languages have it in one of those two ways, or both.

The main distinction is what happens when you assign to the variable. Thus, as others have pointed out, in languages where variables are immutable

In languages that capture by value, one issue is how to deal with assignments to that variable. Since it's captured by value

As others have pointed out, many languages that have no explicit syntax for choosing between capture by value and capture by reference capture by reference, including: Python, Ruby, JavaScript, Scheme, Perl, Go, Smalltalk, etc.
As others have pointed out, ML languages (SML, OCaml) and Haskell can be said to capture by value, because their variables are immutable, so there is no real difference between the two, and capture by value is simpler
As others have pointed out, Java requires captured variables to be final, essentially for the purpose of capturing by value, because otherwise there would be confusion at having two separate mutable copies of a variable in the same scope; but when they are final, they can't be modified so there is no difference between having one copy and many copies
C++11 lets you choose whether to capture by value, or by reference. You list the variables to capture in brackets. Variables with & are by reference; otherwise, it's by value. = by itself captures all unlisted variables by value; & by itself captures all unlisted variables by reference. One has to be careful when capturing variables by reference, to not capture variables that go out of scope. Interestingly (unlike Java), it is possible to capture a variable by value, but have it be mutable, by using the mutable modifier on the anonymous function.
PHP, likewise, lets you choose when you declare the variables to capture. & indicates capture by reference; otherwise it's by value.
Blocks in Apple's development tools (for languages C, C++, and Objective-C; available in Mac OS X 10.6+ and iOS 4+) also allow you to choose. When you first create a block, it has access to captured variables by reference; however, such a block is not allowed to leave the scope (e.g. be returned) if it captures local variables since they will go out of scope. One must copy a block in order to have it leave the scope; captured variables are captured by value when the block is copied. It is also possible to indicate that a local variable is to be captured by reference by blocks when copied, by using the __block modifier when declaring that variable. This probably allocates it on the heap.