Why doesn't the C standard support nested functions?

Posted on 2024-08-03 17:26:17

It doesn't seem like it would be too hard to implement in assembly.

gcc also has a flag (-fnested-functions) to enable their use.

Comments (9)

栩栩如生 2024-08-10 17:26:17

It turns out they're not actually all that easy to implement properly.

Should an internal function have access to the containing scope's variables?
If not, there's no point in nesting it; just make it static (to limit visibility to the translation unit it's in) and add a comment saying "This is a helper function used only by myfunc()".

If you want access to the containing scope's variables, though, you're basically forcing it to generate closures (the alternative is restricting what you can do with nested functions enough to make them useless).
I think GCC actually handles this by generating (at runtime) a unique thunk, which GCC calls a trampoline, for every invocation of the containing function; the thunk sets up a context pointer and then calls the nested function. This ends up being a rather icky hack, and something that some perfectly reasonable implementations can't do (for example, on a system that forbids execution of writable memory, which a lot of modern OSs do for security reasons).
The only reasonable way to make it work in general is to force all function pointers to carry around a hidden context argument, and all functions to accept it (because in the general case you don't know when you call it whether it's a closure or an unclosed function). This is inappropriate to require in C for both technical and cultural reasons, so we're stuck with the option of either using explicit context pointers to fake a closure instead of nesting functions, or using a higher-level language that has the infrastructure needed to do it properly.
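
As a rough sketch of the "explicit context pointer" workaround mentioned above: the state a nested function would have captured goes into a struct, and a file-scope helper receives a pointer to it. All names here (`for_each`, `visit_fn`, `sum_above_ctx`, `add_if_above`, `sum_above`) are made up for illustration and do not come from any real library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: every callback receives an explicit context pointer. */
typedef void (*visit_fn)(int value, void *ctx);

/* Calls fn on each element, passing ctx through untouched. */
static void for_each(const int *items, size_t n, visit_fn fn, void *ctx) {
    assert(fn != NULL);
    for (size_t i = 0; i < n; i++) {
        fn(items[i], ctx);
    }
}

/* The state a nested function would have captured from its enclosing scope. */
struct sum_above_ctx {
    int threshold;
    long total;
};

/* File-scope helper standing in for the nested function. */
static void add_if_above(int value, void *ctx) {
    struct sum_above_ctx *c = ctx;
    if (value > c->threshold) {
        c->total += value;
    }
}

/* The "enclosing" function: its captured locals live in a struct in its own frame. */
long sum_above(const int *items, size_t n, int threshold) {
    struct sum_above_ctx c = { threshold, 0 };
    for_each(items, n, add_if_above, &c);
    return c.total;
}
```

The context lives in `sum_above`'s stack frame, so this only works while that frame is alive. It is the same lifetime restriction a nested function would have, just spelled out in the types.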

燃情 2024-08-10 17:26:17

I'd like to quote something from the BDFL (Guido van Rossum):

This is because nested function definitions don't have access to the
local variables of the surrounding block -- only to the globals of the
containing module. This is done so that lookup of globals doesn't
have to walk a chain of dictionaries -- as in C, there are just two
nested scopes: locals and globals (and beyond this, built-ins).
Therefore, nested functions have only a limited use. This was a
deliberate decision, based upon experience with languages allowing
arbitrary nesting such as Pascal and both Algols -- code with too
many nested scopes is about as readable as code with too many GOTOs.

Emphasis is mine.

I believe he was referring to nested scope in Python (and as David points out in the comments, this was from 1993, and Python does support fully nested functions now) -- but I think the statement still applies.

The other part of it could have been closures.

Suppose you have a function like this, in C-like code:

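/* GNU C nested-function syntax. foo returns a pointer to bar;
   what a call through it means once foo has returned is the open question. */
int (*foo(void))(void) {
    int x = 5;
    int bar(void) {
        x = x + 1;
        return x;
    }
    return &bar;
}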

If you use bar in a callback of some sort, what happens with x? This is well-defined in many newer, higher-level languages, but AFAIK there's no well-defined way to track that x in C -- does bar return 6 every time, or do successive calls to bar return incrementing values? That could have potentially added a whole new layer of complication to C's relatively simple definition.

南汐寒笙箫 2024-08-10 17:26:17

See C FAQ 20.24 and the GCC manual for potential problems:

If you try to call the nested function
through its address after the
containing function has exited, all
hell will break loose. If you try to
call it after a containing scope level
has exited, and if it refers to some
of the variables that are no longer in
scope, you may be lucky, but it's not
wise to take the risk. If, however,
the nested function does not refer to
anything that has gone out of scope,
you should be safe.

This is not really more severe than some other problematic parts of the C standard, so I'd say the reasons are mostly historical (C99 isn't really that different from K&R C feature-wise).

There are some cases where nested functions with lexical scope might be useful (consider a recursive inner function that needs no extra stack space for the variables in the outer scope and no static variable either), but hopefully you can trust the compiler to correctly inline such functions, i.e. a solution with a separate function will just be more verbose.

清旖 2024-08-10 17:26:17

Nested functions are a very delicate thing. Will you make them closures? If not, then they have no advantage over regular functions, since they can't access any local variables. If they are, then what do you do with stack-allocated variables? You have to put them somewhere else so that if you call the nested function later, the variable is still there. This means they'll take memory, so you have to allocate room for them on the heap. With no GC, this means that the programmer is now in charge of cleaning up the functions. Etc... C# does this, but it has a GC, and it's a considerably newer language than C.
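
To make the heap-allocation point concrete, here is a minimal sketch of the foo/bar counter from an earlier answer written out by hand in plain C. It assumes the captured variable is moved into a heap-allocated struct that the caller owns. The names `counter`, `counter_new`, `counter_next` and `counter_free` are invented for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical closure environment: the captured x lives on the heap,
   not in the stack frame of the function that created it. */
struct counter {
    int x;
};

/* Plays the role of foo(): allocates the environment.
   Returns NULL if the allocation fails. */
struct counter *counter_new(int start) {
    struct counter *c = malloc(sizeof *c);
    if (c != NULL) {
        c->x = start;
    }
    return c;
}

/* Plays the role of bar(): increments the captured variable and returns it. */
int counter_next(struct counter *c) {
    assert(c != NULL);
    c->x = c->x + 1;
    return c->x;
}

/* No garbage collector: the caller must release the environment explicitly. */
void counter_free(struct counter *c) {
    free(c);
}
```

With the state made explicit, successive `counter_next` calls on the same counter return increasing values, and forgetting `counter_free` is a leak. That is exactly the bookkeeping a closure feature would have to hide or hand to the programmer.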

月依秋水 2024-08-10 17:26:17

It also wouldn't be too hard to add member functions to structs, but they are not in the standard either.

Features are not added to the C standard based solely on whether or not they are easy to implement. It's a combination of many other factors, including the point in time at which the standard was written and what was common / practical then.

请止步禁区 2024-08-10 17:26:17

One more reason: it is not at all clear that nested functions are valuable. Twenty-odd years ago I used to do large-scale programming and maintenance in (VAX) Pascal. We had lots of old code that made heavy use of nested functions. At first, I thought this was way cool (compared to K&R C, which I had been working in before) and started doing it myself. After a while, I decided it was a disaster, and stopped.

The problem was that a function could have a great many variables in scope, counting the variables of all the functions in which it was nested. (Some old code had ten levels of nesting; five was quite common, and until I changed my mind I coded a few of the latter myself.) Variables in the nesting stack could have the same names, so that "inner" function local variables could mask variables of the same name in more "outer" functions. A local variable of a function, that in C-like languages is totally private to it, could be modified by a call to a nested function. The set of possible combinations of this jazz was near infinite, and a nightmare to comprehend when reading code.

So, I started calling this programming construct "semi-global variables" instead of "nested functions", and telling other people working on the code that the only thing worse than a global variable was a semi-global variable, and please do not create any more. I would have banned it from the language, if I could. Sadly, there was no such option for the compiler...

单身狗的梦 2024-08-10 17:26:17

ANSI C has been established for 20 years. Perhaps between 1983 and 1989 the committee discussed it in light of the state of compiler technology at the time, but if they did, their reasoning is lost in the dim and distant past.

对不⑦ 2024-08-10 17:26:17

I disagree with Dave Vandervies.

Defining a nested function is much better coding style than defining it in global scope, making it static and adding a comment saying "This is a helper function used only by myfunc()".

What if you needed a helper function for this helper function? Would you add a comment "This is a helper function for the first helper function used only by myfunc"? Where do you get the names for all those functions without polluting the namespace completely?

How confusing can code get?

But of course, there is the problem of how to deal with closures, i.e. returning a pointer to a function that has access to variables defined in the function from which it is returned.

慕烟庭风 2024-08-10 17:26:17

Either you don't allow references to local variables of the containing function in the contained one, and the nesting is just a scoping feature without much use, or you do. If you do, it is not such a simple feature: you have to be able to call a nested function from another one while accessing the correct data, and you also have to take recursive calls into account. That's not impossible -- the techniques for it are well known and were well mastered when C was designed (Algol 60 already had the feature). But it complicates the run-time organization and the compiler, and prevents a simple mapping to assembly language (a function pointer must carry information about the enclosing context; there are alternatives, such as the one gcc uses). It was out of scope for the system implementation language C was designed to be.
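
For illustration only, here is one way the "function pointer must carry information" idea could look if written by hand in C: a two-word "fat" pointer pairing a code address with an environment pointer. This is a sketch under my own assumptions, not how any particular compiler lays things out, and the names `closure`, `closure_call`, `twice`, `add_offset` and `demo` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "fat" function pointer: code address plus the environment it runs in. */
struct closure {
    int (*fn)(void *env, int arg);
    void *env;
};

/* Every call goes through the pair, whether or not the callee uses env. */
int closure_call(struct closure c, int arg) {
    assert(c.fn != NULL);
    return c.fn(c.env, arg);
}

/* An ordinary function must still accept the environment slot, and ignores it. */
static int twice(void *env, int arg) {
    (void)env;
    return 2 * arg;
}

/* A function that reads a variable belonging to its creator's frame. */
static int add_offset(void *env, int arg) {
    const int *offset = env;
    return arg + *offset;
}

/* Builds both kinds of closure and calls them the same way. */
int demo(int offset, int arg) {
    struct closure plain = { twice, NULL };
    struct closure bound = { add_offset, &offset };
    return closure_call(plain, arg) + closure_call(bound, arg);
}
```

The cost is visible here: a pointer to a function is no longer a bare code address, and every callee pays for an argument it may not need, which is the departure from a simple mapping to assembly described above.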
