Why does Pascal forbid modification of the counter inside the for block?
Is it because Pascal was designed to be so, or are there any tradeoffs?
Or what are the pros and cons of forbidding, or not forbidding, modification of the counter inside a for-block? IMHO, there is little use in modifying the counter inside a for-block.
EDIT:
Could you provide one example where we need to modify the counter inside the for-block?
It is hard to choose between wallyk's answer and cartoonfox's answer, since both answers are so good. Cartoonfox analyses the problem from the language aspect, while wallyk analyses it from the historical and real-world aspect. Anyway, thanks for all of your answers, and I'd like to give my special thanks to wallyk.
In programming language theory (and in computability theory) WHILE and FOR loops have different theoretical properties:
The FOR loop present in C doesn't technically count as a FOR loop because you don't necessarily know how many times the loop will iterate before executing it. (i.e. you can hack the loop counter to run forever)
The class of problems you can solve with WHILE loops is strictly larger than the class you can solve with the strict FOR loop found in Pascal.
Pascal is designed this way so that students have two different loop constructs with different computational properties. (If you implemented FOR the C-way, the FOR loop would just be an alternative syntax for while...)
In strictly theoretical terms, you shouldn't ever need to modify the counter within a for loop. If you could get away with it, you'd just have an alternative syntax for a WHILE loop.
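The distinction can be sketched in C, under the assumption that a "strict" FOR loop is modelled by fixing the limit up front and never assigning to the counter in the body. The function names `bounded_trips` and `skipping_trips` are hypothetical, chosen only for this illustration.

```c
/* Pascal-style FOR: the limit is evaluated once and the body never
   assigns to i, so the trip count is known before the loop starts. */
int bounded_trips(int lo, int hi)
{
    const int last = hi;   /* limit fixed before the first iteration */
    int trips = 0;
    for (int i = lo; i <= last; i++) {
        trips++;           /* the body may read i but never writes it */
    }
    return trips;          /* hi - lo + 1 when hi >= lo, otherwise 0 */
}

/* C-style for: the body assigns to the counter, so the trip count can
   only be found by running (or analysing) the body. */
int skipping_trips(int lo, int hi)
{
    int trips = 0;
    for (int i = lo; i <= hi; i++) {
        trips++;
        if (i % 2 == 0) {
            i++;           /* skip the value that would come next */
        }
    }
    return trips;
}
```

With the same bounds, the first function's result follows from `lo` and `hi` alone, while the second depends on what the body does to `i`, which is exactly the behaviour of a WHILE loop.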
You can find out more about "while loop computability" and "for loop computability" in these CS lecture notes: http://www-compsci.swan.ac.uk/~csjvt/JVTTeaching/TPL.html
Another such property, by the way, is that the loop variable is undefined after the for loop. This also makes optimization easier.
Pascal was first implemented for the CDC Cyber, a 1960s and 1970s mainframe, which, like many CPUs today, had excellent sequential instruction execution performance but also a significant performance penalty for branches. This and other characteristics of the Cyber architecture probably heavily influenced Pascal's design of for loops.

The Short Answer is that allowing assignment to a loop variable would require extra guard code and would mess up optimization of loop variables, which could ordinarily be handled well in 18-bit index registers. In those days, software performance was highly valued due to the expense of the hardware and the inability to speed it up any other way.
Long Answer
The Control Data Corporation 6600 family, which includes the Cyber, is a RISC architecture using 60-bit central memory words referenced by 18-bit addresses. Some models had an (expensive, therefore uncommon) option, the Compare-Move Unit (CMU), for directly addressing 6-bit character fields, but otherwise there was no support for "bytes" of any sort. Since the CMU could not be counted on in general, most Cyber code was generated for its absence. Ten characters per word was the usual data format until support for lowercase characters gave way to a tentative 12-bit character representation.
Instructions are 15 bits or 30 bits long, except for the CMU instructions, which are effectively 60 bits long. So each word holds up to four 15-bit instructions, or two 30-bit instructions, or a pair of 15-bit instructions and one 30-bit instruction. 30-bit instructions cannot span words. Since branch destinations may only reference words, jump targets are word-aligned.
The architecture has no stack. In fact, the procedure call instruction RJ is intrinsically non-re-entrant. RJ modifies the first word of the called procedure by writing a jump to the instruction following the RJ. Called procedures return to the caller by jumping to their beginning, which is reserved for return linkage. Procedures begin at the second word. To implement recursion, most compilers made use of a helper function.

The register file has eight instances each of three kinds of register: A0..A7 for address manipulation, B0..B7 for indexing, and X0..X7 for general arithmetic. A and B registers are 18 bits; X registers are 60 bits. Setting A1 through A5 has the side effect of loading the corresponding X1 through X5 register with the contents of the loaded address. Setting A6 or A7 writes the corresponding X6 or X7 contents to the address loaded into the A register. A0 and X0 are not connected. The B registers can be used in virtually every instruction as a value to add to or subtract from any other A, B, or X register. Hence they are great for small counters.
For efficient code, a B register is used for loop variables since direct comparison instructions can be used on them (B2 < 100, etc.); comparisons with X registers are limited to relations to zero, so comparing an X register to 100, say, requires subtracting 100 and testing the result for less than zero, etc. If an assignment to the loop variable were allowed, a 60-bit value would have to be range-checked before assignment to the B register. This is a real hassle. Herr Wirth probably figured that the utility wasn't worth either the hassle or the inefficiency: the programmer can always use a while or repeat ... until loop in that situation.

Additional weirdness
Several unique-to-Pascal language features relate directly to aspects of the Cyber:
The packed keyword: either a single "character" consumes a 60-bit word, or it is packed ten characters per word.
The alfa type: packed array [1..10] of char.
pack() and unpack(), to deal with packed characters. These perform no transformation on modern architectures, only type conversion.
The weirdness of text files vs. file of char.
writeln, which is invoked explicitly.
set of char was very useful on CDCs, but it was unsupported on many subsequent 8-bit machines due to its excess memory use (32-byte variables/constants for 8-bit ASCII). In contrast, a single Cyber word could manage the native 62-character set by omitting newline and something else.
Pascal was originally designed as a teaching language to encourage block-structured programming. Kernighan (the K of K&R) wrote an (understandably biased) essay on Pascal's limitations, Why Pascal is Not My Favorite Programming Language.
The prohibition on modifying what Pascal calls the control variable of a for loop, combined with the lack of a break statement, means that it is possible to know how many times the loop body is executed without studying its contents.

Lacking a break statement and not being able to use the control variable after the loop terminates is more of a restriction than not being able to modify the control variable inside the loop, as it prevents some string and array processing algorithms from being written in the "obvious" way.

These and other differences between Pascal and C reflect the different philosophies with which they were first designed: Pascal to enforce a concept of "correct" design, C to permit more or less anything, no matter how dangerous.
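As a rough illustration of the "obvious" way, here is a linear search sketched in C. The first version relies on break and on reading the index after the loop; the second shows roughly what the same search looks like when neither is available, using a flag and a while loop. Both function names are hypothetical and the array contents are arbitrary.

```c
/* The "obvious" C idiom: leave early with break, then inspect i after
   the loop to learn whether and where the key was found. */
int find_with_break(const int *a, int n, int key)
{
    int i;
    for (i = 0; i < n; i++) {
        if (a[i] == key) {
            break;
        }
    }
    return (i < n) ? i : -1;
}

/* The same search without break and without reading a for-loop control
   variable after the loop: a while loop driven by an explicit flag. */
int find_with_flag(const int *a, int n, int key)
{
    int i = 0;
    int found = 0;
    while (i < n && !found) {
        if (a[i] == key) {
            found = 1;     /* stop on the next test, keep i where it is */
        } else {
            i++;
        }
    }
    return found ? i : -1;
}
```

Both return the index of the first match or -1; the second is the shape a search tends to take when the for loop is kept strict.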
(Note: Delphi does have a Break statement, however, as well as Continue, and Exit, which is like return in C.)

Clearly we never need to be able to modify the control variable in a for loop, because we can always rewrite using a while loop. An example in C where such behaviour is used can be found in K&R section 7.3, where a simple version of printf() is introduced. The code that handles '%' sequences within a format string fmt is:

Although this uses a pointer as the loop variable, it could equally have been written with an integer index into the string:
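A minimal sketch of the integer-index form, in the spirit of that K&R loop but not a copy of it: `count_conversions` is a hypothetical helper that only counts conversion specifications instead of printing anything. The point is the extra `i++` inside the body, which consumes the character after each '%'.

```c
/* Count conversion specifications such as %d or %s in a format string.
   "%%" is treated as a literal percent sign and is not counted. */
int count_conversions(const char *fmt)
{
    int count = 0;
    for (int i = 0; fmt[i] != '\0'; i++) {
        if (fmt[i] != '%') {
            continue;          /* ordinary character */
        }
        i++;                   /* modify the counter: step onto the
                                  character that follows the '%' */
        if (fmt[i] == '\0') {
            break;             /* lone '%' at the end of the string */
        }
        if (fmt[i] != '%') {
            count++;           /* a real conversion such as %d */
        }
    }
    return count;
}
```

Rewriting this with a while loop is straightforward, which is the sense in which modifying the control variable is never strictly needed.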
It can make some optimizations (loop unrolling for instance) easier: no need for complicated static analysis to determine if the loop behavior is predictable or not.
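To make the unrolling point concrete, here is a hand-unrolled sum in C next to the plain loop. It assumes a non-negative `n` and a loop whose trip count depends only on `n`; the function names are made up for this sketch, and a real compiler would do this transformation on its own when it can prove the counter is not touched in the body.

```c
/* Plain loop: one addition and one counter test per element. */
long sum_rolled(const int *a, int n)
{
    long s = 0;
    for (int i = 0; i < n; i++) {
        s += a[i];
    }
    return s;
}

/* Unrolled by four: valid only because the trip count is fixed by n
   and nothing in the body changes i. */
long sum_unrolled4(const int *a, int n)
{
    long s = 0;
    int i = 0;
    int main_end = n - (n % 4);    /* largest multiple of 4 not above n */
    for (; i < main_end; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    for (; i < n; i++) {           /* leftover 0 to 3 elements */
        s += a[i];
    }
    return s;
}
```

If the body were allowed to assign to `i`, the split into a main loop and a remainder loop could not be computed before the loop starts.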
From For loop
So this seems to be to help you not burn your hand later on.
Disclaimer: It has been decades since I last did PASCAL, so my syntax may not be exactly correct.
You have to remember that PASCAL is Niklaus Wirth's child, and Wirth cared very strongly about reliability and understandability when he designed PASCAL (and all of its successors).
Consider the following code fragment:
Without looking at procedure FOO, answer these questions: Does this loop ever end? How do you know? How many times is procedure FOO called in the loop? How do you know?
PASCAL forbids modifying the index variable in the loop body so that it is POSSIBLE to know the answers to those questions, and know that the answers won't change when and if procedure FOO changes.
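The same questions can be posed with a C analogue, sketched here under the assumption that the called procedure can reach the caller's counter through a pointer. All names (`foo_by_reference`, `foo_by_value`, `calls`) are hypothetical stand-ins for FOO, not a reconstruction of the fragment above.

```c
static int calls;                  /* how many times "FOO" has run */

/* A FOO that can reach the caller's counter and quietly moves it. */
void foo_by_reference(int *i)
{
    calls++;
    if (*i == 5) {
        *i = 8;                    /* the caller's loop resumes at 9 */
    }
}

/* A FOO that only receives a copy, as with a protected control variable. */
void foo_by_value(int i)
{
    calls++;
    if (i == 5) {
        i = 8;                     /* changes the local copy only */
    }
    (void)i;
}

int count_calls_by_reference(void)
{
    calls = 0;
    for (int i = 1; i <= 10; i++) {
        foo_by_reference(&i);      /* trip count now depends on FOO */
    }
    return calls;
}

int count_calls_by_value(void)
{
    calls = 0;
    for (int i = 1; i <= 10; i++) {
        foo_by_value(i);           /* always exactly ten calls */
    }
    return calls;
}
```

In the by-value version the number of calls can be read off the loop header; in the by-reference version it cannot be known without reading the body of the callee.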
It's probably safe to conclude that Pascal was designed to prevent modification of a for loop index inside the loop. It's worth noting that Pascal is by no means the only language which prevents programmers doing this; Fortran is another example.
There are two compelling reasons for designing a language that way:
For many algorithms this behaviour is the required behaviour; updating all the elements in an array, for example. If memory serves, Pascal also provides while-do loops and repeat-until loops. Most algorithms, I guess, that are implemented in C-style languages with modifications to the loop index variable or breaks out of the loop could just as easily be implemented with these alternative forms of loop.
I've scratched my head and failed to find a compelling reason for allowing the modification of a loop index variable inside the loop, but then I've always regarded doing so as bad design, and the selection of the right loop construct as an element of good design.
Regards
Mark