Why does Pascal forbid modification of the counter inside the for block?

Posted on 2024-08-17 23:27:24

Is it because Pascal was designed to be so, or are there any tradeoffs?

Or what are the pros and cons to forbid or not forbid modification of the counter inside a for-block? IMHO, there is little use to modify the counter inside a for-block.

EDIT:
Could you provide one example where we need to modify the counter inside the for-block?

It is hard to choose between wallyk's answer and cartoonfox's answer, since both are so good. Cartoonfox analyses the problem from the language side, while wallyk analyses it from the historical and real-world side. Anyway, thanks for all of your answers, and special thanks to wallyk.

Comments (7)

没有心的人 2024-08-24 23:27:24

In programming language theory (and in computability theory) WHILE and FOR loops have different theoretical properties:

  • a WHILE loop may never terminate (the expression could just be TRUE)
  • the finite number of times a FOR loop is to execute is supposed to be known before it starts executing. You're supposed to know that FOR loops always terminate.

The FOR loop present in C doesn't technically count as a FOR loop because you don't necessarily know how many times the loop will iterate before executing it. (i.e. you can hack the loop counter to run forever)

The class of problems you can solve with WHILE loops is strictly larger than the class you can solve with the strict FOR loop found in Pascal.

Pascal is designed this way so that students have two different loop constructs with different computational properties. (If you implemented FOR the C-way, the FOR loop would just be an alternative syntax for while...)

In strictly theoretical terms, you shouldn't ever need to modify the counter within a for loop. If you could get away with it, you'd just have an alternative syntax for a WHILE loop.
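To make the distinction concrete, here is a minimal C sketch (not part of the original answer) of a loop helper with roughly Pascal-like semantics: the trip count is computed once, before the first iteration, and the body only ever sees a copy of the counter. The names `bounded_for`, `loop_body`, and `add_to_sum` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: the body gets the counter by value,
   so nothing it does can change how many iterations remain. */
typedef void (*loop_body)(int i, void *ctx);

/* Roughly "for i := lo to hi do body(i)": the trip count is fixed
   before the first iteration. Returns that trip count. */
static long long bounded_for(int lo, int hi, loop_body body, void *ctx)
{
    assert(body != NULL);
    long long trips = (hi >= lo) ? (long long)hi - lo + 1 : 0;
    for (long long done = 0; done < trips; done++)
        body((int)(lo + done), ctx);
    return trips;
}

/* Example body: accumulate the counter values into *ctx. */
static void add_to_sum(int i, void *ctx)
{
    *(long long *)ctx += i;
}
```

Calling `bounded_for(1, 42, add_to_sum, &sum)` always runs the body 42 times, whatever the body does. A plain C `for` gives no such guarantee, which is why it behaves like a `while` in disguise.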

You can find out more about "while loop computability" and "for loop computability" in these CS lecture notes: http://www-compsci.swan.ac.uk/~csjvt/JVTTeaching/TPL.html

Another such property, by the way, is that the loop variable is undefined after the for loop. This also makes optimization easier.

心安伴我暖 2024-08-24 23:27:24

Pascal was first implemented for the CDC Cyber, a 1960s and 1970s mainframe which, like many CPUs today, had excellent sequential instruction execution performance but also a significant performance penalty for branches. This and other characteristics of the Cyber architecture probably heavily influenced Pascal's design of for loops.

The Short Answer is that allowing assignment to a loop variable would require extra guard code and would interfere with optimization of loop variables, which could ordinarily be handled well in 18-bit index registers. In those days, software performance was highly valued due to the expense of the hardware and the inability to speed it up any other way.

Long Answer

The Control Data Corporation 6600 family, which includes the Cyber, is a RISC architecture using 60-bit central memory words referenced by 18-bit addresses. Some models had an (expensive, therefore uncommon) option, the Compare-Move Unit (CMU), for directly addressing 6-bit character fields, but otherwise there was no support for "bytes" of any sort. Since the CMU could not be counted on in general, most Cyber code was generated for its absence. Ten characters per word was the usual data format until support for lowercase characters gave way to a tentative 12-bit character representation.

Instructions are 15 bits or 30 bits long, except for the CMU instructions, which are effectively 60 bits long. So a word holds up to four 15-bit instructions, or two 30-bit instructions, or a pair of 15-bit instructions and one 30-bit instruction. 30-bit instructions cannot span words. Since branch destinations may only reference words, jump targets are word-aligned.

The architecture has no stack. In fact, the procedure call instruction RJ is intrinsically non-re-entrant. RJ modifies the first word of the called procedure by writing a jump to the next instruction after where the RJ instruction is. Called procedures return to the caller by jumping to their beginning, which is reserved for return linkage. Procedures begin at the second word. To implement recursion, most compilers made use of a helper function.

The register file has eight instances each of three kinds of register, A0..A7 for address manipulation, B0..B7 for indexing, and X0..X7 for general arithmetic. A and B registers are 18 bits; X registers are 60 bits. Setting A1 through A5 has the side effect of loading the corresponding X1 through X5 register with the contents of the loaded address. Setting A6 or A7 writes the corresponding X6 or X7 contents to the address loaded into the A register. A0 and X0 are not connected. The B registers can be used in virtually every instruction as a value to add or subtract from any other A, B, or X register. Hence they are great for small counters.

For efficient code, a B register is used for loop variables since direct comparison instructions can be used on them (B2 < 100, etc.); comparisons with X registers are limited to relations to zero, so comparing an X register to 100, say, requires subtracting 100 and testing the result for less than zero, etc. If an assignment to the loop variable were allowed, a 60-bit value would have to be range-checked before assignment to the B register. This is a real hassle. Herr Wirth probably figured that the hassle and the inefficiency weren't worth the utility: the programmer can always use a while or repeat...until loop in that situation.
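As a rough illustration of the guard code described above, here is a small C sketch. It is not CDC code. The 18-bit limit comes from the text; the exact signed range (ones' complement, so plus or minus 2^17 - 1) is my assumption, and `assign_to_index` and `INDEX_MAX` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: an 18-bit ones' complement register holds
   -(2^17 - 1) .. 2^17 - 1, i.e. -131071 .. 131071. */
#define INDEX_MAX ((int64_t)131071)

/* Hypothetical guard: copy a wide value into a narrow index register
   only if it fits; report failure and leave the register alone otherwise. */
static bool assign_to_index(int64_t wide, int32_t *index_reg)
{
    assert(index_reg != NULL);
    if (wide < -INDEX_MAX || wide > INDEX_MAX)
        return false;               /* would not fit in 18 bits */
    *index_reg = (int32_t)wide;
    return true;
}
```

A check like this would be needed at every assignment to the counter inside the loop body. When the counter can only be changed by the loop itself, the compiler has to validate the bounds just once, on entry.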

Additional weirdness

Several unique-to-Pascal language features relate directly to aspects of the Cyber:

  • the packed keyword: either a single "character" consumes a 60-bit word, or characters are packed ten per word.
  • the (unusual) alfa type: packed array [1..10] of char
  • intrinsic procedures pack() and unpack() to deal with packed characters. These perform no transformation on modern architectures, only type conversion.
  • the weirdness of text files vs. file of char
  • no explicit newline character. Record management was explicitly invoked with writeln
  • While set of char was very useful on CDCs, it was unsupported on many subsequent 8 bit machines due to its excess memory use (32-byte variables/constants for 8-bit ASCII). In contrast, a single Cyber word could manage the native 62-character set by omitting newline and something else.
  • full expression evaluation (versus shortcuts). These were implemented not by jumping and setting one or zero (as most code generators do today), but by using CPU instructions implementing Boolean arithmetic.
幽蝶幻影 2024-08-24 23:27:24

Pascal was originally designed as a teaching language to encourage block-structured programming. Kernighan (the K of K&R) wrote an (understandably biased) essay on Pascal's limitations, Why Pascal is Not My Favorite Programming Language.

The prohibition on modifying what Pascal calls the control variable of a for loop, combined with the lack of a break statement, means that it is possible to know how many times the loop body is executed without studying its contents.

Having no break statement, and not being able to use the control variable after the loop terminates, is more of a restriction than not being able to modify the control variable inside the loop, as it prevents some string and array processing algorithms from being written in the "obvious" way.

These and other differences between Pascal and C reflect the different philosophies with which they were first designed: Pascal to enforce a concept of "correct" design, C to permit more or less anything, no matter how dangerous.

(Note: Delphi does, however, have a Break statement, as well as Continue, and Exit, which is like return in C.)

Clearly we never need to be able to modify the control variable in a for loop, because we can always rewrite using a while loop. An example in C where such behaviour is used can be found in K&R section 7.3, where a simple version of printf() is introduced. The code that handles '%' sequences within a format string fmt is:

for (p = fmt; *p; p++) {
    if (*p != '%') {
        putchar(*p);
        continue;
    }
    switch (*++p) {
    case 'd':
        /* handle integers */
        break;
    case 'f':
        /* handle floats */
        break;
    case 's':
        /* handle strings */
        break;
    default:
        putchar(*p);
        break;
    }
}

Although this uses a pointer as the loop variable, it could equally have been written with an integer index into the string:

for (i = 0; i < strlen(fmt); i++) {
    if (fmt[i] != '%') {
        putchar(fmt[i]);
        continue;
    }
    switch (fmt[++i]) {
    case 'd':
        /* handle integers */
        break;
    case 'f':
        /* handle floats */
        break;
    case 's':
        /* handle strings */
        break;
    default:
        putchar(fmt[i]);
        break;
    }
}
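For comparison, here is a sketch of how the same kind of scan looks as a while loop, where every step of the index is explicit in the body. This is not K&R code: `count_directives` is a hypothetical helper that only counts conversion directives instead of printing, and it treats "%%" as a literal percent sign and ignores a lone trailing '%'.

```c
#include <assert.h>
#include <stddef.h>

/* Counts directives such as %d, %f, %s in fmt. */
static size_t count_directives(const char *fmt)
{
    assert(fmt != NULL);
    size_t i = 0, count = 0;
    while (fmt[i] != '\0') {
        if (fmt[i] == '%' && fmt[i + 1] != '\0') {
            if (fmt[i + 1] != '%')
                count++;            /* a real directive, e.g. %d */
            i += 2;                 /* consume '%' and the character after it */
        } else {
            i += 1;                 /* ordinary character, or a trailing '%' */
        }
    }
    return count;
}
```

For example, `count_directives("x=%d y=%s")` returns 2. The variable-size steps that a C for loop hides in `++i` are spelled out here, which is the form a Pascal programmer would have to use.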
南渊 2024-08-24 23:27:24

It can make some optimizations (loop unrolling for instance) easier: no need for complicated static analysis to determine if the loop behavior is predictable or not.
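A hand-written sketch of what unrolling by four looks like, in C. The function name `sum_unrolled` is made up, and a real compiler would do this transformation itself; the point is that it relies on the trip count being known before the loop starts.

```c
#include <assert.h>
#include <stddef.h>

/* Sums n ints, processing four elements per iteration where possible. */
static long long sum_unrolled(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long long total = 0;
    size_t i = 0;
    size_t blocks = n / 4;          /* trip count fixed before the loop starts */
    for (size_t b = 0; b < blocks; b++, i += 4)
        total += (long long)a[i] + a[i + 1] + a[i + 2] + a[i + 3];
    for (; i < n; i++)              /* leftover 0..3 elements */
        total += a[i];
    return total;
}
```

If the body were allowed to change `i` or `n`, splitting the work into `blocks` plus a remainder would no longer be valid without further analysis.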

单身情人 2024-08-24 23:27:24

From For loop

In some languages (not C or C++) the loop variable is immutable within the scope of the loop body, with any attempt to modify its value being regarded as a semantic error. Such modifications are sometimes a consequence of a programmer error, which can be very difficult to identify once made. However only overt changes are likely to be detected by the compiler. Situations where the address of the loop variable is passed as an argument to a subroutine make it very difficult to check, because the routine's behaviour is in general unknowable to the compiler.

So this seems to be to help you not burn your hand later on.
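The "address passed to a subroutine" case from the quote is easy to reproduce in C. A small sketch, with hypothetical names `skip_ahead` and `iterations_with_alias`:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical subroutine that receives the counter's address. */
static void skip_ahead(int *counter)
{
    assert(counter != 0);
    *counter += 1;
}

/* Returns how many times the loop body actually ran. */
static int iterations_with_alias(int n)
{
    assert(n < INT_MAX);            /* keeps the increments from overflowing */
    int runs = 0;
    for (int i = 0; i < n; i++) {
        runs++;
        skip_ahead(&i);             /* nothing in the loop header hints at this */
    }
    return runs;
}
```

The header reads "n iterations", yet `iterations_with_alias(10)` returns 5, and you only find out why by reading `skip_ahead`.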

执手闯天涯 2024-08-24 23:27:24

Disclaimer: It has been decades since I last did PASCAL, so my syntax may not be exactly correct.

You have to remember that PASCAL is Niklaus Wirth's child, and Wirth cared very strongly about reliability and understandability when he designed PASCAL (and all of its successors).

Consider the following code fragment:

FOR I := 1 TO 42 (* THE UNIVERSAL ANSWER *) DO FOO(I);

Without looking at procedure FOO, answer these questions: Does this loop ever end? How do you know? How many times is procedure FOO called in the loop? How do you know?

PASCAL forbids modifying the index variable in the loop body so that it is POSSIBLE to know the answers to those questions, and know that the answers won't change when and if procedure FOO changes.

不甘平庸 2024-08-24 23:27:24

It's probably safe to conclude that Pascal was designed to prevent modification of a for loop index inside the loop. It's worth noting that Pascal is by no means the only language which prevents programmers from doing this; Fortran is another example.

There are two compelling reasons for designing a language that way:

  1. Programs, specifically the for loops in them, are easier to understand and therefore easier to write and to modify and to verify.
  2. Loops are easier to optimise if the compiler knows that the trip count through a loop is established before entry to the loop and invariant thereafter.

For many algorithms this behaviour is the required behaviour; updating all the elements in an array, for example. If memory serves, Pascal also provides while loops and repeat-until loops. I guess that most algorithms which are implemented in C-style languages with modifications to the loop index variable, or with breaks out of the loop, could just as easily be implemented with these alternative forms of loop.
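As one example of that kind of rewrite, here is a linear search that a C programmer would often write as a for loop with a break, expressed instead as a while loop whose condition carries the early exit. It is a sketch in C rather than Pascal, and `find_first` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the first element equal to key, or -1 if absent. */
static int find_first(const int *a, int n, int key)
{
    assert(n >= 0 && (a != NULL || n == 0));
    int i = 0;
    while (i < n && a[i] != key)    /* the exit test replaces the break */
        i++;
    return (i < n) ? i : -1;
}
```

Note that this relies on C's short-circuit `&&`. Standard Pascal does not guarantee short-circuit evaluation, so a Pascal version would typically use a Boolean flag or a sentinel element instead.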

I've scratched my head and failed to find a compelling reason for allowing the modification of a loop index variable inside the loop, but then I've always regarded doing so as bad design, and the selection of the right loop construct as an element of good design.

Regards

Mark
