Can Perl be "statically" parsed?

Posted on 2024-08-02 11:50:40


An article called "Perl cannot be parsed, a formal proof" is doing the rounds. So, does Perl decide the meaning of its parsed code at "run-time" or "compile-time"?

In some discussions I've read, I get the impression the arguments stem from imprecise terminology, so please try to define your technical terms in your answer. I have deliberately not defined "run-time", "statically" or "parsed" so that I can get perspectives from people who perhaps define those terms differently to me.

Edit:

This isn't about static analysis. It's a theoretical question about Perl's behaviour.


Comments (5)

踏月而来 2024-08-09 11:50:40


Perl has a well-defined "compile time" phase, which is followed by a well-defined "runtime" phase. However, there are ways of transitioning from one to the other. Many dynamic languages have eval constructs that allow compilation of new code during the runtime phase; in Perl the inverse is possible as well -- and common. BEGIN blocks (and the implicit BEGIN block caused by use) invoke a temporary runtime phase during compile-time. A BEGIN block is executed as soon as it's compiled, instead of waiting for the rest of the compilation unit (i.e. current file or current eval) to compile. Since BEGINs run before the code that follows them is compiled, they can influence the compilation of the following code in practically any way (although in practice the main things they do are to import or define subroutines, or to enable strictness or warnings).
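To make the ordering concrete, here is a minimal sketch (the @order array is just an illustrative name). The BEGIN block appears second in the source but runs first, because it executes as soon as it has been compiled, while the plain statements wait for the runtime phase.

```perl
use strict;
use warnings;

our @order;    # records the order in which the pieces actually run

push @order, 'runtime statement 1';     # runs in the runtime phase

BEGIN { push @order, 'BEGIN block' }    # runs as soon as this block is compiled

push @order, 'runtime statement 2';     # runs in the runtime phase

print "$_\n" for @order;
```

Note that `our @order;` only declares the variable, so reaching it at runtime does not wipe out what the BEGIN block stored earlier.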

A use Foo; is basically equivalent to BEGIN { require Foo; Foo->import(); }, with require being (like eval STRING) one of the ways to invoke compile-time from runtime, meaning that we're now within compile-time within runtime within compile-time, and the whole thing is recursive.
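As a small sketch of that equivalence, using the core module List::Util as a stand-in for Foo: the spelled-out BEGIN form below imports sum at compile time, so the parser already knows the name when it reaches the call and accepts it without parentheses.

```perl
use strict;
use warnings;

# Spelled-out form of: use List::Util qw(sum);
BEGIN {
    require List::Util;           # load and compile the module right now
    List::Util->import('sum');    # install sum() into the current package
}

# The import has already happened, so "sum" is a known subroutine here
# and the call parses as a list operator.
my $total = sum 1, 2, 3;
print "$total\n";
```

If the require and import were done at runtime instead (without the BEGIN wrapper), the last statement would be compiled before sum was known, and the paren-less call would not parse.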

Anyway, what it boils down to for the decidability of parsing Perl is that since the compilation of one bit of code can be influenced by the execution of a preceding piece of code (which can in theory do anything), we've got ourselves a halting-problem type situation; the only way to correctly parse a given Perl file in general is by executing it.
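A sketch of how this plays out, in the spirit of the example from the article mentioned in the question. The helper outcome_for and the Sandbox package names are made up for illustration. The same source line is compiled twice; the only difference is how a preceding BEGIN block declared whatever.

```perl
use strict;
use warnings;

my $counter = 0;

# Compile and run the same source line after a BEGIN block has declared
# whatever() in one of two ways. Each attempt gets a fresh package so the
# two declarations do not collide.
sub outcome_for {
    my ($declaration) = @_;
    my $package = 'Sandbox' . ++$counter;
    my $source  = join "\n",
        "package $package;",
        "BEGIN { $declaration }",
        'local $_ = "";',
        'my $x = whatever / 25 ; # / ; return "regex match, then this statement";',
        'return "division, rest of the line was a comment";';
    my $result = eval $source;
    die $@ if $@;
    return $result;
}

my $nullary = outcome_for('sub whatever() { 50 }');   # () prototype: takes no arguments
my $listop  = outcome_for('sub whatever   { 50 }');   # no prototype: takes a list

print "with ():    $nullary\n";
print "without (): $listop\n";
```

With the () prototype, the slash is a division operator and everything after the # is a comment. Without it, the slash starts a regex that ends at the second slash, and the text after it is live code. If the BEGIN block chose between the two declarations based on an arbitrary computation, you would have to run that computation to know how the line parses.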

_失温 2024-08-09 11:50:40


Perl has BEGIN blocks, which run user Perl code at compile-time. This code can affect the meaning of other code to be compiled, thus making it "impossible" to parse Perl.

For example, the code:

sub foo { return "OH HAI" }

is "really":

BEGIN {
    no strict 'refs';    # the glob is looked up by name
    *{ __PACKAGE__ . "::foo" } = sub { return "OH HAI" };
}

That means that someone could write Perl like:

BEGIN {
    no strict 'refs';    # the glob is looked up by name
    print "Hi user, type the code for foo: ";
    my $code = <STDIN>;
    *{ __PACKAGE__ . "::foo" } = eval $code;
}

Obviously, no static analysis tool can guess what code the user is going to type in here. (And if the user says sub ($) {} instead of sub {}, it will even affect how calls to foo are interpreted throughout the rest of the program, potentially throwing off the parsing.)
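To illustrate the parenthetical, here is a rough sketch where two hard-coded strings stand in for what a user might type (the names foo_unary and foo_list are hypothetical). Both subs have the same body; only the ($) prototype differs, and that alone changes how later calls are grouped.

```perl
use strict;
use warnings;

BEGIN {
    no strict 'refs';
    # Stand-ins for text typed by a user at compile time.
    my %typed = (
        foo_unary => 'sub ($) { "[@_]" }',   # prototype: exactly one scalar argument
        foo_list  => 'sub     { "[@_]" }',   # no prototype: takes a whole list
    );
    for my $name (keys %typed) {
        *{ __PACKAGE__ . "::$name" } = eval $typed{$name};
    }
}

my @unary = (foo_unary 1, 2);   # parsed as (foo_unary(1), 2)
my @list  = (foo_list 1, 2);    # parsed as (foo_list(1, 2))

print scalar(@unary), " element(s): @unary\n";
print scalar(@list),  " element(s): @list\n";
```

The two call sites are textually identical apart from the sub name, yet one yields two elements and the other yields one, and nothing in the call site itself tells you which.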

The good news is that the impossible cases are very corner-casey; technically possible, but almost certainly useless in real code. So if you are writing a static analysis tool, this will probably cause you no trouble.

To be fair, every language worth its salt has this problem, or something similar. As an example, throw your favorite code walker at this Lisp code:

(iter (for i from 1 to 10) (collect i))

You probably can't predict that this is a loop that produces a list, because the iter macro is opaque and would require special knowledge to understand. The reality is that this is annoying in theory (I can't understand my code without running it, or at least running the iter macro, which may not ever stop running with this input), but very useful in practice (iteration is easy for the programmer to write and the future programmer to read).

Finally, a lot of people think that Perl lacks static analysis and refactoring tools, like Java has, because of the relative difficulty in parsing it. I doubt this is true; I just think the need is not there and nobody has bothered to write them. (People do need a "lint", so there is Perl::Critic, for example.)

Any static analysis I have needed to do of Perl to generate code (some emacs macros for maintaining test counters and Makefile.PL) has worked fine. Could weird corner cases throw off my code? Of course, but I don't go out of my way to write code that's impossible to maintain, even though I could.

原野 2024-08-09 11:50:40


People have used a lot of words to explain various phases, but it's really a simple matter. While compiling Perl source, the perl interpreter may end up running code that changes how the rest of the code will parse. Static analysis, which runs no code, will miss this.

In that Perlmonks post, Jeffrey talks about his articles in The Perl Review that go into much more detail, including a sample program that doesn't parse the same way every time you run it.

七堇年 2024-08-09 11:50:40


C++ has a similar problem in its template system, but that doesn't stop compilers from compiling it. They will just break out or run forever on the corner cases where this sort of argument would apply.

别低头,皇冠会掉 2024-08-09 11:50:40


Perl has a compile phase, but it's different from most normal compile phases when it comes to code. Perl's lexer turns the code into tokens, then a parser analyzes the tokens to form an op tree. However, BEGIN {} blocks can interrupt this process and allow you to execute code, and the same happens when doing a use. All BEGIN blocks execute before anything else, giving you a way to set up modules and namespaces. During the overall "compile" of a script, you will most likely use Perl to determine how the Perl module should look when it's done. A bare sub declaration implies adding the subroutine to the glob for the package, but you don't have to do it that way. For example, this is an (albeit odd) way of setting up methods in a module:

package Foo;

use strict;
use warnings;
use List::Util qw/shuffle/;

my @names = qw(foo bar baz bill barn);
my @subs = (
    sub { print "baz!" },
    sub { die; },
    sub { return sub { die } },
);
@names = shuffle @names;
foreach my $index (0..$#subs) {
   no strict 'refs';
   *{$names[$index]} = $subs[$index];
}

1;

You have to interpret this to even know what it does! It's not very useful, but it's not something you can determine ahead of time. But it's 100% valid perl. Even though this feature can be abused, it can also do great things, like programmatically building complicated subs that all look very similar. It also makes it hard to know, for certain, what everything does.

That's not to say that a perl script can't be 'compiled'. In perl, compiling is merely determining what, at that moment, the module should look like. You can do that with

perl -c myscript.pl

and it will tell you whether or not it can get to the point where it would start executing the main program. You just can't know that merely from looking at the code 'statically'.
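One thing worth seeing for yourself is that perl -c still executes BEGIN blocks. A minimal sketch, assuming a Unix-like shell for the 2>&1 redirection; the throwaway script and its messages are made up for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a tiny script to a temporary file.
my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} <<'SCRIPT';
BEGIN { print "BEGIN block ran\n" }
print "main program ran\n";
SCRIPT
close $fh or die "close failed: $!";

# $^X is the path of the perl running this program; -c compiles the
# script and stops before the runtime phase. The "syntax OK" report goes
# to STDERR, so both streams are captured.
my $output = `"$^X" -c "$path" 2>&1`;
print $output;
```

The BEGIN message shows up in the output while the main program's message does not, so "compile only" in Perl still means running some user code.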

However, as PPI demonstrates, we can get close. Really close. Close enough to do very interesting things, like (almost static) code analysis.

"Run Time", then, becomes what happens after all the BEGIN blocks have executed. (This is a simplification; there is a lot more to this. See perlmod for more.) It's still perl code being run, but it's a separate phase of execution, done after all the higher priority blocks have run.

chromatic has some detailed posts on his Modern::Perl blog:
