What is the difference between a language construct and a "built-in" function in PHP?

Posted on 2024-07-29 00:44:37

I know that include, isset, require, print, echo, and some others are not functions but language constructs.

Some of these language constructs need parentheses, others don't.

require 'file.php';
isset($x);

Some have a return value, others do not.

print 'foo'; //1
echo  'foo'; //no return value

So what is the internal difference between a language construct and a built-in function?

怎会甘心 2024-08-05 00:44:37

(This is longer than I intended; please bear with me.)

Most languages are built on something called a "syntax" (more formally, a grammar): the language consists of several well-defined keywords, and the complete range of expressions that you can construct in that language is built up from that syntax.

For example, let's say you have a simple four-function arithmetic "language" that only takes single-digit integers as input and completely ignores order of operations (I told you it was a simple language). That language could be defined by the syntax:

// The | means "or" and the := represents definition
$expression := $number | $expression $operator $expression
$number := 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
$operator := + | - | * | /

From these three rules, you can build any number of single-digit-input arithmetic expressions. You can then write a parser for this syntax that breaks down any valid input into its component types ($expression, $number, or $operator) and deals with the result. For example, the expression 3 + 4 * 5 can be broken down as follows:

// Parentheses used for ease of explanation; they have no true syntactical meaning
$expression = 3 + 4 * 5
            = $expression $operator (4 * 5) // Expand into $exp $op $exp
            = $number $operator $expression // Rewrite: $exp -> $num
            = $number $operator $expression $operator $expression // Expand again
            = $number $operator $number $operator $number // Rewrite again

Now we have a full parse of the original expression in terms of our defined language. Once we have this, we can write an evaluator that repeatedly computes the result of each $number $operator $number combination and spits out a result when only one $number is left.
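To make the toy language concrete, here is a minimal sketch of a tokenizer and evaluator for it. The function names tokenize() and evaluate() are made up for this illustration, and evaluating strictly left to right is just one assumed way of "ignoring order of operations"; it is not how PHP itself parses anything.

```php
<?php
declare(strict_types=1);

/**
 * Split the input into $number and $operator tokens of the toy grammar.
 * Hypothetical helper; throws on any character the grammar does not allow.
 */
function tokenize(string $input): array
{
    $tokens = [];
    foreach (str_split(str_replace(' ', '', $input)) as $char) {
        if (ctype_digit($char)) {
            $tokens[] = ['type' => 'number', 'value' => (int) $char];
        } elseif (in_array($char, ['+', '-', '*', '/'], true)) {
            $tokens[] = ['type' => 'operator', 'value' => $char];
        } else {
            throw new InvalidArgumentException("Unexpected character: $char");
        }
    }
    return $tokens;
}

/**
 * Evaluate "number (operator number)*" strictly left to right,
 * with no operator precedence (an assumption of this sketch).
 */
function evaluate(string $input)
{
    $tokens = tokenize($input);

    // A valid token list has odd length and alternates number, operator, number, ...
    if (count($tokens) % 2 === 0) {
        throw new InvalidArgumentException('Expected: number (operator number)*');
    }
    foreach ($tokens as $i => $token) {
        $expected = ($i % 2 === 0) ? 'number' : 'operator';
        if ($token['type'] !== $expected) {
            throw new InvalidArgumentException("Expected a $expected at position $i");
        }
    }

    // Repeatedly collapse "$number $operator $number" into a single number.
    $result = $tokens[0]['value'];
    for ($i = 1; $i < count($tokens); $i += 2) {
        $right = $tokens[$i + 1]['value'];
        switch ($tokens[$i]['value']) {
            case '+': $result += $right; break;
            case '-': $result -= $right; break;
            case '*': $result *= $right; break;
            case '/': $result /= $right; break;
        }
    }
    return $result;
}

echo evaluate('3 + 4 * 5'), PHP_EOL; // 35, because (3 + 4) is computed first
```

Note that multi-digit input such as 12 is rejected, since the grammar only defines single-digit numbers.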

Take note that there are no $expression constructs left in the final parsed version of our original expression. That's because $expression can always be reduced to a combination of other things in our language.

PHP is much the same: language constructs are recognized as the equivalent of our $number or $operator. They cannot be reduced into other language constructs; instead, they're the base units from which the language is built up. The key difference between functions and language constructs is this: the parser deals directly with language constructs. It simplifies functions into language constructs.

The reason that language constructs may or may not require parentheses and the reason some have return values while others don't depends entirely on the specific technical details of the PHP parser implementation. I'm not that well-versed in how the parser works, so I can't address these questions specifically, but imagine for a second a language that starts with this:

$expression := ($expression) | ...

Effectively, this language is free to take any expressions it finds and get rid of the surrounding parentheses. PHP (and here I'm employing pure guesswork) may employ something similar for its language constructs: print("Hello") might get reduced down to print "Hello" before it's parsed, or vice-versa (language definitions can add parentheses as well as get rid of them).

This is the root of why you can't redefine language constructs like echo or print: they're effectively hardcoded into the parser, whereas functions are mapped to a set of language constructs and the parser allows you to change that mapping at compile- or runtime to substitute your own set of language constructs or expressions.
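One observable side of this, without looking at the parser at all: language constructs do not appear in PHP's function table, so function_exists() reports them as missing. A small sketch (the list of names is just my pick for illustration):

```php
<?php
// function_exists() only knows about real functions (built-in or user-defined).
// Language constructs are handled by the parser and are reported as "not a function".
$names = ['strlen', 'echo', 'print', 'isset'];

foreach ($names as $name) {
    printf("%-6s %s\n", $name, function_exists($name) ? 'function' : 'not a function');
}
```

Only strlen is reported as a function here; echo, print and isset are not.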

At the end of the day, the internal difference between language constructs and built-in functions is this: language constructs are understood and dealt with directly by the parser. Built-in functions, while provided by the language, are mapped and simplified to a set of language constructs before parsing.

More info:

Backus-Naur form, the syntax used to define formal languages (yacc uses this form)

Edit: Reading through some of the other answers, people make good points. Among them:

A language builtin is faster to call than a function. This is true, if only marginally, because the PHP interpreter doesn't need to map that function to its language-builtin equivalents before parsing. On a modern machine, though, the difference is fairly negligible.
A language builtin bypasses error-checking. This may or may not be true, depending on the PHP internal implementation for each builtin. It is certainly true that more often than not, functions will have more advanced error-checking and other functionality that builtins don't.
Language constructs can't be used as function callbacks. This is true, because a construct is not a function. They're separate entities. When you code a builtin, you're not coding a function that takes arguments - the syntax of the builtin is handled directly by the parser, and is recognized as a builtin, rather than a function. (This may be easier to understand if you consider languages with first-class functions: effectively, you can pass functions around as objects. You can't do that with builtins.)

愿与i 2024-08-05 00:44:37

Language constructs are provided by the language itself (instructions such as "if", "while", ...); hence their name.

One consequence of that is that they are faster to invoke than pre-defined or user-defined functions (or so I've heard/read several times).

I have no idea how it's done, but one thing they can do (because they are integrated directly into the language) is "bypass" some kind of error handling mechanism. For instance, isset() can be used with non-existing variables without causing any notice, warning or error.

function test($param) {}
if (test($a)) {
    // Notice: Undefined variable: a
}

if (isset($b)) {
    // No notice
}

*Note that this is not the case for all language constructs.
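The difference can be checked without relying on what gets printed to the screen. The sketch below installs a temporary error handler that just collects messages; the variable names $undefinedA and $undefinedB are placeholders I picked, and the exact wording and severity of the diagnostic depend on the PHP version (a notice on older versions, a warning on PHP 8, as far as I know).

```php
<?php
// Collect every diagnostic PHP raises instead of printing it.
$messages = [];
set_error_handler(function (int $errno, string $errstr) use (&$messages): bool {
    $messages[] = $errstr;
    return true; // mark as handled so PHP does not print it
});

function test($param) {}

test($undefinedA);            // passing an undefined variable to a function raises a diagnostic
$afterCall = count($messages);

$result = isset($undefinedB); // isset() on an undefined variable raises nothing
$afterIsset = count($messages);

restore_error_handler();

var_dump($afterCall);  // int(1)
var_dump($afterIsset); // int(1): isset() added no new message
var_dump($result);     // bool(false)
```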

Another difference between functions and language constructs is that some constructs can be called without parentheses, like a keyword.

For instance:

echo 'test'; // language construct => OK

function my_function($param) {}
my_function 'test'; // function => Parse error: syntax error, unexpected T_CONSTANT_ENCAPSED_STRING

Here too, it's not the case for all language constructs.

I suppose there is absolutely no way to "disable" a language construct because it is part of the language itself. On the other hand, lots of "built-in" PHP functions are not really built-in: they are provided by extensions, many of which are always enabled (but not all of them).

Another difference is that language constructs can't be used as "function pointers" (I mean callbacks, for instance):

$a = array(10, 20);

function test($param) {echo $param . '<br />';}
array_map('test', $a);  // OK (function)

array_map('echo', $a);  // Warning: array_map() expects parameter 1 to be a valid callback, function 'echo' not found or invalid function name
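If you do need a construct as a callback, the usual workaround is to wrap it in a real function. A minimal sketch, where $echoLine is just a name chosen for this example (also, as far as I know, PHP 8 reports the failing array_map('echo', ...) call above as a TypeError rather than a warning):

```php
<?php
$a = array(10, 20);

// 'echo' is not a function name, so it is not a valid callable.
var_dump(is_callable('echo')); // bool(false)

// Wrapping the construct in an anonymous function gives array_map() a real callable.
$echoLine = function ($value) {
    echo $value, "\n";
    return $value;
};

array_map($echoLine, $a); // prints 10 and 20 on separate lines
```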

I don't have any other ideas coming to mind right now... and I don't know much about the internals of PHP... So that'll be it for now ^^

If you don't get many answers here, maybe you could ask on the internals mailing list (see http://www.php.net/mailing-lists.php ), where there are many PHP core developers; they are the ones who would probably know about that stuff ^^

(And I'm really interested in the other answers, btw ^^ )

As a reference: list of keywords and language constructs in PHP
