PHP的内置函数内部是如何实现的?

发布于 2024-11-02 04:12:50 字数 354 浏览 8 评论 0原文

这些函数的编写方式与用户函数相同吗?我的意思是使用 PHP 代码和正则表达式之类的东西?

例如:

filter_var($email, FILTER_VALIDATE_EMAIL);

http ://www.totallyphp.co.uk/code/validate_an_email_address_using_regular_expressions.htm

are these functions written the same way as user functions? I mean with PHP code and with regular expressions and stuff like that?

For example:

filter_var($email, FILTER_VALIDATE_EMAIL);

vs.

http://www.totallyphp.co.uk/code/validate_an_email_address_using_regular_expressions.htm

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

演出会有结束 2024-11-09 04:12:50

PHP 是用 C 编写的。PHP 函数是用高质量的 C 代码编写的,然后编译形成 PHP 语言库,

如果您想扩展 PHP(编辑/编写)自己的函数,请查看:http://www.php.net/~wez/extending-php.pdf

编辑:

给你:

这个是该函数的原始 C 代码:

/* {{{ proto mixed filter_var(mixed variable [, long filter [, mixed options]])
 * Returns the filtered version of the vriable.
 */
PHP_FUNCTION(filter_var)
{
    long filter = FILTER_DEFAULT;
    zval **filter_args = NULL, *data;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z/|lZ", &data, &filter, &filter_args) == FAILURE) {
        return;
    }

    if (!PHP_FILTER_ID_EXISTS(filter)) {
        RETURN_FALSE;
    }

    MAKE_COPY_ZVAL(&data, return_value);

    php_filter_call(&return_value, filter, filter_args, 1, FILTER_REQUIRE_SCALAR TSRMLS_CC);
}
/* }}} */



static void php_filter_call(zval **filtered, long filter, zval **filter_args, const int copy, long filter_flags TSRMLS_DC) /* {{{ */
{
    zval  *options = NULL;
    zval **option;
    char  *charset = NULL;

    if (filter_args && Z_TYPE_PP(filter_args) != IS_ARRAY) {
        long lval;

        PHP_FILTER_GET_LONG_OPT(filter_args, lval);

        if (filter != -1) { /* handler for array apply */
            /* filter_args is the filter_flags */
            filter_flags = lval;

            if (!(filter_flags & FILTER_REQUIRE_ARRAY ||  filter_flags & FILTER_FORCE_ARRAY)) {
                filter_flags |= FILTER_REQUIRE_SCALAR;
            }
        } else {
            filter = lval;
        }
    } else if (filter_args) {
        if (zend_hash_find(HASH_OF(*filter_args), "filter", sizeof("filter"), (void **)&option) == SUCCESS) {
            PHP_FILTER_GET_LONG_OPT(option, filter);
        }

        if (zend_hash_find(HASH_OF(*filter_args), "flags", sizeof("flags"), (void **)&option) == SUCCESS) {
            PHP_FILTER_GET_LONG_OPT(option, filter_flags);

            if (!(filter_flags & FILTER_REQUIRE_ARRAY ||  filter_flags & FILTER_FORCE_ARRAY)) {
                filter_flags |= FILTER_REQUIRE_SCALAR;
            }
        }

        if (zend_hash_find(HASH_OF(*filter_args), "options", sizeof("options"), (void **)&option) == SUCCESS) {
            if (filter != FILTER_CALLBACK) {
                if (Z_TYPE_PP(option) == IS_ARRAY) {
                    options = *option;
                }
            } else {
                options = *option;
                filter_flags = 0;
            }
        }
    }

    if (Z_TYPE_PP(filtered) == IS_ARRAY) {
        if (filter_flags & FILTER_REQUIRE_SCALAR) {
            if (copy) {
                SEPARATE_ZVAL(filtered);
            }
            zval_dtor(*filtered);
            if (filter_flags & FILTER_NULL_ON_FAILURE) {
                ZVAL_NULL(*filtered);
            } else {
                ZVAL_FALSE(*filtered);
            }
            return;
        }
        php_zval_filter_recursive(filtered, filter, filter_flags, options, charset, copy TSRMLS_CC);
        return;
    }
    if (filter_flags & FILTER_REQUIRE_ARRAY) {
        if (copy) {
            SEPARATE_ZVAL(filtered);
        }
        zval_dtor(*filtered);
        if (filter_flags & FILTER_NULL_ON_FAILURE) {
            ZVAL_NULL(*filtered);
        } else {
            ZVAL_FALSE(*filtered);
        }
        return;
    }

    php_zval_filter(filtered, filter, filter_flags, options, charset, copy TSRMLS_CC);
    if (filter_flags & FILTER_FORCE_ARRAY) {
        zval *tmp;

        ALLOC_ZVAL(tmp);
        MAKE_COPY_ZVAL(filtered, tmp);

        zval_dtor(*filtered);

        array_init(*filtered);
        add_next_index_zval(*filtered, tmp);
    }
}

这是您验证电子邮件的例程:
——这回答了你的问题。 是的,它是由正则表达式内部完成的。

void php_filter_validate_email(PHP_INPUT_FILTER_PARAM_DECL) /* {{{ */
{
    /*
     * The regex below is based on a regex by Michael Rushton.
     * However, it is not identical.  I changed it to only consider routeable
     * addresses as valid.  Michael's regex considers a@b a valid address
     * which conflicts with section 2.3.5 of RFC 5321 which states that:
     *
     *   Only resolvable, fully-qualified domain names (FQDNs) are permitted
     *   when domain names are used in SMTP.  In other words, names that can
     *   be resolved to MX RRs or address (i.e., A or AAAA) RRs (as discussed
     *   in Section 5) are permitted, as are CNAME RRs whose targets can be
     *   resolved, in turn, to MX or address RRs.  Local nicknames or
     *   unqualified names MUST NOT be used.
     *
     * This regex does not handle comments and folding whitespace.  While
     * this is technically valid in an email address, these parts aren't
     * actually part of the address itself.
     *
     * Michael's regex carries this copyright:
     *
     * Copyright © Michael Rushton 2009-10
     * http://squiloople.com/
     * Feel free to use and redistribute this code. But please keep this copyright notice.
     *
     */
    const char regexp[] = "/^(?!(?:(?:\\x22?\\x5C[\\x00-\\x7E]\\x22?)|(?:\\x22?[^\\x5C\\x22]\\x22?)){255,})(?!(?:(?:\\x22?\\x5C[\\x00-\\x7E]\\x22?)|(?:\\x22?[^\\x5C\\x22]\\x22?)){65,}@)(?:(?:[\\x21\\x23-\\x27\\x2A\\x2B\\x2D\\x2F-\\x39\\x3D\\x3F\\x5E-\\x7E]+)|(?:\\x22(?:[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F\\x21\\x23-\\x5B\\x5D-\\x7F]|(?:\\x5C[\\x00-\\x7F]))*\\x22))(?:\\.(?:(?:[\\x21\\x23-\\x27\\x2A\\x2B\\x2D\\x2F-\\x39\\x3D\\x3F\\x5E-\\x7E]+)|(?:\\x22(?:[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F\\x21\\x23-\\x5B\\x5D-\\x7F]|(?:\\x5C[\\x00-\\x7F]))*\\x22)))*@(?:(?:(?!.*[^.]{64,})(?:(?:(?:xn--)?[a-z0-9]+(?:-[a-z0-9]+)*\\.){1,126}){1,}(?:(?:[a-z][a-z0-9]*)|(?:(?:xn--)[a-z0-9]+))(?:-[a-z0-9]+)*)|(?:\\[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.*[a-f0-9][:\\]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:\\.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))\\]))$/iD";

    pcre       *re = NULL;
    pcre_extra *pcre_extra = NULL;
    int preg_options = 0;
    int         ovector[150]; /* Needs to be a multiple of 3 */
    int         matches;


    /* The maximum length of an e-mail address is 320 octets, per RFC 2821. */
    if (Z_STRLEN_P(value) > 320) {
        RETURN_VALIDATION_FAILED
    }

    re = pcre_get_compiled_regex((char *)regexp, &pcre_extra, &preg_options TSRMLS_CC);
    if (!re) {
        RETURN_VALIDATION_FAILED
    }
    matches = pcre_exec(re, NULL, Z_STRVAL_P(value), Z_STRLEN_P(value), 0, 0, ovector, 3);

    /* 0 means that the vector is too small to hold all the captured substring offsets */
    if (matches < 0) {
        RETURN_VALIDATION_FAILED
    }

}
/* }}} */

PHP is written in C. The PHP functions are written in high quality C code then compiled to form the PHP langugae library

if you want to extend PHP (edit / write) own functions check this out: http://www.php.net/~wez/extending-php.pdf

EDIT:

here you go :

This is the original C code for the function:

/* {{{ proto mixed filter_var(mixed variable [, long filter [, mixed options]])
 * Returns the filtered version of the vriable.
 */
PHP_FUNCTION(filter_var)
{
    long filter = FILTER_DEFAULT;
    zval **filter_args = NULL, *data;

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "z/|lZ", &data, &filter, &filter_args) == FAILURE) {
        return;
    }

    if (!PHP_FILTER_ID_EXISTS(filter)) {
        RETURN_FALSE;
    }

    MAKE_COPY_ZVAL(&data, return_value);

    php_filter_call(&return_value, filter, filter_args, 1, FILTER_REQUIRE_SCALAR TSRMLS_CC);
}
/* }}} */



static void php_filter_call(zval **filtered, long filter, zval **filter_args, const int copy, long filter_flags TSRMLS_DC) /* {{{ */
{
    zval  *options = NULL;
    zval **option;
    char  *charset = NULL;

    if (filter_args && Z_TYPE_PP(filter_args) != IS_ARRAY) {
        long lval;

        PHP_FILTER_GET_LONG_OPT(filter_args, lval);

        if (filter != -1) { /* handler for array apply */
            /* filter_args is the filter_flags */
            filter_flags = lval;

            if (!(filter_flags & FILTER_REQUIRE_ARRAY ||  filter_flags & FILTER_FORCE_ARRAY)) {
                filter_flags |= FILTER_REQUIRE_SCALAR;
            }
        } else {
            filter = lval;
        }
    } else if (filter_args) {
        if (zend_hash_find(HASH_OF(*filter_args), "filter", sizeof("filter"), (void **)&option) == SUCCESS) {
            PHP_FILTER_GET_LONG_OPT(option, filter);
        }

        if (zend_hash_find(HASH_OF(*filter_args), "flags", sizeof("flags"), (void **)&option) == SUCCESS) {
            PHP_FILTER_GET_LONG_OPT(option, filter_flags);

            if (!(filter_flags & FILTER_REQUIRE_ARRAY ||  filter_flags & FILTER_FORCE_ARRAY)) {
                filter_flags |= FILTER_REQUIRE_SCALAR;
            }
        }

        if (zend_hash_find(HASH_OF(*filter_args), "options", sizeof("options"), (void **)&option) == SUCCESS) {
            if (filter != FILTER_CALLBACK) {
                if (Z_TYPE_PP(option) == IS_ARRAY) {
                    options = *option;
                }
            } else {
                options = *option;
                filter_flags = 0;
            }
        }
    }

    if (Z_TYPE_PP(filtered) == IS_ARRAY) {
        if (filter_flags & FILTER_REQUIRE_SCALAR) {
            if (copy) {
                SEPARATE_ZVAL(filtered);
            }
            zval_dtor(*filtered);
            if (filter_flags & FILTER_NULL_ON_FAILURE) {
                ZVAL_NULL(*filtered);
            } else {
                ZVAL_FALSE(*filtered);
            }
            return;
        }
        php_zval_filter_recursive(filtered, filter, filter_flags, options, charset, copy TSRMLS_CC);
        return;
    }
    if (filter_flags & FILTER_REQUIRE_ARRAY) {
        if (copy) {
            SEPARATE_ZVAL(filtered);
        }
        zval_dtor(*filtered);
        if (filter_flags & FILTER_NULL_ON_FAILURE) {
            ZVAL_NULL(*filtered);
        } else {
            ZVAL_FALSE(*filtered);
        }
        return;
    }

    php_zval_filter(filtered, filter, filter_flags, options, charset, copy TSRMLS_CC);
    if (filter_flags & FILTER_FORCE_ARRAY) {
        zval *tmp;

        ALLOC_ZVAL(tmp);
        MAKE_COPY_ZVAL(filtered, tmp);

        zval_dtor(*filtered);

        array_init(*filtered);
        add_next_index_zval(*filtered, tmp);
    }
}

AND HERE IS YOUR VALIDATE EMAIL ROUTINE:
-- this answers your question. Yes, it is done by regex internally.

void php_filter_validate_email(PHP_INPUT_FILTER_PARAM_DECL) /* {{{ */
{
    /*
     * The regex below is based on a regex by Michael Rushton.
     * However, it is not identical.  I changed it to only consider routeable
     * addresses as valid.  Michael's regex considers a@b a valid address
     * which conflicts with section 2.3.5 of RFC 5321 which states that:
     *
     *   Only resolvable, fully-qualified domain names (FQDNs) are permitted
     *   when domain names are used in SMTP.  In other words, names that can
     *   be resolved to MX RRs or address (i.e., A or AAAA) RRs (as discussed
     *   in Section 5) are permitted, as are CNAME RRs whose targets can be
     *   resolved, in turn, to MX or address RRs.  Local nicknames or
     *   unqualified names MUST NOT be used.
     *
     * This regex does not handle comments and folding whitespace.  While
     * this is technically valid in an email address, these parts aren't
     * actually part of the address itself.
     *
     * Michael's regex carries this copyright:
     *
     * Copyright © Michael Rushton 2009-10
     * http://squiloople.com/
     * Feel free to use and redistribute this code. But please keep this copyright notice.
     *
     */
    const char regexp[] = "/^(?!(?:(?:\\x22?\\x5C[\\x00-\\x7E]\\x22?)|(?:\\x22?[^\\x5C\\x22]\\x22?)){255,})(?!(?:(?:\\x22?\\x5C[\\x00-\\x7E]\\x22?)|(?:\\x22?[^\\x5C\\x22]\\x22?)){65,}@)(?:(?:[\\x21\\x23-\\x27\\x2A\\x2B\\x2D\\x2F-\\x39\\x3D\\x3F\\x5E-\\x7E]+)|(?:\\x22(?:[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F\\x21\\x23-\\x5B\\x5D-\\x7F]|(?:\\x5C[\\x00-\\x7F]))*\\x22))(?:\\.(?:(?:[\\x21\\x23-\\x27\\x2A\\x2B\\x2D\\x2F-\\x39\\x3D\\x3F\\x5E-\\x7E]+)|(?:\\x22(?:[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F\\x21\\x23-\\x5B\\x5D-\\x7F]|(?:\\x5C[\\x00-\\x7F]))*\\x22)))*@(?:(?:(?!.*[^.]{64,})(?:(?:(?:xn--)?[a-z0-9]+(?:-[a-z0-9]+)*\\.){1,126}){1,}(?:(?:[a-z][a-z0-9]*)|(?:(?:xn--)[a-z0-9]+))(?:-[a-z0-9]+)*)|(?:\\[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.*[a-f0-9][:\\]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:\\.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))\\]))$/iD";

    pcre       *re = NULL;
    pcre_extra *pcre_extra = NULL;
    int preg_options = 0;
    int         ovector[150]; /* Needs to be a multiple of 3 */
    int         matches;


    /* The maximum length of an e-mail address is 320 octets, per RFC 2821. */
    if (Z_STRLEN_P(value) > 320) {
        RETURN_VALIDATION_FAILED
    }

    re = pcre_get_compiled_regex((char *)regexp, &pcre_extra, &preg_options TSRMLS_CC);
    if (!re) {
        RETURN_VALIDATION_FAILED
    }
    matches = pcre_exec(re, NULL, Z_STRVAL_P(value), Z_STRLEN_P(value), 0, 0, ovector, 3);

    /* 0 means that the vector is too small to hold all the captured substring offsets */
    if (matches < 0) {
        RETURN_VALIDATION_FAILED
    }

}
/* }}} */
森林迷了鹿 2024-11-09 04:12:50

PHP 函数要么是:

  • 用 C 编写,而不是用 PHP 编写
  • ,要么只是其他库提供的函数的包装器(例如,PHP 的curl 扩展只是curl 库的包装器)。

如果你好奇,你可以看一下 PHP 的源代码——这是它的 SVN:http: //svn.php.net/viewvc/

例如,filter_var() 函数应该定义在 过滤器扩展

PHP functions are either :

  • Written in C -- and not in PHP
  • Or just wrappers to functions provided by other libraries (For instance, PHP's curl extension is just a wrapper arround the curl library).

If you are curious, you can take a look at the sources of PHP -- here's its SVN : http://svn.php.net/viewvc/

For instance, the filter_var() function should be defined somewhere in the sources of the filter extension.

北方的巷 2024-11-09 04:12:50

没有。 PHP 内部函数是用 C 编写的,而不是 PHP 代码。由于有许多 Zend 运行时宏以及参数如何从 PHP 传输到 C 结构,这看起来相当笨重。

该特定函数确实使用了正则表达式。这也是一个很好的例子:
http://svn.php.net /repository/php/php-src/branches/PHP_5_3/ext/filter/tical_filters.c
在中间的某个位置查找 regexp[]

Nope. PHP-internal functions are written in C, not with PHP code. Which looks quite unwieldy due to the many Zend-runtime macros and how parameters are transferred from PHP into C structures.

That particular function does use a regular expression. It also makes a nice example:
http://svn.php.net/repository/php/php-src/branches/PHP_5_3/ext/filter/logical_filters.c
Look for regexp[] somewhere in the middle.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文