Parsing PHP Doc Comments into a Data Structure

Posted 2024-10-12 08:33:27 · 727 words · 12 views · 0 comments

I'm using the Reflection API in PHP to pull a DocComment (PHPDoc) string from a method

$r = new ReflectionMethod($object, 'someMethod'); // 'someMethod' is a placeholder for the method name
$comment = $r->getDocComment();

This will return a string that looks something like this (depending on how well the method was documented)

/**
* Does this great things
*
* @param string $thing
* @return Some_Great_Thing
*/

Are there any built-in methods or functions that can parse a PHP Doc Comment String into a data structure?

$object = some_magic_function_or_method($comment_string);

echo 'Returns a: ', $object->return;

Lacking that, what part of the PHPDoc source code should I be looking at to do this myself?

Lacking and/or in addition to that, is there third-party code that's considered "better" at this than the PHPDoc code?

I realize parsing these strings isn't rocket science, or even computer science, but I'd prefer a well tested library/routine/method that's been built to deal with a lot of the janky, semi-non-correct PHP Doc code that might exist in the wild.
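
For reference, a hand-rolled version of the hypothetical some_magic_function_or_method() could look like the sketch below. It is a naive illustration only: parse_docblock() is a made-up name, it assumes one tag per line, and it silently drops multi-line tag descriptions, which is exactly the kind of messy input a well-tested library would handle.

```php
<?php
// Hypothetical helper; not part of PHP or of any library named in this thread.
function parse_docblock(string $comment): array
{
    // Drop the opening "/**" and the closing "*/" delimiters.
    $body = preg_replace('~^\s*/\*\*|\*/\s*$~', '', $comment);

    $lines = [];
    foreach (preg_split('/\R/', $body) as $line) {
        // Remove the leading " * " decoration from each line.
        $lines[] = trim(preg_replace('~^\s*\*?\s?~', '', $line));
    }

    $result = ['description' => '', 'tags' => []];
    $description = [];
    foreach ($lines as $line) {
        if (preg_match('/^@([\w\-]+)\s*(.*)$/', $line, $m)) {
            // Group tag bodies by tag name, e.g. 'param' => ['string $thing'].
            $result['tags'][$m[1]][] = $m[2];
        } elseif ($result['tags'] === []) {
            // Everything before the first tag counts as description.
            $description[] = $line;
        }
    }
    $result['description'] = trim(implode("\n", $description));

    return $result;
}

$comment = <<<'DOC'
/**
 * Does this great things
 *
 * @param string $thing
 * @return Some_Great_Thing
 */
DOC;

$doc = parse_docblock($comment);
echo 'Returns a: ', $doc['tags']['return'][0], "\n";
```

Returning a plain array keeps the sketch short; an object with a return property, as in the question, would work just as well.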

Comments (12)

极致的悲 2024-10-19 08:33:27

I am surprised this hasn't been mentioned yet: what about using Zend_Reflection from Zend Framework? This may come in handy, especially if you work with software built on Zend Framework, such as Magento.

See the Zend Framework Manual for some code examples and the API Documentation for the available methods.

There are different ways to do this:

  • Pass a file name to Zend_Reflection_File.
  • Pass an object to Zend_Reflection_Class.
  • Pass an object and a method name to Zend_Reflection_Method.
  • If you really only have the comment string at hand, you could even throw together the code for a small dummy class, save it to a temporary file and pass that file to Zend_Reflection_File.
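
The last option above (wrapping a bare comment string in a dummy class) might look roughly like the sketch below. The helper name docblock_to_dummy_file() and the generated class name are made up for illustration, and the sketch reads the comment back with PHP's core ReflectionMethod rather than Zend_Reflection_File so that it runs without Zend Framework; with Zend you would pass $path to Zend_Reflection_File instead.

```php
<?php
// Hypothetical helper: wraps a bare doc comment in a throwaway class so that
// file-based or class-based reflection tools can read it.
function docblock_to_dummy_file(string $comment, string $className): string
{
    $code = "<?php\nclass {$className}\n{\n{$comment}\n    public function dummy() {}\n}\n";
    $path = tempnam(sys_get_temp_dir(), 'docblock_');
    file_put_contents($path, $code);

    return $path;
}

$comment = "/**\n * Does this great things\n *\n * @return Some_Great_Thing\n */";

// Derive a class name from the comment so repeated calls do not collide.
$className = 'DummyDocHolder_' . md5($comment);
$path = docblock_to_dummy_file($comment, $className);

// Load the generated class, read the comment back, then clean up.
require $path;
$roundTripped = (new ReflectionMethod($className, 'dummy'))->getDocComment();
unlink($path);

echo $roundTripped, "\n";
```

Because the generated file is executed by require, this is only reasonable for comment strings you trust: a string containing "*/" followed by code would be run.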

Let's go for the simple case and assume you have an existing class you want to inspect.

The code would be like this (untested, please forgive me):

$method = new Zend_Reflection_Method($class, 'yourMethod');
$docBlock = $method->getDocBlock(); // variable names are case-sensitive, so this must match $docBlock below

if ($docBlock->hasTag('return')) {
    $tagReturn = $docBlock->getTag('return'); // $tagReturn is an instance of Zend_Reflection_Docblock_Tag_Return
    echo "Returns a: " . $tagReturn->getType() . "<br>";
    echo "Comment for return type: " . $tagReturn->getDescription();
}

手心的海 2024-10-19 08:33:27

You can use the "DocBlockParser" class from Fabien Potencier's open-source Sami project ("Yet Another PHP API Documentation Generator").
First of all, get Sami from GitHub.
Here is an example of how to use it:

<?php

require_once 'Sami/Parser/DocBlockParser.php';
require_once 'Sami/Parser/Node/DocBlockNode.php';

class TestClass {
    /**
     * This is the short description.
     *  
     * This is the 1st line of the long description 
     * This is the 2nd line of the long description 
     * This is the 3rd line of the long description   
     *  
     * @param bool|string $foo sometimes a boolean, sometimes a string (or, could have just used "mixed")
     * @param bool|int $bar sometimes a boolean, sometimes an int (again, could have just used "mixed") 
     * @return string de-html_entitied string (no entities at all)
     */
    public function another_test($foo, $bar) {
        return strtr($foo,array_flip(get_html_translation_table(HTML_ENTITIES)));
    }
}

use Sami\Parser\DocBlockParser;
use Sami\Parser\Node\DocBlockNode;

try {
    $method = new ReflectionMethod('TestClass', 'another_test');
    $comment = $method->getDocComment();
    if ($comment !== FALSE) {
        $dbp = new DocBlockParser();
        $doc = $dbp->parse($comment);
        echo "\n** getDesc:\n";
        print_r($doc->getDesc());
        echo "\n** getTags:\n";
        print_r($doc->getTags());
        echo "\n** getTag('param'):\n";
        print_r($doc->getTag('param'));
        echo "\n** getErrors:\n";
        print_r($doc->getErrors());
        echo "\n** getOtherTags:\n";
        print_r($doc->getOtherTags());
        echo "\n** getShortDesc:\n";
        print_r($doc->getShortDesc());
        echo "\n** getLongDesc:\n";
        print_r($doc->getLongDesc());
    }
} catch (Exception $e) {
    print_r($e);
}

?>

And here is the output of the test page:

** getDesc:
This is the short description.

This is the 1st line of the long description 
This is the 2nd line of the long description 
This is the 3rd line of the long description
** getTags:
Array
(
    [param] => Array
        (
            [0] => Array
                (
                    [0] => Array
                        (
                            [0] => Array
                                (
                                    [0] => bool
                                    [1] => 
                                )

                            [1] => Array
                                (
                                    [0] => string
                                    [1] => 
                                )

                        )

                    [1] => foo
                    [2] => sometimes a boolean, sometimes a string (or, could have just used "mixed")
                )

            [1] => Array
                (
                    [0] => Array
                        (
                            [0] => Array
                                (
                                    [0] => bool
                                    [1] => 
                                )

                            [1] => Array
                                (
                                    [0] => int
                                    [1] => 
                                )

                        )

                    [1] => bar
                    [2] => sometimes a boolean, sometimes an int (again, could have just used "mixed")
                )

        )

    [return] => Array
        (
            [0] => Array
                (
                    [0] => Array
                        (
                            [0] => Array
                                (
                                    [0] => string
                                    [1] => 
                                )

                        )

                    [1] => de-html_entitied string (no entities at all)
                )

        )

)

** getTag('param'):
Array
(
    [0] => Array
        (
            [0] => Array
                (
                    [0] => Array
                        (
                            [0] => bool
                            [1] => 
                        )

                    [1] => Array
                        (
                            [0] => string
                            [1] => 
                        )

                )

            [1] => foo
            [2] => sometimes a boolean, sometimes a string (or, could have just used "mixed")
        )

    [1] => Array
        (
            [0] => Array
                (
                    [0] => Array
                        (
                            [0] => bool
                            [1] => 
                        )

                    [1] => Array
                        (
                            [0] => int
                            [1] => 
                        )

                )

            [1] => bar
            [2] => sometimes a boolean, sometimes an int (again, could have just used "mixed")
        )

)

** getErrors:
Array
(
)

** getOtherTags:
Array
(
)

** getShortDesc:
This is the short description.
** getLongDesc:
This is the 1st line of the long description 
This is the 2nd line of the long description 
This is the 3rd line of the long description

ま昔日黯然 2024-10-19 08:33:27

2022 phpdoc-parser

PHPStan now has its own AST-based parser for doc blocks:

https://github.com/phpstan/phpdoc-parser

  • it is maintained for the long term
  • it supports node traversal
  • it can resolve fully qualified names (FQNs)
  • it has a format-preserving printer

Here is how you can modify it with a custom node visitor.

瞳孔里扚悲伤 2024-10-19 08:33:27

You could use DocBlox (http://github.com/mvriel/docblox) to generate an XML data structure for you. Install DocBlox using PEAR and then run the command:

docblox parse -d [FOLDER] -t [TARGET_LOCATION]

This will generate a file called structure.xml, which contains all the metadata about your source code, including the parsed docblocks.

OR

You can use the DocBlox_Reflection_DocBlock* classes to directly parse a piece of DocBlock text.

To do this, make sure you have autoloading enabled (or include all the DocBlox_Reflection_DocBlock* files) and execute the following:

$parsed = new DocBlox_Reflection_DocBlock($docblock);

Afterwards you can use the getters to extract the information that you want.

Note: you do not need to remove the asterisks; the Reflection class takes care of this.

゛时过境迁 2024-10-19 08:33:27

Check out

http://pecl.php.net/package/docblock

The docblock_tokenize() function will get you part-way there, I think.

注定孤独终老 2024-10-19 08:33:27

You can always view the source from phpDoc. The code is under LGPL so if you do decide to copy it you would need to license your software under the same license AND properly add the correct notices.

EDIT: Unless, as @Samuel Herzog noted, you use it as a library.

Thanks @Samuel Herzog for the clarification.

滥情哥ㄟ 2024-10-19 08:33:27

I suggest addendum. It's pretty cool, alive and well, and used in many PHP 5 frameworks.

http://code.google.com/p/addendum/

Check the tests for examples

http://code.google.com/p/addendum/source/browse/trunk#trunk%2Fannotations%2Ftests

捂风挽笑 2024-10-19 08:33:27

phpdoc-parser (https://github.com/phpstan/phpdoc-parser) is probably the most modern and flexible way to parse PHPDoc (as Tomas Votruba says). Unfortunately there is not much documentation, so here is a simple way to start:

<?php

use PHPStan\PhpDocParser\Lexer\Lexer;
use PHPStan\PhpDocParser\Parser\ConstExprParser;
use PHPStan\PhpDocParser\Parser\PhpDocParser;
use PHPStan\PhpDocParser\Parser\TokenIterator;
use PHPStan\PhpDocParser\Parser\TypeParser;

require 'vendor/autoload.php';

$comment = <<<'PHP'
/**
 * @param int $foo
 * @return string|false
 */
PHP;

$phpDocLexer = new Lexer();
$constantExpressionParser = new ConstExprParser();
$phpDocParser = new PhpDocParser(new TypeParser($constantExpressionParser), $constantExpressionParser);

$tokens = new TokenIterator($phpDocLexer->tokenize($comment));

$ast = $phpDocParser->parse($tokens);

print_r($ast);

// For example to get the type of the first param
// $ast->getTagsByName('@param')[0]->value->type->name; // int

揽清风入怀 2024-10-19 08:33:27

I suggest you take a look at http://code.google.com/p/php-annotations/

The code is simple enough to understand and modify if needed.

写下不归期 2024-10-19 08:33:27

As pointed out in one of the answers above, you can use phpDocumentor. If you use Composer, just add
"phpdocumentor/reflection-docblock": "~2.0"
to your "require" block.

See this for an example: https://github.com/abdulla16/decoupled-app/blob/master/composer.json

For usage examples, see:
https://github.com/abdulla16/decoupled-app/blob/master/Container/Container.php

梦途 2024-10-19 08:33:27

Updated version of user1419445's code. The DocBlockParser::parse() method has changed and now needs a second context parameter. It also seems to be slightly coupled with phpDocumentor, so for the sake of simplicity I assume you have Sami installed via Composer. The code below works for Sami v4.0.16.

<?php

require_once 'vendor/autoload.php';

class TestClass {
    /**
     * This is the short description.
     *  
     * This is the 1st line of the long description 
     * This is the 2nd line of the long description 
     * This is the 3rd line of the long description   
     *  
     * @param bool|string $foo sometimes a boolean, sometimes a string (or, could have just used "mixed")
     * @param bool|int $bar sometimes a boolean, sometimes an int (again, could have just used "mixed") 
     * @return string de-html_entitied string (no entities at all)
     */
    public function another_test($foo, $bar) {
        return strtr($foo,array_flip(get_html_translation_table(HTML_ENTITIES)));
    }
}

use Sami\Parser\DocBlockParser;
use Sami\Parser\Filter\PublicFilter;
use Sami\Parser\ParserContext;

try {
    $method = new ReflectionMethod('TestClass', 'another_test');
    $comment = $method->getDocComment();
    if ($comment !== FALSE) {
        $dbp = new DocBlockParser();
        $filter = new PublicFilter;
        $context = new ParserContext($filter, $dbp, NULL);
        $doc = $dbp->parse($comment, $context);
        echo "\n** getDesc:\n";
        print_r($doc->getDesc());
        echo "\n** getTags:\n";
        print_r($doc->getTags());
        echo "\n** getTag('param'):\n";
        print_r($doc->getTag('param'));
        echo "\n** getErrors:\n";
        print_r($doc->getErrors());
        echo "\n** getOtherTags:\n";
        print_r($doc->getOtherTags());
        echo "\n** getShortDesc:\n";
        print_r($doc->getShortDesc());
        echo "\n** getLongDesc:\n";
        print_r($doc->getLongDesc());
    }
} catch (Exception $e) {
    print_r($e);
}

?>

农村范ル 2024-10-19 08:33:27

Have a look at the Php Comment Manager package. It can parse the DocBlock comments of methods, and it uses the PHP Reflection API to fetch them.
