Get all substrings inside potentially nested curly braces

Posted on 2024-11-04 04:41:52

I'm trying to parse the following format with PHP:

// This is a comment
{
this is an entry
}
{
this is another entry
}
{
entry
{entry within entry}
{entry within entry}
}

Maybe it's just the lack of caffeine, but I can't think of a decent way of getting the contents of the curly braces.

Comments (2)

难如初 2024-11-11 04:41:52

This is quite a common parsing task. Basically, you need to keep track of the various states you can be in, and use a combination of constants and function calls to maintain them.

Here is some rather inelegant code that does just that:

<?php

$input = file_get_contents('input.txt');

define('STATE_CDATA', 0);
define('STATE_COMMENT', 1);

function parseBrace($input, &$i)
{
    $parsed = array(
        'cdata' => '',
        'children' => array()
    );
    $length = strlen($input);
    $state = STATE_CDATA;
    for(++$i; $i < $length; ++$i) {
        switch($input[$i]) {
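            case '/':
                // "//" starts a comment that runs to the end of the line.
                // isset() avoids reading past the end of the input.
                if (isset($input[$i + 1]) && '/' === $input[$i + 1]) {
                    $state = STATE_COMMENT;
                    ++$i;
                } elseif (STATE_CDATA === $state) {
                    $parsed['cdata'] .= $input[$i];
                }
                break;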
            case '{':
                if (STATE_CDATA === $state) {
                    $parsed['children'][] = parseBrace($input, $i);
                }
                break;
            case '}':
                if (STATE_CDATA === $state) {
                    break 2; // for
                }
                break;
            case "\n":
                if (STATE_CDATA === $state) {
                    $parsed['cdata'] .= $input[$i];
                }
                $state = STATE_CDATA;
                break;
            default:
                if (STATE_CDATA === $state) {
                    $parsed['cdata'] .= $input[$i];
                }
        }
    }
    return $parsed;
}

function parseInput($input)
{
    $parsed = array(
        'cdata' => '',
        'children' => array()
    );
    $state = STATE_CDATA;
    $length = strlen($input);
    for($i = 0; $i < $length; ++$i) {
        switch($input[$i]) {
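            case '/':
                // "//" starts a comment that runs to the end of the line.
                // isset() avoids reading past the end of the input.
                if (isset($input[$i + 1]) && '/' === $input[$i + 1]) {
                    $state = STATE_COMMENT;
                    ++$i;
                } elseif (STATE_CDATA === $state) {
                    $parsed['cdata'] .= $input[$i];
                }
                break;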
            case '{':
                if (STATE_CDATA === $state) {
                    $parsed['children'][] = parseBrace($input, $i);
                }
                break;
            case "\n":
                if (STATE_CDATA === $state) {
                    $parsed['cdata'] .= $input[$i];
                }
                $state = STATE_CDATA;
                break;
            default:
                if (STATE_CDATA === $state) {
                    $parsed['cdata'] .= $input[$i];
                }
        }
    }
    return $parsed;
}

print_r(parseInput($input));

This produces the following output:

Array
(
    [cdata] =>




    [children] => Array
    (
        [0] => Array
        (
            [cdata] =>
this is an entry

            [children] => Array
            (
            )

        )

        [1] => Array
        (
            [cdata] =>
this is another entry

            [children] => Array
            (
            )   

        )

        [2] => Array
        (
            [cdata] => 
entry



            [children] => Array
            (
                [0] => Array
                (
                    [cdata] => entry within entry
                    [children] => Array
                    (
                    )


                )

                [1] => Array
                (
                    [cdata] => entry within entry
                    [children] => Array
                    (
                    )

                )

            )

        )

    )

)

You'll probably want to clean up all the whitespace, but a few well-placed trim() calls will sort that out for you.
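
The whitespace cleanup mentioned above could look roughly like the sketch below. It assumes the array shape printed in the output above (a 'cdata' string and a 'children' list per node); the helper name trimParsed and the sample $tree are made up for illustration and are not part of the original answer.

```php
<?php

// Hypothetical helper (not from the answer above): returns a copy of a
// parsed node with whitespace trimmed from every 'cdata' string.
function trimParsed(array $node): array
{
    $node['cdata'] = trim($node['cdata']);
    // Recurse so that nested entries are trimmed as well.
    $node['children'] = array_map('trimParsed', $node['children']);
    return $node;
}

// Sample node shaped like the parser's output (assumed data).
$tree = array(
    'cdata' => "\n\n",
    'children' => array(
        array(
            'cdata' => "\nentry\n\n\n",
            'children' => array(
                array('cdata' => 'entry within entry', 'children' => array()),
            ),
        ),
    ),
);

print_r(trimParsed($tree));
```

Trimming after parsing keeps the parser itself unchanged; you could also call trim() on 'cdata' just before each return $parsed if you never need the raw text.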

勿忘初心 2024-11-11 04:41:52

This may not be the best solution for a large amount of content, but it works.

<?php
$text = "I am out of the brackets {hi i am in the brackets} Back out { Back in}";
print $text . '<hr />';

// Split on "{" so that each later piece starts just inside a brace.
$tmp = explode("{", $text);
$wantedText = array();
for ($i = 0; $i < count($tmp); $i++) {
    // For each piece that contains "}", keep the text before the first "}".
    if (strpos($tmp[$i], "}") !== false) {
        $tmp2 = explode("}", $tmp[$i]);
        array_push($wantedText, $tmp2[0]);
    }
}
print_r($wantedText);
?>

Results:

Array ( [0] => hi i am in the brackets [1] => Back in )
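
Note that splitting on "{" and "}" treats every brace independently, so it does not pair up nested braces like the ones in the question. As a hedged alternative, here is a minimal sketch that uses a stack of open-brace offsets instead; the function name extractBraced is hypothetical, and it ignores the // comment lines from the question.

```php
<?php

// Hypothetical helper: returns the content of every matched brace pair in
// $text, including pairs nested inside other pairs, in the order they close.
function extractBraced(string $text): array
{
    $starts = array();  // stack of offsets just after each unmatched "{"
    $found = array();
    $length = strlen($text);
    for ($i = 0; $i < $length; $i++) {
        if ($text[$i] === '{') {
            $starts[] = $i + 1;
        } elseif ($text[$i] === '}' && $starts) {
            // Pair this "}" with the most recent unmatched "{".
            $start = array_pop($starts);
            $found[] = substr($text, $start, $i - $start);
        }
    }
    return $found;
}

print_r(extractBraced("out {a {b} c} out {d}"));
```

For this input the function returns b, a {b} c and d, in that order: inner pairs close first, so they appear before the pair that contains them. A stray "}" with no matching "{" is skipped.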