Recursive BBCode parsing

Posted on 2024-11-25 12:26:16

I'm trying to parse BBCode in my script. It works seamlessly until I try to nest BBCode that's more than just bold or underline (such as spoiler, url, font size, etc.); then it breaks. Here's my code:

function parse_bbcode($text) {
    global $db;
    $oldtext = $text;
    $bbcodes = $db->select('*', 'bbcodes');
    foreach ($bbcodes as $bbcode) {
        switch ($bbcode->type) {
            case 'simple': {
                $find = '{content}';
                $replace = '${1}';
                $text = preg_replace(
                    '/\['.$bbcode->tag.'\](.+)\[\/'.$bbcode->tag.'\]/i',
                    str_replace($find, $replace, $bbcode->html),
                    $text);
                break;
            }
            case 'property':
            case 'options': {
                $find = array ( '{property}', '{content}' );
                $replace = array ( '${1}', '${2}' );
                $text = preg_replace(
                    '/\['.$bbcode->tag.'\=(.[^\"]*)\](.+)\[\/'.$bbcode->tag.'\]/i',
                    str_replace($find, $replace, $bbcode->html),
                    $text);
                break;
            }
        }
    }
    return $text;
}

My guess is that the regex doesn't cope with the nesting in the input. How can I improve it? Two sample $bbcode objects look like this:

stdClass::__set_state(array(
   'id' => '2',
   'name' => 'Italic',
   'type' => 'simple',
   'tag' => 'i',
   'button_image' => NULL,
   'button_text' => '<i>I</i>',
   'options' => '',
   'prompt' => NULL,
   'html' => '<i>{content}</i>',
   'order' => '1',
))
stdClass::__set_state(array(
   'id' => '3',
   'name' => 'URL',
   'type' => 'property',
   'tag' => 'url',
   'button_image' => NULL,
   'button_text' => 'http://',
   'options' => '',
   'prompt' => 'URL address',
   'html' => '<a href="{property}">{content}</a>',
   'order' => '4',
))

Comments (2)

压抑⊿情绪 2024-12-02 12:26:16

As Gordon said in the comments, PHP has a BBCode parser, so there is no reason to reinvent the wheel.

The native parser is a PECL package though, so you will have to install it. If that's not an option (for instance due to shared hosting), there is also a PEAR package: http://pear.php.net/package/HTML_BBCodeParser

In addition to those, you can look at the source code of forums that use BBCode, and either use their parser or improve it. There are also several PHP implementations listed at http://www.bbcode.org/implementations.php

熊抱啵儿 2024-12-02 12:26:16

Correctly parsing BBCode using regex is non-trivial. The codes may be nested. CODE tags may contain BBCode which must be ignored by the parser. Certain tags may not appear inside other tags, and so on. However, it can be done. I recently overhauled the BBCode parser for the FluxBB open source forum software. You may want to check it out in action:

New 2011 FluxBB Parser

Note that this new parser has not yet been incorporated into the FluxBB codebase.
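
For readers who want to see the nesting problem in miniature, here is a minimal sketch of one common regex approach: match only the innermost pair of each tag, then repeat until nothing changes. It is not the FluxBB parser and not the asker's final code. The function name parse_bbcode_nested, the $maxPasses limit, and the inline $bbcodes array (standing in for the rows from $db->select) are all assumptions made for illustration. It does not handle CODE tags or tag-nesting rules, and it does not escape HTML, so real input would need something like htmlspecialchars() applied first.

```php
<?php
// Hypothetical stand-in for the rows returned by $db->select('*', 'bbcodes').
$bbcodes = [
    (object) ['type' => 'simple',   'tag' => 'i',   'html' => '<i>{content}</i>'],
    (object) ['type' => 'simple',   'tag' => 'b',   'html' => '<b>{content}</b>'],
    (object) ['type' => 'property', 'tag' => 'url', 'html' => '<a href="{property}">{content}</a>'],
];

// Replaces BBCode innermost-first, repeating until the text stops changing
// (or $maxPasses is reached, as a guard against pathological input).
function parse_bbcode_nested(string $text, array $bbcodes, int $maxPasses = 20): string
{
    for ($pass = 0; $pass < $maxPasses; $pass++) {
        $before = $text;
        foreach ($bbcodes as $bbcode) {
            $tag = preg_quote($bbcode->tag, '/');
            // Content must not contain an opening or closing tag of the same
            // name, so only the innermost pair can match in a given pass.
            $content = '((?:(?!\[\/?' . $tag . '[\]=]).)*)';
            if ($bbcode->type === 'simple') {
                $pattern = '/\[' . $tag . '\]' . $content . '\[\/' . $tag . '\]/is';
                $text = preg_replace_callback($pattern, function ($m) use ($bbcode) {
                    return strtr($bbcode->html, ['{content}' => $m[1]]);
                }, $text);
            } else { // 'property' and 'options'
                $pattern = '/\[' . $tag . '=([^\]"]+)\]' . $content . '\[\/' . $tag . '\]/is';
                $text = preg_replace_callback($pattern, function ($m) use ($bbcode) {
                    return strtr($bbcode->html, [
                        '{property}' => $m[1],
                        '{content}'  => $m[2],
                    ]);
                }, $text);
            }
        }
        if ($text === $before) {
            break; // nothing left to replace
        }
    }
    return $text;
}

echo parse_bbcode_nested('[b]bold [i]both[/i][/b]', $bbcodes), "\n";
// prints: <b>bold <i>both</i></b>
```

Using preg_replace_callback with strtr instead of a '${1}' replacement string avoids having "$" or backslashes in the template reinterpreted as backreferences. The negative lookahead is what replaces the greedy (.+) from the question: a greedy match spans from the first opening tag to the last closing tag, which is where nested or repeated tags go wrong.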
