Using a regular expression to extract text between BBCode-style tags

Posted 01-07 09:01

I have text like this:

[CONTENT][SECTION]This is the section C #1[/SECTION][SECTION]This is the section C #2[/SECTION][SECTION]This is the section E #3[/SECTION]

and I am trying to match each section, including the section tags, with this expression:

\[SECTION\][^SECTION]+(SECTION\])

However, the expression above does not work, because [^SECTION] is a character class: in the text between the start and end tags it looks for any single character that is not S, E, C, T, I, O, or N.
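The behavior described above can be reproduced with a small sketch. The sample string and variable names are illustrative only, and `~` is used as the pattern delimiter so that `/` needs no escaping.

```php
<?php
// [^SECTION] is a negated character class: it matches ONE character that is
// not S, E, C, T, I, O or N. It does not mean "anything except the word SECTION".
$sample = '[SECTION]This is the section C #1[/SECTION]';

// The character right after the opening tag is "T", which the class excludes,
// so the + quantifier cannot match even once and the whole pattern fails.
$found = preg_match('~\[SECTION\][^SECTION]+~', $sample, $m);
var_dump($found); // int(0)

// On its own, the class matches a run of characters and stops at the capital "C".
preg_match('~[^SECTION]+~', 'his is the section C #1', $run);
var_dump($run[0]); // string(19) "his is the section "
```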

Any ideas on how to solve this?

I am using PHP to match the tags and their contents with preg_match_all(), and I would like to match each section one by one, not all the sections at once.

橘虞初梦 2025-01-14 09:01:11

\[SECTION\](.*?)\[/SECTION\]

I think this is what you want: getting the text content of a single SECTION?

The ? makes the * lazy, so it matches only up to the first [/SECTION] that follows the current [SECTION].
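To see what the lazy quantifier changes, here is a minimal sketch that runs the same pattern with and without the `?`. The shortened input and the `~` delimiter are choices made for this illustration, not part of the answer.

```php
<?php
$input = '[SECTION]one[/SECTION][SECTION]two[/SECTION]';

// Greedy: .* runs to the LAST [/SECTION], so both sections come back as one match.
preg_match_all('~\[SECTION\](.*)\[/SECTION\]~', $input, $greedy);

// Lazy: .*? stops at the FIRST [/SECTION] after each opening tag.
preg_match_all('~\[SECTION\](.*?)\[/SECTION\]~', $input, $lazy);

print_r($greedy[1]); // one element: "one[/SECTION][SECTION]two"
print_r($lazy[1]);   // two elements: "one" and "two"
```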

Example:

$input = "[CONTENT][SECTION]This is the section C #1[/SECTION][SECTION]This is the section C #2[/SECTION][SECTION]This is the section E #3[/SECTION]";
// The outer parentheses act as the pattern delimiters; (.*?) is the only capture group.
var_dump(preg_match_all("(\[SECTION\](.*?)\[/SECTION\])", $input, $m), $m);

Result:

int(3)
array(2) {
    [0]=>array(3) {
        [0]=>string(43) "[SECTION]This is the section C #1[/SECTION]"
        [1]=>string(43) "[SECTION]This is the section C #2[/SECTION]"
        [2]=>string(43) "[SECTION]This is the section E #3[/SECTION]"
    }
    [1]=>array(3) {
        [0]=> string(24) "This is the section C #1"
        [1]=> string(24) "This is the section C #2"
        [2]=> string(24) "This is the section E #3"
    }
}
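Since the question asks to handle each section one by one, one possible variation is to pass the PREG_SET_ORDER flag, which groups the results per match instead of per capture group. The sketch below also adds the `s` modifier, on the assumption (not stated in the question) that a section may span several lines, because `.` does not match a newline by default. The `~` delimiter and the `$contents` array are illustrative choices.

```php
<?php
$input = '[CONTENT][SECTION]This is the section C #1[/SECTION]'
       . '[SECTION]This is the section C #2[/SECTION]'
       . '[SECTION]This is the section E #3[/SECTION]';

// PREG_SET_ORDER returns one array per match:
// [0] is the section including its tags, [1] is the text between the tags.
// The "s" modifier is an addition here; it lets "." match newlines as well.
preg_match_all('~\[SECTION\](.*?)\[/SECTION\]~s', $input, $matches, PREG_SET_ORDER);

$contents = [];
foreach ($matches as $match) {
    $contents[] = $match[1];   // inner text only
    echo $match[0], PHP_EOL;   // full section, tags included
}
```

Each iteration of the loop then sees exactly one section, which may be easier to work with than indexing into the two parallel arrays of the default result layout.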
