Extracting the text between BBCode-style tags with a regular expression

Posted on 01-07 09:01

I have text like this:

[CONTENT][SECTION]This is the section C #1[/SECTION][SECTION]This is the section C #2[/SECTION][SECTION]This is the section E #3[/SECTION]

and I am trying to match each section, including its tags, with this expression:

\[SECTION\][^SECTION]+(SECTION\])

but this does not work, because [^SECTION] is a negated character class: between the start and end tags it matches any single character that is not S, E, C, T, I, O or N, rather than "any text other than the word SECTION".

Any idea how to solve this?

I am using PHP's preg_match_all() to match the tags and their contents, and I would like to match each section one by one rather than all the sections at once.


Comments (2)

橘虞初梦 2025-01-14 09:01:11
\[SECTION\](.*?)\[/SECTION\]

I think this is what you want: it captures the text inside a single SECTION.

The ? after the * makes the quantifier lazy, so .*? stops at the first [/SECTION] that follows the current [SECTION] instead of running on to the last one in the string.
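
To see what the lazy quantifier changes, here is a small comparison, offered as an illustrative sketch rather than part of the original answer. The shortened sample string and the variable names $greedy and $lazy are made up for the demonstration.

```php
<?php
// Hypothetical two-section sample, shortened from the question's input.
$input = '[SECTION]one[/SECTION][SECTION]two[/SECTION]';

// Greedy: .* first runs to the end of the string, then backtracks
// only as far as the LAST [/SECTION].
preg_match_all('~\[SECTION\](.*)\[/SECTION\]~', $input, $greedy);

// Lazy: .*? grows one character at a time and stops at the FIRST [/SECTION].
preg_match_all('~\[SECTION\](.*?)\[/SECTION\]~', $input, $lazy);

print_r($greedy[1]); // one element: "one[/SECTION][SECTION]two"
print_r($lazy[1]);   // two elements: "one" and "two"
```

The greedy version swallows both sections as a single match, which is the "all the sections at once" behaviour the question wants to avoid.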


Example:

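$input = "[CONTENT][SECTION]This is the section C #1[/SECTION][SECTION]This is the section C #2[/SECTION][SECTION]This is the section E #3[/SECTION]";
// The outer ( ) act as the pattern delimiters; the inner (.*?) is capture group 1.
// $m[0] receives the full matches (tags included), $m[1] the text between the tags.
var_dump(preg_match_all('(\[SECTION\](.*?)\[/SECTION\])', $input, $m), $m);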

Result:

int(3)
array(2) {
    [0]=>array(3) {
        [0]=>string(43) "[SECTION]This is the section C #1[/SECTION]"
        [1]=>string(43) "[SECTION]This is the section C #2[/SECTION]"
        [2]=>string(43) "[SECTION]This is the section E #3[/SECTION]"
    }
    [1]=>array(3) {
        [0]=> string(24) "This is the section C #1"
        [1]=> string(24) "This is the section C #2"
        [2]=> string(24) "This is the section E #3"
    }
} 
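
If the goal is to handle the sections one at a time, one option (a sketch, not something the answer above prescribes) is to pass the PREG_SET_ORDER flag, so that each element of the result array describes a single section. The shortened input and the names $matches and $contents are illustrative assumptions.

```php
<?php
$input = '[CONTENT][SECTION]This is the section C #1[/SECTION][SECTION]This is the section C #2[/SECTION]';

// With PREG_SET_ORDER, $matches[$i][0] is the i-th full match
// and $matches[$i][1] is its first capture group.
preg_match_all('~\[SECTION\](.*?)\[/SECTION\]~', $input, $matches, PREG_SET_ORDER);

$contents = [];
foreach ($matches as $i => $match) {
    // $match[0] includes the tags; $match[1] is only the text between them.
    $contents[] = $match[1];
    echo ($i + 1), ': ', $match[1], PHP_EOL;
}
```

If a section can contain line breaks, adding the s modifier (for example ~...~s) lets . match newlines as well.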

贪恋 2025-01-14 09:01:11

Try this:

\[SECTION\].+?\[\/SECTION\]
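
For completeness, the idea behind the question's [^SECTION] attempt ("any character, as long as the closing tag does not start here") can be written with a negative lookahead instead of a character class. This is an alternative sketch under the assumption that sections are not nested, not a correction of the lazy patterns, which are usually simpler. Note that the \/ in the pattern above is only required when / is the delimiter; the sketch uses ~ as the delimiter instead, and the shortened input is illustrative.

```php
<?php
$input = '[CONTENT][SECTION]This is the section C #1[/SECTION][SECTION]This is the section E #3[/SECTION]';

// (?!\[/SECTION\]) asserts that the closing tag does not begin at the
// current position; the following . then consumes exactly one character.
// Repeating that pair walks forward until just before the closing tag.
$pattern = '~\[SECTION\]((?:(?!\[/SECTION\]).)+)\[/SECTION\]~';

preg_match_all($pattern, $input, $m);

print_r($m[1]); // the text of each section, without the tags
```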
