Simple BB parser in PHP that lets you replace content outside of tags

Posted on 2024-10-29 02:57:59


I'm trying to parse strings that represent source code, something like this:

[code lang="html"]
  <div>stuff</div>
[/code]
<div>stuff</div>

As you can see from my previous 20 questions, I tried to do it with PHP's regex functions, but ran into many problems, especially when the string is very big...

Do you guys know of a BB parser class written in PHP that I can use for this instead of regexes?

What I need it to do is:

be able to convert all content inside [code] tags to HTML entities
be able to run some kind of filter (a callback function of mine) only on content outside the [code] tags
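
To make the two requirements concrete, here is a minimal regex-free sketch built on plain string scanning with stripos/strpos. The function name transformOutsideCode and its parameters are hypothetical (not from any existing parser library), and it assumes [code] blocks are not nested.

```php
<?php
// Hypothetical helper, not part of any library: walks the string with
// stripos()/strpos() instead of regexes. Assumes [code] blocks do not nest.
// Note: the prefix test "[code" would also match a tag such as [codex].
function transformOutsideCode(string $text, callable $filter): string
{
    $out = '';
    $pos = 0;
    $len = strlen($text);
    $closeTagLen = strlen('[/code]');

    while ($pos < $len) {
        $open = stripos($text, '[code', $pos);
        // Position of the "]" that ends the opening tag, e.g. [code lang="html"].
        $openEnd = $open === false ? false : strpos($text, ']', $open);
        $close = $openEnd === false ? false : stripos($text, '[/code]', $openEnd);

        if ($close === false) {
            // No complete block remains: filter the rest and stop.
            $out .= $filter(substr($text, $pos));
            break;
        }

        // Text before the block goes through the caller's filter.
        $out .= $filter(substr($text, $pos, $open - $pos));
        // The opening tag is copied unchanged.
        $out .= substr($text, $open, $openEnd - $open + 1);
        // The block body is entity-encoded and never filtered.
        $body = substr($text, $openEnd + 1, $close - $openEnd - 1);
        $out .= htmlspecialchars($body, ENT_QUOTES, 'UTF-8');
        // The closing tag is copied unchanged.
        $out .= substr($text, $close, $closeTagLen);

        $pos = $close + $closeTagLen;
    }

    return $out;
}
```

Calling transformOutsideCode($content, 'strip_tags') would strip tags everywhere except inside the [code] blocks, whose bodies are entity-encoded instead.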

Thank you.

edit:
I ended up using this:

convert all <pre> and <code> tags to [pre] and [code]:

$content = str_replace(array('<pre>', '</pre>', '<code>', '</code>'), array('[pre]', '[/pre]', '[code]', '[/code]'), $content);

get the contents between [code]...[/code] and [pre]...[/pre] and do the HTML entity conversion:
```
// The callback returns the replacement text, so the result must be assigned back.
$content = preg_replace_callback('/(.?)\[(pre|code)\b(.*?)(?:(\/))?\](?:(.+?)\[\/\2\])?(.?)/s', 'self::specialchars', $content);
```
(I borrowed this pattern from the WordPress shortcode functions :)
store the entity-converted content in a temporary array variable, and replace the block in $content with a unique ID
I can now safely run my filter on $content, because there's no code in it, just the IDs (this filter does a strip_tags on the entire text and converts stuff like http://blabla.com to links)
replace the unique IDs in $content with the converted code blocks from the array variable

Do you think that's OK?
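
For reference, the placeholder steps from the edit can be sketched roughly as follows. This is only an illustration under assumptions: the function name filterOutsideCode, the placeholder format, and the simplified pattern (not the WordPress one quoted in the post) are all hypothetical, and blocks are assumed not to nest.

```php
<?php
// Hypothetical sketch of the placeholder approach; names and the ID format
// are made up for illustration. Assumes [pre]/[code] blocks do not nest.
function filterOutsideCode(string $content, callable $filter): string
{
    $blocks = [];
    // Random token so the IDs are unlikely to collide with real text.
    $nonce = bin2hex(random_bytes(8));

    // Entity-encode each block body, stash it, and leave an ID in its place.
    // Simplified pattern: tag name, optional attributes, lazy body, closing tag.
    $replaced = preg_replace_callback(
        '/\[(pre|code)\b([^\]]*)\](.*?)\[\/\1\]/s',
        function (array $m) use (&$blocks, $nonce): string {
            $id = "CODEBLOCK{$nonce}N" . count($blocks) . 'E';
            $blocks[$id] = "[{$m[1]}{$m[2]}]"
                . htmlspecialchars($m[3], ENT_QUOTES, 'UTF-8')
                . "[/{$m[1]}]";
            return $id;
        },
        $content
    );

    // preg_replace_callback() returns null on PCRE errors, for example when
    // the backtrack limit is hit on a very large string.
    if ($replaced === null) {
        throw new RuntimeException('PCRE error code ' . preg_last_error());
    }

    // The filter only ever sees text and IDs, never the code itself.
    $filtered = $filter($replaced);

    // Put the converted blocks back in place of their IDs.
    return $blocks ? strtr($filtered, $blocks) : $filtered;
}
```

A call such as filterOutsideCode($content, 'strip_tags') would run strip_tags only on the text outside the blocks. The IDs are purely alphanumeric, so a filter like strip_tags or a URL linkifier should leave them intact; a filter that rewrites arbitrary words could still break them.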