I have a page where users write their own posts, and they can mark up their posts with tags like:
{strong}{/strong} or {italic}{/italic} or {title}{/title} etc.
The difficulty for me is keeping the tags from overlapping one another.
I mean, this is OK:
{strong}{/strong} {italic}{/italic}
but I need to avoid all the possible cases like these:
{strong}{italic}{/strong}{/italic}
{italic}{strong}{/italic}{italic}{/strong}{/italic}
{italic}{/strong}{/italic}{/strong}{/italic}
and so on... there are really too many cases to write a separate check for each one, I think :P
The logic should be to always keep the tags separated and to remove any unneeded or nested tags... I hope the question is clear :P
The general solution to your problem is to develop a recursive descent parser or a stack-based parser, but that is probably complete overkill for your situation.
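For a sense of what the stack-based approach involves, here is a minimal sketch in Python (the same logic should port to PHP). It assumes the tags are exactly `{strong}`, `{italic}` and `{title}`, and that "keeping them separated" means no tag may open while another is still open. The names `normalize_tags`, `ALLOWED` and `max_depth` are made up for this example.

```python
import re

# Matches {name} or {/name}; the lowercase-only name pattern is an assumption.
TAG_RE = re.compile(r"\{(/?)([a-z]+)\}")
ALLOWED = {"strong", "italic", "title"}  # assumed tag set


def normalize_tags(text, max_depth=1):
    """Drop pseudo-tags that would overlap or nest deeper than max_depth."""
    out = []    # pieces of the cleaned output
    stack = []  # names of the tags currently open
    pos = 0
    for m in TAG_RE.finditer(text):
        closing, name = m.group(1) == "/", m.group(2)
        if name not in ALLOWED:
            continue  # unknown tag: left in place as plain text
        out.append(text[pos:m.start()])
        pos = m.end()
        if not closing:
            if len(stack) < max_depth:
                stack.append(name)
                out.append(m.group(0))
            # else: an opening tag inside another tag, dropped
        elif stack and stack[-1] == name:
            stack.pop()
            out.append(m.group(0))
        # else: a stray or mismatched closing tag, dropped
    out.append(text[pos:])
    # Close anything the user left open, innermost first.
    while stack:
        out.append("{/" + stack.pop() + "}")
    return "".join(out)


print(normalize_tags("{strong}{italic}{/strong}{/italic}"))
# prints {strong}{/strong}
```

With `max_depth=1` the stack never holds more than one name, which matches the "always separated" rule from the question; raising `max_depth` would allow properly nested tags while still dropping overlapping ones.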
Matching start tags with end tags is in general very similar to the language of balanced parentheses (for instance: (), (()), (()()) are balanced correctly; )(, ()(, )(()() are not). The language of balanced parentheses is not a regular language, and therefore cannot be 'parsed' using regular expressions (unless you can limit the depth to which they can be nested, see here).
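To make the parentheses analogy concrete, here is a small Python sketch of a balance check (the function name `is_balanced` is made up for this example). The counter can grow without bound, and that unbounded memory is exactly what a plain regular expression does not have.

```python
def is_balanced(s):
    """Return True if every ')' closes an earlier, still-open '('."""
    depth = 0  # number of '(' currently open
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # a ')' with nothing to close
    return depth == 0  # False if something was left open


for sample in ["()", "(())", "(()())", ")(", "()(", ")(()()"]:
    print(sample, is_balanced(sample))
# the first three print True, the last three print False
```

With a single bracket type a counter is enough; with several tag names, as in the question, the counter has to become a stack so that each closing tag can be compared with the most recently opened one.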
My suggestion to you would be to convert all of your pseudo tags into actual HTML tags, and then parse the result with PHP's Document Object Model (DOM) extension.
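The conversion step on its own could look roughly like this. It is sketched in Python rather than PHP, and it does not cover the DOM parsing step. The mapping from pseudo-tag names to HTML elements (`HTML_NAMES`) is an assumption, as is the function name `pseudo_to_html`. The user's text is escaped first, so the only real HTML tags in the output are the ones produced by the conversion.

```python
import html
import re

# Hypothetical mapping from pseudo-tag names to HTML element names.
HTML_NAMES = {"strong": "strong", "italic": "em", "title": "h2"}
PSEUDO_TAG_RE = re.compile(r"\{(/?)(strong|italic|title)\}")


def pseudo_to_html(text):
    """Escape user text, then turn {tag} and {/tag} into <tag> and </tag>."""
    escaped = html.escape(text)  # <, >, & and quotes become entities

    def to_html(m):
        # m.group(1) is "/" for a closing tag and "" for an opening tag
        return "<" + m.group(1) + HTML_NAMES[m.group(2)] + ">"

    return PSEUDO_TAG_RE.sub(to_html, escaped)


print(pseudo_to_html("{strong}a < b{/strong}"))
# prints <strong>a &lt; b</strong>
```

The output can still contain overlapping tags; this step only changes the syntax, so any repair is left to whichever HTML parser is applied afterwards.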