RegEx to strip BBCode tags from a string
I'm working on a feature that uses the jQuery MarkItUp! editor as a BBCode editor. I'm only allowing a small subset of BBCode tags, including the following:
[b]
[i]
[quote]
[quote=Mr Incredible]
[img]
[url]
[youtube]
I have a 1,500-character "Description" field that uses the editor, but I'm also planning to store a 150-character digest of the description with all of the BBCode stripped out.
I'm currently using a simple RegEx to do this in C#. It basically nukes embedded BBCodes in a string, but it leaves behind a lot of "noisy content" like the [img] URL or the [youtube] video ID that I'd also like to remove from the digest.
Here's my current RegEx:
public static String StripBBCode(string bbCode)
{
    string r = Regex.Replace(bbCode,
        @"\[(.*?)\]",
        String.Empty, RegexOptions.IgnoreCase);

    // Finally, replace all newlines with a space
    r = Regex.Replace(r,
        @"(\r\n|\n\r|\r|\n)+",
        @" ", RegexOptions.IgnoreCase);

    return r;
}
If I run the following string through this function, I get the result shown below:
Source:
This is [b]bold[/b]. This is [i]italic[/i].
Here is an image:
[img]http://www.phatmac.com/Pics/Movies/Incredibles.jpg[/img]
Here is a link to [url=http://espn.go.com]ESPN[/url].
Here is a YouTube video:
[youtube]WJ0UkZ3W4FA[/youtube]
Result:
This is bold. This is italic. Here is an image: http://www.phatmac.com/Pics/Movies/Incredibles.jpg Here is a link to ESPN. Here is a YouTube video: WJ0UkZ3W4FA
Here's what I want to get back:
This is bold. This is italic. Here is an image: Here is a link to ESPN. Here is a YouTube video:
How can I modify my StripBBCode() function to achieve this?
EDITED
The suggestion from David below in the first answer was correct.
Here's what I'm using now:
string r = Regex.Replace(s,
    @"\[youtube\].*\[\/youtube\]",
    String.Empty, RegexOptions.IgnoreCase);
r = Regex.Replace(r,
    @"\[img\].*\[\/img\]",
    String.Empty, RegexOptions.IgnoreCase);
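One caveat with the two patterns above: `.*` is greedy, so when a line contains more than one [img] (or [youtube]) tag, the match runs from the first opening tag to the last closing tag and swallows the text in between. The sketch below only illustrates that difference. The class and method names (`GreedyVsLazyDemo`, `StripGreedy`, `StripLazy`) are made up for this example and are not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, used only to contrast the two quantifiers.
public static class GreedyVsLazyDemo
{
    // Greedy: ".*" extends to the LAST "[/img]" on the line.
    public static string StripGreedy(string s) =>
        Regex.Replace(s, @"\[img\].*\[/img\]", string.Empty, RegexOptions.IgnoreCase);

    // Lazy: ".*?" stops at the FIRST "[/img]" after each "[img]".
    public static string StripLazy(string s) =>
        Regex.Replace(s, @"\[img\].*?\[/img\]", string.Empty, RegexOptions.IgnoreCase);
}
```

For the input `A[img]one.jpg[/img]B[img]two.jpg[/img]C`, the greedy version returns `AC` (the `B` between the two images is lost), while the lazy version returns `ABC`.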
Comments (1)
You've got several tags whose content you want removed, and the rest where you only want the tags themselves removed.
Replace
[img].*[/img]
with string.Empty, do the same for
[youtube].*[/youtube]
and for whatever else needs its contents removed, then do your removal of
[.*]
Edit:
I'm not a regex expert either, but I think
@"\[img\].*?\[/img\]"
is what you want. I don't think you need the parentheses in
@"\[(.*?)\]"
I think in this context parentheses mean "save the matched text" so you can match it again with \1.
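Putting the steps from this answer together, here is a minimal sketch of how StripBBCode() might look. It is one possible arrangement, not the asker's final code. It assumes that [img] and [youtube] are the only tags whose content should be dropped. It also makes use of the capture group mentioned above: `(img|youtube)` saves the tag name and `\1` requires the same name in the closing tag. The class name `BBCodeDigest`, the `Singleline` option, and the final `Trim()` are additions for this example.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper class; only StripBBCode comes from the question.
public static class BBCodeDigest
{
    public static string StripBBCode(string bbCode)
    {
        // 1. Remove [img] and [youtube] tags together with their content.
        //    ".*?" is lazy, so each match ends at the nearest closing tag.
        //    Singleline (an assumption here) lets "." also match newlines.
        string r = Regex.Replace(bbCode,
            @"\[(img|youtube)\].*?\[/\1\]",
            string.Empty,
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        // 2. Remove every remaining tag but keep the text between tags.
        r = Regex.Replace(r, @"\[.*?\]", string.Empty);

        // 3. Collapse each run of newline characters into a single space.
        r = Regex.Replace(r, @"[\r\n]+", " ");

        // Drop leading/trailing whitespace left behind by removed tags.
        return r.Trim();
    }
}
```

Run against the sample text from the question, this returns `This is bold. This is italic. Here is an image: Here is a link to ESPN. Here is a YouTube video:`. The order matters: the content-removing pass has to run before the generic `\[.*?\]` pass, otherwise the [img] and [youtube] tags are already gone and their content can no longer be located.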