Regex in PHP to remove citations from wiki text
From the given sample text, I want to keep everything except the parts contained in [[ ]] and {{ }}.
Sample text:
On 11 December 1988, aged just 15 years and 232 days, Tendulkar scored 100 not out in his debut [[first-class cricket|first-class]] match for [[Mumbai cricket team|Bombay]] against [[Gujarat cricket team|Gujarat]], making him the youngest Indian to score a century on first-class debut. He followed this by scoring a century in his first Deodhar and Duleep Trophy.
{{cite web|url=http://www.espnstar.com/cricket/international-cricket/news/detail/item136972/Sachin-Tendulkar-factfile/|title=Sachin Tendulkar factfile |publisher=www.espnstar.com|accessdate=3 August 2009}} He was picked by the Mumbai captain [[Dilip Vengsarkar]] after seeing him negotiate [[Kapil Dev]] in the nets, and finished the season as Bombay's highest run-scorer.He scored 583 runs at an average of 67.77, and was the sixth highest run-scorer overall{{cite web|url=http://blogs.cricinfo.com/link_to_database/ARCHIVE/1980S/1988-89/IND_LOCAL/RANJI/STATS/IND_LOCAL_RJI_AVS_BAT_MOST_RUNS.html|title=1988–89 Ranji season – Most Runs|publisher=Cricinfo|accessdate=3 August 2009}} He also made an unbeaten century in the [[Irani Trophy]] final,{{cite web|url=http://cricketarchive.com/Archive/Scorecards/52/52008.html|title=Rest of India v Delhi in 1989/90
|publisher=Cricketarchive|accessdate=3 August 2009}} and was selected for the tour of Pakistan next year, after just one first class season.
I tried this:
$patterns = ("/^{{*/", "/*}}$/" );$replacements = "";
preg_replace($patterns, $replacements, $parts);
print_r($parts);
and this:
$parts = preg_replace("/\[(?:\\\\|\\\]|[^\]])*\]/", "", $ans_str);
and this too:
$pattern = ("/\[.*?\]/", "/\{.*?\}/");
$ans = preg_replace($pattern, "", $parts);
It does not work.
Please help, thanks.
This should do the trick
The U modifier enables ungreedy mode, which stops each match as soon as possible (so that all the citations are not caught as one giant match).
EDIT: added the s modifier, which lets the dot match newlines; see comments.
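A minimal sketch of a pattern that fits this description, not necessarily the answerer's exact code. The function name `strip_wiki_markup` and the shortened sample string are assumptions made for illustration. It assumes the markup is not nested: a `{{...}}` inside another `{{...}}` would leave a stray closing pair behind.

```php
<?php
// Hypothetical helper: removes [[...]] links and {{...}} templates from wiki text.
function strip_wiki_markup(string $text): string
{
    // U: quantifiers are ungreedy by default, so each match ends at the
    //    first closing pair instead of swallowing everything up to the last one.
    // s: the dot also matches newlines, so a template that spans several
    //    lines (like the last citation in the sample) is removed as well.
    // preg_replace() returns null on error; fall back to the input in that case.
    return preg_replace('/\[\[.*\]\]|\{\{.*\}\}/Us', '', $text) ?? $text;
}

// Shortened excerpt of the sample text, with a line break inside the template.
$sample = "final,{{cite web|title=Rest of India v Delhi in 1989/90\n"
        . "|publisher=Cricketarchive}} and was picked by [[Dilip Vengsarkar]] later.";

echo strip_wiki_markup($sample), "\n";
```

Removing a link that sits between two spaces leaves a double space behind, so a follow-up whitespace cleanup may be wanted depending on how the result is used.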
See the demo on ideone.com.
The following two lines did the trick:
Sorry it went wrong.