How can I use a regular expression to match everything in a string up to the second occurrence of a delimiter?
I am trying to refine a preg_match_all by finding the second occurrence of a period followed by a space:
<?php
$str = "East Winds 20 knots. Gusts to 25 knots. Waters a moderate chop. Slight chance of showers.";
preg_match_all('/(^)((.|\n)+?)(\.\s{2})/', $str, $matches);
$dataarray = $matches[2];
foreach ($dataarray as $value) {
    echo $value;
}
?>
But it does not work: the {2} occurrence is incorrect.
I have to use preg_match_all because I am scraping dynamic HTML.
I want to capture this from the string:
East Winds 20 knots. Gusts to 25 knots.
Comments (6)
Here is a different approach
Why not just match every period followed by a space and then use only some of the results?
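One way to read that suggestion, sketched under assumptions: collect every sentence with preg_match_all and keep only the first two. The pattern, the variable names, and the choice to rejoin with a single space are illustrative and are not taken from the answer.

```php
<?php
// Sample forecast text from the question.
$str = "East Winds 20 knots. Gusts to 25 knots. Waters a moderate chop. Slight chance of showers.";

// Match each sentence: a lazy run of characters ending in a period that is
// followed by whitespace or the end of the string. The s modifier lets "."
// match newlines as well.
preg_match_all('/.+?\.(?=\s|$)/s', $str, $matches);

// Later sentences start with the separating space, so trim each one,
// then keep the first two and rejoin them.
$sentences = array_map('trim', $matches[0]);
$firstTwo = implode(' ', array_slice($sentences, 0, 2));

echo $firstTwo, "\n";
```

Because all four sentences end up in $sentences, the same result can be sliced differently later (for example, the last two) without touching the pattern.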
I'm not sure exactly what you want to capture from this, however. Your question is a little vague.
Now, if you want to capture everything up to and including the second period (followed by a space), try a pattern that uses a non-greedy wildcard match together with DOTALL (the s modifier) so that . matches newlines.
If you don't want to capture the last space, you can do that too.
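A minimal sketch of both variants, assuming patterns of my own choosing rather than the ones the answer originally showed. The first capture includes the whitespace after the second period; the second stops at the period itself. The variable names are hypothetical.

```php
<?php
$str = "East Winds 20 knots. Gusts to 25 knots. Waters a moderate chop. Slight chance of showers.";

// (?:.*?\.\s){2} lazily consumes text up to a period plus one whitespace
// character, twice. Group 1 therefore ends with that whitespace character.
preg_match_all('/^((?:.*?\.\s){2})/s', $str, $withSpace);

// Variant: close the capture group at the second period and require the
// whitespace outside the group, so it is matched but not captured.
preg_match_all('/^(.*?\.\s.*?\.)\s/s', $str, $withoutSpace);

// Brackets make the trailing space visible in the output.
echo '[', $withSpace[1][0], "]\n";
echo '[', $withoutSpace[1][0], "]\n";
```

Both patterns are anchored with ^ and have no m modifier, so preg_match_all returns at most one match here and the result sits at index [1][0].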
Also, you may want to allow the string termination to count as well as a space, and there are two ways to write that.
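Two possible ways to let the end of the string count, offered as assumptions since the answer's own patterns are not reproduced here: an explicit "whitespace or end" alternation, or simply making the whitespace optional. The two-sentence input $short is a hypothetical test case.

```php
<?php
// Hypothetical input whose second period is the last character.
$short = "East Winds 20 knots. Gusts to 25 knots.";

// Variant 1: each period must be followed by whitespace OR the end of the string.
preg_match_all('/^((?:.*?\.(?:\s|$)){2})/s', $short, $m1);

// Variant 2: the whitespace after each period is optional. This is looser:
// it would also treat the period in a decimal such as "2.5" as a delimiter.
preg_match_all('/^((?:.*?\.\s?){2})/s', $short, $m2);

echo $m1[1][0], "\n";
echo $m2[1][0], "\n";
```

Variant 1 is the stricter of the two, which may matter if the scraped text contains decimal numbers or abbreviations.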
Lastly, since you're after one match and want the first one, you could just as easily use preg_match() rather than preg_match_all() for this.
I don't think (\.\s{2}) means what you think it means. As it stands, it will match ".  " (a period followed by two whitespace characters), not the second occurrence of ". ".
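To see what the quantifier is doing, the question's own pattern can be run against the sample text and against a copy with doubled spaces. The doubled-space copy is a hypothetical input built here with str_replace purely for illustration.

```php
<?php
$str = "East Winds 20 knots. Gusts to 25 knots. Waters a moderate chop. Slight chance of showers.";

// The pattern from the question: {2} binds only to \s, so it demands a
// period followed by TWO whitespace characters.
$pattern = '/(^)((.|\n)+?)(\.\s{2})/';

// Single-spaced text: no period is followed by two whitespace characters.
$countSingle = preg_match_all($pattern, $str, $single);

// Hypothetical double-spaced copy: the pattern now matches, but the lazy
// group stops at the FIRST period, not the second.
$doubled = str_replace('. ', '.  ', $str);
$countDouble = preg_match_all($pattern, $doubled, $double);

var_dump($countSingle);   // number of matches on the original text
var_dump($countDouble);   // number of matches on the doubled-space copy
var_dump($double[2][0]);  // what group 2 captured
```

To count occurrences, the quantifier has to wrap the whole "text up to a period and a space" unit, not just \s.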
You can try:
Output:
Also, if you want to capture just one occurrence, why are you using preg_match_all? preg_match should suffice.
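A sketch of the preg_match route, with an assumed pattern that is not necessarily the one this answer posted. It uses a negated character class instead of a lazy wildcard, which means no s modifier is needed, but it assumes the text before each delimiter contains no other periods (no decimals or abbreviations).

```php
<?php
$str = "East Winds 20 knots. Gusts to 25 knots. Waters a moderate chop. Slight chance of showers.";

// [^.]*\.\s* : any run of non-period characters, a period, then optional
// whitespace. Repeating it twice reaches the second period.
if (preg_match('/^(?:[^.]*\.\s*){2}/', $str, $m)) {
    // preg_match fills a flat array, so the full match is $m[0]
    // (with preg_match_all it would be $matches[0][0]).
    $result = rtrim($m[0]);
    echo $result, "\n";
}
```

With the sample string this prints the two sentences the question asks for; rtrim drops the whitespace that the pattern consumes after the second period.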
No need for a regex. Think simple.
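One regex-free reading of "think simple", as a hedged sketch: locate the second ". " with two strpos calls and cut the string there. The $delimiter value and the fallback to the whole string are assumptions made for this example.

```php
<?php
$str = "East Winds 20 knots. Gusts to 25 knots. Waters a moderate chop. Slight chance of showers.";
$delimiter = '. ';

// Find the first delimiter, then search again starting just past it.
$first = strpos($str, $delimiter);
$second = ($first === false)
    ? false
    : strpos($str, $delimiter, $first + strlen($delimiter));

// Keep everything up to and including the second period. If there are fewer
// than two delimiters, fall back to the whole string.
$result = ($second === false) ? $str : substr($str, 0, $second + 1);

echo $result, "\n";
```

The strict === false comparisons matter because strpos can legitimately return 0, which is falsy in PHP.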
I have two suggestions:
1) Simply explode() the string at ".  " (a period followed by a double space) and print the result.
2) Use explode() and strpos(), which are more performance-friendly than preg_match_all().
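The first suggestion might look like the sketch below. It assumes a single-space delimiter, because the sample string as shown has one space after each period; swap in '.  ' if the real data uses two. The limit argument and the fallback branch are my additions.

```php
<?php
$str = "East Winds 20 knots. Gusts to 25 knots. Waters a moderate chop. Slight chance of showers.";

// Split into at most three pieces: sentence 1, sentence 2, and the remainder.
$parts = explode('. ', $str, 3);

// explode() consumes the delimiter, so put the periods back when rejoining.
// With fewer than three pieces there is no second delimiter; return the input.
$result = (count($parts) >= 3)
    ? $parts[0] . '. ' . $parts[1] . '.'
    : $str;

echo $result, "\n";
```

Passing 3 as the limit stops explode() from splitting the rest of the text, which keeps the work proportional to the part that is actually needed.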