How can I use a regular expression to match everything in a string up to the second occurrence of a delimiter?

Posted on 2024-08-26 03:15:37 · 518 words · 6 views · 0 comments

I am trying to refine a preg_match_all call so that it matches everything up to the second occurrence of a period followed by a space:

<?php

$str = "East Winds 20 knots. Gusts to 25 knots. Waters a moderate chop.  Slight chance of showers.";

preg_match_all ('/(^)((.|\n)+?)(\.\s{2})/',$str, $matches);

$dataarray=$matches[2];
foreach ($dataarray as $value)
{ echo $value; }
?>

But it does not work: the {2} quantifier does not select the second occurrence as I expected.

I have to use preg_match_all because I am scraping dynamic HTML.

I want to capture this from the string:

East Winds 20 knots. Gusts to 25 knots.

Comments (6)

遗弃M 2024-09-02 03:15:37

Here is a different approach: split on a period followed by whitespace, then rejoin the first two sentences.

$str = "East Winds 20 knots. Gusts to 25 knots. Waters a moderate chop.  Slight chance of showers.";


$sentences = preg_split('/\.\s/', $str);

$firstTwoSentences = $sentences[0] . '. ' . $sentences[1] . '.';


echo $firstTwoSentences; // East Winds 20 knots. Gusts to 25 knots.
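
The same idea can be generalized to the first N sentences. This is only a sketch: the helper name firstSentences is made up for illustration, and it assumes a sentence ends at a period followed by whitespace. The third argument of preg_split() limits the number of pieces, so the rest of the string is left unsplit.

```php
<?php
// Hypothetical helper (not part of the answer above): return the first $n
// sentences, where a sentence ends at a period followed by whitespace.
function firstSentences(string $str, int $n): string
{
    // Split into at most $n + 1 pieces; the last piece is the unsplit remainder.
    $parts = preg_split('/\.\s+/', $str, $n + 1);
    // Keep the first $n pieces and put back the periods the split consumed.
    $kept = array_slice($parts, 0, $n);
    // rtrim avoids a doubled period when the input has no more than $n sentences.
    return rtrim(implode('. ', $kept), '.') . '.';
}

$str = "East Winds 20 knots. Gusts to 25 knots. Waters a moderate chop.  Slight chance of showers.";
echo firstSentences($str, 2), "\n"; // East Winds 20 knots. Gusts to 25 knots.
```

Using \s+ rather than \s also swallows the double space after "chop.", so later sentences would not start with a stray space.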
倦话 2024-09-02 03:15:37

Why not just match every period followed by a space and use only the matches you need?

preg_match_all('!\. !', $str, $matches);
echo $matches[0][1]; // second match (the matched text itself, i.e. ". ")

However, I'm not sure exactly what you want to capture; your question is a little vague.
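
If the goal is the text before the second match rather than the match itself, one possible way to use the preg_match_all() call above is to ask for offsets with the PREG_OFFSET_CAPTURE flag and cut the string there. A minimal sketch, with $offset and $result as illustrative variable names:

```php
<?php
$str = "East Winds 20 knots. Gusts to 25 knots. Waters a moderate chop.  Slight chance of showers.";

// With PREG_OFFSET_CAPTURE each match becomes [matched text, byte offset].
preg_match_all('!\. !', $str, $matches, PREG_OFFSET_CAPTURE);

$offset = null;
$result = null;
if (isset($matches[0][1])) {
    // Byte offset of the second ". " in the string.
    $offset = $matches[0][1][1];
    // Keep everything before it, plus the period itself.
    $result = substr($str, 0, $offset + 1);
}

echo $result, "\n"; // East Winds 20 knots. Gusts to 25 knots.
```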

Now if you want to capture everything up to and including the second period (followed by a space) try:

preg_match_all('!^((?:.*?\. ){2})!s', $str, $matches);

It uses a non-greedy wildcard match (.*?) and the s modifier (DOTALL), so that . also matches newlines.
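
To see why the s modifier matters for scraped text, here is a small sketch. The input is an assumed variation of the sample string with a line break inside the second sentence; it does not come from the question.

```php
<?php
// Assumed input: the forecast with a line break inside the second sentence.
$str = "East Winds 20 knots. Gusts to\n25 knots. Waters a moderate chop.";

// Without the s modifier, "." cannot cross the newline, so the second repetition fails.
$withoutS = preg_match('!^((?:.*?\. ){2})!', $str, $m1);

// With the s modifier, "." also matches "\n", so the match spans both lines.
$withS = preg_match('!^((?:.*?\. ){2})!s', $str, $m2);

var_dump($withoutS); // int(0)
var_dump($withS);    // int(1)
echo $m2[1], "\n";   // the first two sentences, line break included
```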

If you don't want to capture the last space, you can do that too:

preg_match_all('!^((?:.*?\.(?= )){2})!s', $str, $matches);

You may also want the end of the string to count as a terminator, which means either:

preg_match_all('!^((?:.*?\.(?: |\z)){2})!s', $str, $matches);

or

preg_match_all('!^((?:.*?\.(?= |\z)){2})!s', $str, $matches);

Lastly, since you only want the first match, you could just as easily use preg_match() rather than preg_match_all() for this.
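
As a rough sketch of that last suggestion, the lookahead pattern can be used with preg_match() directly. The $short string and the variable names are made up for illustration; the second half shows what allowing \z changes.

```php
<?php
$str = "East Winds 20 knots. Gusts to 25 knots. Waters a moderate chop.  Slight chance of showers.";

// Lookahead variant: a space (or end of string) must follow each period but is not captured.
$first = preg_match('!^((?:.*?\.(?= |\z)){2})!s', $str, $m) ? $m[1] : null;
echo $first, "\n"; // East Winds 20 knots. Gusts to 25 knots.

// A string that ends right after the second period.
$short = "East Winds 20 knots. Gusts to 25 knots.";

// With \z allowed, the final period counts as a terminator and the whole string matches.
$second = preg_match('!^((?:.*?\.(?= |\z)){2})!s', $short, $m) ? $m[1] : null;
echo $second, "\n"; // East Winds 20 knots. Gusts to 25 knots.

// Without \z, the second period has no following space, so nothing matches.
$strict = preg_match('!^((?:.*?\.(?= )){2})!s', $short);
var_dump($strict); // int(0)
```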

深者入戏 2024-09-02 03:15:37

I don't think (\.\s{2}) means what you think it means. As it stands, it matches ".  " (a period followed by two whitespace characters), not the second occurrence of ". ".
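
The difference comes down to what the {2} quantifier is attached to. A short sketch comparing the two readings on the sample string (the patterns here are simplified versions written for illustration, not the asker's exact expression):

```php
<?php
$str = "East Winds 20 knots. Gusts to 25 knots. Waters a moderate chop.  Slight chance of showers.";

// \.\s{2}: the {2} applies only to \s, so this needs a period plus TWO whitespace characters.
preg_match('/^(.+?)\.\s{2}/s', $str, $a);
echo $a[1], "\n"; // runs on to "chop", the only period followed by two spaces

// (?:.*?\.\s){2}: the {2} applies to the whole group, i.e. two "text, period, whitespace" units.
preg_match('/^((?:.*?\.\s){2})/s', $str, $b);
echo rtrim($b[1]), "\n"; // East Winds 20 knots. Gusts to 25 knots.
```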

战皆罪 2024-09-02 03:15:37

You can try:

<?php
$str = "East Winds 20 knots. Gusts to 25 knots. Waters a moderate chop.  Slight chance of showers.";
// Two lazy "text, period, space" units back to back.
if (preg_match_all('/(.*?\. .*?\. )/', $str, $matches)) {
    $dataarray = $matches[1];
    var_dump($dataarray); // only dump when there was a match
}
?>

Output:

array(1) {
  [0]=>
  string(40) "East Winds 20 knots. Gusts to 25 knots. "
}

Also, if you want to capture just one occurrence, why are you using preg_match_all? preg_match should suffice.

浴红衣 2024-09-02 03:15:37

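No regex needed. Keep it simple:

$str = "East Winds 20 knots. Gusts to 25 knots. Waters a moderate chop.  Slight chance of showers.";
$s = explode(". ", $str);
// explode() removes the delimiters, so re-add the final period after joining.
$s = implode(". ", array_slice($s, 0, 2)) . ".";
echo $s; // East Winds 20 knots. Gusts to 25 knots.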
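
Another regex-free option, different from the explode() approach above, is to walk the string with strpos() until the N-th delimiter is found. This is a sketch only; the helper name upToNthDelimiter is hypothetical.

```php
<?php
// Hypothetical helper: text up to and including the $n-th $delimiter,
// or null when the delimiter occurs fewer than $n times.
function upToNthDelimiter(string $str, string $delimiter, int $n): ?string
{
    $len = strlen($delimiter);
    $pos = -$len; // so that the first search starts at offset 0
    for ($i = 0; $i < $n; $i++) {
        // Search again, starting just past the previous hit.
        $pos = strpos($str, $delimiter, $pos + $len);
        if ($pos === false) {
            return null;
        }
    }
    return substr($str, 0, $pos + $len);
}

$str = "East Winds 20 knots. Gusts to 25 knots. Waters a moderate chop.  Slight chance of showers.";
$result = upToNthDelimiter($str, ". ", 2);
if ($result !== null) {
    echo rtrim($result), "\n"; // East Winds 20 knots. Gusts to 25 knots.
}
```

The returned text includes the trailing space of the delimiter, which is why the usage line trims it.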
眼藏柔 2024-09-02 03:15:37

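I want to capture this from the string: East Winds 20 knots. Gusts to 25 knots.

I have two suggestions:

1) Simply explode the string at ".  " (a period followed by two spaces) and print the first piece. Note that in the sample string the double space comes after "chop.", so this returns three sentences; it yields only the first two if the double space follows the second sentence.

$arr = explode(".  ", $str);
echo $arr[0] . ".";
// Output: East Winds 20 knots. Gusts to 25 knots. Waters a moderate chop.

2) Use explode and strpos, which avoid the regex engine and are typically lighter than preg_match_all. This keeps only the sentences that contain "knots".

foreach (explode(".", $str) as $val) {
    // strpos() returns false when nothing is found; compare strictly so a hit at offset 0 still counts.
    echo (strpos($val, "knots") !== false) ? trim($val) . ". " : "";
}
// Output: East Winds 20 knots. Gusts to 25 knots.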
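
One detail worth knowing when testing the result of strpos() as in the loop above: it returns the integer 0 for a match at the very start of the string and false for no match, so a "> 0" check and a strict "!== false" check can disagree. A tiny illustration with a made-up sentence:

```php
<?php
// Made-up input in which the search word sits at offset 0.
$sentence = "knots are a unit of speed";

// strpos() returns 0 here, and 0 > 0 is false, so the hit is missed.
var_dump(strpos($sentence, "knots") > 0);       // bool(false)

// A strict comparison against false reports the hit correctly.
var_dump(strpos($sentence, "knots") !== false); // bool(true)
```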