正则表达式不匹配,贪婪
我尝试使用 PHP 中的正则表达式来匹配字符串中的两个部分。我认为贪婪是有问题的。我希望第一个正则表达式(请参阅评论)为我提供前两个捕获,作为第二个正则表达式,但仍然捕获两个字符串。我做错了什么?
我正在尝试获取 +123
(如果 cd:
存在,如第一个字符串中所示)和 456
。
<?php
$data[] = 'longstring start waste cd:+123yz456z longstring';
$data[] = 'longstring start waste +yz456z longstring';
$regexs[] = '/start[^z]*?(cd:([^y]+)y)?[^z]*z([^z]*)z/'; // first
$regexs[] = '/start[^z]*?(cd:([^y]+)y)[^z]*z([^z]*)z/'; // second
foreach ($regexs as $regex) {
foreach ($data as $string) {
if (preg_match($regex, $string, $match)) {
echo "Tried '$regex' on '$string' and got " . implode(',', array_split($match, 1));
echo "\n";
}
}
}
?>
输出为:
Tried '/start[^z]*?(cd:([^y]+)y)?[^z]*z([^z]*)z/' on 'longstring start waste cd:+123yz456z longstring' and got ,,456
Tried '/start[^z]*?(cd:([^y]+)y)?[^z]*z([^z]*)z/' on 'longstring start waste +yz456z longstring' and got ,,456
Tried '/start[^z]*?(cd:([^y]+)y)[^z]*z([^z]*)z/' on 'longstring start waste cd:+123yz456z longstring' and got cd:+123y,+123,456
由于第二个字符串中不存在 cd:
,因此没有第四行。
预期输出(因为我不是专家),其中第一行与实际输出不同:
Tried '/start[^z]*?(cd:([^y]+)y)?[^z]*z([^z]*)z/' on 'longstring start waste cd:+123yz456z longstring' and got cd:+123y,+123,456
Tried '/start[^z]*?(cd:([^y]+)y)?[^z]*z([^z]*)z/' on 'longstring start waste +yz456z longstring' and got ,,456
Tried '/start[^z]*?(cd:([^y]+)y)[^z]*z([^z]*)z/' on 'longstring start waste cd:+123yz456z longstring' and got cd:+123y,+123,456
I try to match two parts in a string with a regex in PHP. There is a problem with the greediness, I think. I would like the first regex (see comment) to give me the first two captures, as the second regex, but still capture both strings. What am I doing wrong?
I'm trying to get +123
(if cd:
exists, as in first string) and 456
.
<?php
$data[] = 'longstring start waste cd:+123yz456z longstring';
$data[] = 'longstring start waste +yz456z longstring';
$regexs[] = '/start[^z]*?(cd:([^y]+)y)?[^z]*z([^z]*)z/'; // first
$regexs[] = '/start[^z]*?(cd:([^y]+)y)[^z]*z([^z]*)z/'; // second
foreach ($regexs as $regex) {
foreach ($data as $string) {
if (preg_match($regex, $string, $match)) {
echo "Tried '$regex' on '$string' and got " . implode(',', array_split($match, 1));
echo "\n";
}
}
}
?>
Output is:
Tried '/start[^z]*?(cd:([^y]+)y)?[^z]*z([^z]*)z/' on 'longstring start waste cd:+123yz456z longstring' and got ,,456
Tried '/start[^z]*?(cd:([^y]+)y)?[^z]*z([^z]*)z/' on 'longstring start waste +yz456z longstring' and got ,,456
Tried '/start[^z]*?(cd:([^y]+)y)[^z]*z([^z]*)z/' on 'longstring start waste cd:+123yz456z longstring' and got cd:+123y,+123,456
There is no fourth line since cd:
is not present in the second string.
Expected output (since I'm no expert), where the first line differs from actual output:
Tried '/start[^z]*?(cd:([^y]+)y)?[^z]*z([^z]*)z/' on 'longstring start waste cd:+123yz456z longstring' and got cd:+123y,+123,456
Tried '/start[^z]*?(cd:([^y]+)y)?[^z]*z([^z]*)z/' on 'longstring start waste +yz456z longstring' and got ,,456
Tried '/start[^z]*?(cd:([^y]+)y)[^z]*z([^z]*)z/' on 'longstring start waste cd:+123yz456z longstring' and got cd:+123y,+123,456
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
好的,如果有
cd:
,您想捕获+123
,并且始终为456
?我将这样做:通过自由使用非贪婪(
?
)乘数,您可以让它完全按照您的意愿行事。另请注意
(?:)
非捕获组。它们非常有用。编辑显然这不起作用,让我们尝试一种不同的方法,使用“非此即彼”组:
Okay, so you want to capture
+123
if there is acd:
, and always456
? Here's how I would do it:With the liberal use of non-greedy (
?
) multipliers you can get it to do exactly what you want.Also note the
(?:)
non-capture group. They are very useful.EDIT Apparently that doesn't work, let's try a different approach, with an "either/or" group: