Extract part of a string matching a pattern - regex, close but no cigar
I have a string that can be very long and can contain many lines and arbitrary characters.
I want to extract all lines that are wrapped in SB and EB:
SB1EB
SBa description of various lengthEB
SB123.456.78EB
SB99.99EB
SB99.99EB
SB2EB
SBanother description of various lengthEB
SB123.456.00EB
SB199.99EB
SB199.99EB
3
another description of various length that I don't want to return
123.456.00
599.99
599.99
SB60EB
SBanother description of various length that i want to keepEB
SB500.256.10EB
SB0.99EB
SB0.99EB
another bit of text that i don't want - can span multiple lines
This is the pattern I am using in PHP:
preg_match_all('/SB(\d+)EB\nSB(\w.*)EB\nSB(\d{3}\.\d{3}\.\d{2})EB\nSB(\d.*)EB\nSB(\d.*)EB\n/', $string, $matches)
So this should hopefully return:
[0] -> SB1EB
SBa description of various lengthEB
SB123.456.78EB
SB99.99EB
SB99.99EB
[1] -> SB2EB
SBanother description of various lengthEB
SB123.456.00EB
SB199.99EB
SB199.99EB
[2] -> SB60EB
SBanother description of various length that i want to keepEB
SB500.256.10EB
SB0.99EB
SB0.99EB
But I'm obviously doing something wrong because it isn't matching anything. Can somebody help please?
SOLUTION:
Based on @Sajid's reply:
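if (preg_match_all('/(?:SB.+?EB(?:[\r\n]+|$))/', $string, $result)) {
    for ($i = 0; $i < count($result[0]); $i++) {
        // Strip the SB/EB wrappers from the current line.
        $single_item = $result[0][$i];
        $single_item = str_replace("SB", "", $single_item);
        $single_item = str_replace("EB", "", $single_item);
        // The ID line (e.g. 123.456.78) anchors a record; the other fields sit around it.
        if (preg_match('/(\d{3}\.\d{3}\.\d{2})/', $single_item)) {
            $id = $single_item;
            // Note: these neighbours still carry their SB/EB wrappers and line breaks.
            $qty = $result[0][$i-2];
            $name = $result[0][$i-1];
            $price = $result[0][$i+1];
            $total = $result[0][$i+2];
        }
    }
}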
It's a bit messy, but it works! :)
Thanks
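The loop above could be tidied so that every field is unwrapped and collected per record. The following is only a sketch under assumptions: the sample string is shortened from the question, the "\r\n" line endings are a guess, and the array keys (qty, name, id, price, total) simply reuse the variable names from the solution.

```php
<?php
// Sample input, shortened from the question; "\r\n" endings are an assumption.
$string = "SB1EB\r\nSBa description of various lengthEB\r\nSB123.456.78EB\r\nSB99.99EB\r\nSB99.99EB\r\n"
        . "3\r\nunwanted description\r\n123.456.00\r\n599.99\r\n599.99\r\n"
        . "SB60EB\r\nSBanother description that i want to keepEB\r\nSB500.256.10EB\r\nSB0.99EB\r\nSB0.99EB\r\n"
        . "trailing text that is not wanted\r\n";

// Capture only the text between SB and EB on each wrapped line.
// The m flag lets ^ and $ match at line boundaries; \r? tolerates Windows endings.
preg_match_all('/^SB(.+?)EB\r?$/m', $string, $m);
$fields = $m[1];

$records = [];
foreach ($fields as $i => $field) {
    // An ID looks like 123.456.78; the pattern is anchored so prices do not match.
    // isset() guards against reading neighbours that fall outside the array.
    if (preg_match('/^\d{3}\.\d{3}\.\d{2}$/', $field)
        && isset($fields[$i - 2], $fields[$i + 2])) {
        $records[] = [
            'qty'   => $fields[$i - 2],
            'name'  => $fields[$i - 1],
            'id'    => $field,
            'price' => $fields[$i + 1],
            'total' => $fields[$i + 2],
        ];
    }
}
print_r($records);
```

Capturing the inner text directly avoids the two str_replace calls, which would also delete any "SB" or "EB" that happens to occur inside a description.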
Comments (3)
A bit of a hack, but this will do the job:
Note that ?: makes a group non-capturing, so the results will all be in $a[0] (e.g., $a[0][0], $a[0][1], $a[0][2] ...)
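The code for this answer is not preserved here. As a rough illustration of the note about ?: and $a[0], the sketch below reuses the pattern from the asker's solution, which says it is based on this reply; the shortened sample string and its "\n" line endings are assumptions.

```php
<?php
// Shortened sample in the shape of the question's data; "\n" endings are an assumption.
$string = "SB1EB\nSBa descriptionEB\nSB123.456.78EB\nnot wanted\nSB99.99EB\n";

// (?: ... ) groups without capturing, so the only populated entry is $a[0],
// the list of full matches. [\r\n]+ consumes the line break(s); $ covers a
// final line that has no trailing newline.
$count = preg_match_all('/(?:SB.+?EB(?:[\r\n]+|$))/', $string, $a);

// Each full match still carries its line break, so trim before use.
$lines = array_map('trim', $a[0]);
print_r($lines);
```

Because nothing is captured, $a has a single element; the SB/EB wrappers remain in each match and have to be removed in a later step.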
So basically what I am doing (based on your input) is treating the "header" string SB\d+EB as an entry point and consuming everything until I find another "header" or the end of the input. Note the /s modifier, which makes . match newlines as well.
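The pattern this answer describes is not preserved here, so the following is a hypothetical reconstruction of the idea rather than the answerer's code: a header, a lazy match with /s, and a lookahead for the next header or the end of the input. The sample string and its "\n" line endings are assumptions.

```php
<?php
// Shortened sample shaped like the question's data; "\n" endings are an assumption.
$string = "SB1EB\nSBfirst itemEB\nSB123.456.78EB\nSB99.99EB\nSB99.99EB\n"
        . "3\nnot wanted\n"
        . "SB60EB\nSBsecond itemEB\nSB500.256.10EB\nSB0.99EB\nSB0.99EB\n"
        . "trailing text\n";

// Hypothetical pattern: start at a header (SB + digits + EB), then lazily
// consume everything, newlines included thanks to /s, up to the next header
// or the end of the input (\z).
preg_match_all('/SB\d+EB.*?(?=SB\d+EB|\z)/s', $string, $blocks);

$records = [];
foreach ($blocks[0] as $block) {
    // A block can drag along unwrapped lines, so keep only the SB...EB ones.
    preg_match_all('/^SB(.+?)EB\r?$/m', $block, $m);
    $records[] = $m[1];
}
print_r($records);
```

One caveat with this approach: any wrapped line that consists only of digits, such as a whole-number price, would also be taken for a header and start a new block.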
Explanation: