php preg_replace, regular expressions
I'm trying to extract the postal codes from yell.com using php and preg_replace.
I successfully extracted the postal code but only along with the address. Here is an example
$URL = "http://www.yell.com/ucs/UcsSearchAction.do?scrambleSeed=17824062&keywords=shop&layout=&companyName=&location=London&searchType=advance&broaderLocation=&clarifyIndex=0&clarifyOptions=CLOTHES+SHOPS|CLOTHES+SHOPS+-+LADIES|&ooa=&M=&ssm=1&lCOption32=RES|CLOTHES+SHOPS+-+LADIES&bandedclarifyResults=1";

// Get the yell.com page as a string
$htmlContent = $baseClass->getContent($URL);

// Get the postal code along with the address
// (the "/" in the closing tag is escaped so it is not read as the pattern delimiter)
$result2 = preg_match_all("/(.*)<\/span>/", $htmlContent, $matches);
print_r($matches);
The above code outputs something like

Array ( [0] => Array ( [0] => 7, Royal Parade, Chislehurst, Kent BR7 6NR [1] => 55, Monmouth St, London, WC2H 9DG ....

The problem is that I don't know how to extract only the postal code without the address, because it doesn't have a fixed number of characters (sometimes it has 6 and sometimes only 5). Basically, I need to extract the last 2 words from each array element.

Thank you in advance for any help!
Comments (2)
quick & dirty:
See it run at: code pad
If you just need to match the last two words in a string, you can use this regex:
This will match what it says: a word boundary, some non-empty word, some white spaces, then another word, followed by end of string anchor.
This prints:
You may also make the regex more robust by allowing optional trailing whitespace after the last word with \s*, etc., but using the $ anchor is the main idea.
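The description above (word boundary, word, whitespace, word, optional trailing whitespace, end-of-string anchor) can be sketched in PHP roughly as follows. This is a minimal illustration and not the answerer's original code: the pattern is reconstructed from the prose, the helper name lastTwoWords is hypothetical, and the sample addresses are the two shown in the question (with a trailing space added to the second one to exercise \s*).

```php
<?php
// Sample addresses taken from the question's output.
$addresses = [
    "7, Royal Parade, Chislehurst, Kent BR7 6NR",
    "55, Monmouth St, London, WC2H 9DG ", // trailing space added on purpose
];

// Hypothetical helper (an assumption, not part of the original answer):
// returns the last two whitespace-separated words of $text,
// or null when the string ends with fewer than two words.
function lastTwoWords(string $text): ?string
{
    // \b            word boundary
    // (\w+\s+\w+)   a word, some whitespace, another word (captured)
    // \s*           optional trailing whitespace
    // $             end-of-string anchor
    if (preg_match('/\b(\w+\s+\w+)\s*$/', $text, $m) === 1) {
        return $m[1];
    }
    return null;
}

$postcodes = array_map('lastTwoWords', $addresses);
print_r($postcodes);
```

For the two sample strings this yields BR7 6NR and WC2H 9DG. Because the pattern is anchored only at the end, it picks up whatever the last two words happen to be, so it relies on every address actually ending with a postcode.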