Matching URLs with preg_match_all and regular expressions in PHP
I am trying to build a crawler that gets the movie URLs from an IMDb list. I am able to get all the links on the page into an array, and I want to select only those with "title" in them.
preg_match_all($pattern, "[125] => href=\"/chart/2000s?mode=popular\" [126] => href=\"/title/tt0111161/\" ", $matches);
where $pattern = '/title/'.
I am getting the following error:
Warning: preg_match_all() [function.preg-match-all]: Delimiter must not be alphanumeric or backslash in C:\xampp\htdocs\phpProject1\index.php on line 53
Any idea on how to go about this? Thanks a lot.
Comments (2)
Use a DOM Parser:
http://simplehtmldom.sourceforge.net/manual.htm
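As a rough illustration of the DOM approach, here is a minimal sketch that uses PHP's built-in DOMDocument class rather than the Simple HTML DOM library linked above (that substitution is mine, chosen so the example needs no extra download). The $html string is a made-up stand-in for the fetched page, and keeping only hrefs that start with "/title/" is an assumption about which links are wanted.

```php
<?php
// Hypothetical HTML fragment standing in for the fetched IMDb list page.
$html = '<html><body>'
      . '<a href="/chart/2000s?mode=popular">Chart</a>'
      . '<a href="/title/tt0111161/">Movie</a>'
      . '</body></html>';

$doc = new DOMDocument();

// Real-world pages are often malformed; collect libxml warnings
// internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

// Keep only the links whose href starts with "/title/".
$titleLinks = [];
foreach ($doc->getElementsByTagName('a') as $anchor) {
    $href = $anchor->getAttribute('href');
    if (strpos($href, '/title/') === 0) {
        $titleLinks[] = $href;
    }
}

print_r($titleLinks);
```

With a parser doing the extraction, no regular expression (and so no delimiter) is involved at all, which sidesteps the warning from the question.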
Are you sure $pattern is '/title/' at the time preg_match_all is called? The error you are getting occurs when the pattern passed to preg_match_all (the 1st argument) is not properly delimited.
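To make the delimiter point concrete, here is a small sketch of a properly delimited pattern run against the sample string from the question. Using "#" as the delimiter and adding a capture group for the path are my own choices for illustration, not something the original poster wrote. A pattern such as 'title', with no delimiters, begins with a letter and would be expected to trigger the "Delimiter must not be alphanumeric or backslash" warning.

```php
<?php
// Sample input copied from the question.
$subject = '[125] => href="/chart/2000s?mode=popular" '
         . '[126] => href="/title/tt0111161/" ';

// "#" is used as the delimiter so the slashes in the path need no escaping.
// The parentheses capture just the path inside the href attribute.
$pattern = '#href="(/title/[^"]+)"#';

// preg_match_all returns the number of full matches, or false on error.
$count = preg_match_all($pattern, $subject, $matches);

// $matches[1] holds the captured paths, one per match.
print_r($matches[1]);
```

If the warning still appears with a pattern like this, dumping $pattern with var_dump immediately before the call should show whether the variable holds the expected value at that point.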