IMDB grabber PHP

Posted on 2024-08-26 19:18:39

I am receiving an error:

Notice: Undefined variable: content in C:\wamp\www\includes\imdbgrabber.php on line 17

When using this code:

<?php
//url
$url = 'http://www.imdb.com/title/tt0367882/';

//get the page content
$imdb_content = get_data($url);

//parse for product name
$name = get_match('/<title>(.*)<\/title>/isU',$imdb_content);
$director = strip_tags(get_match('/<h5[^>]*>Director:<\/h5>(.*)<\/div>/isU',$imdb_content));
$plot = get_match('/<h5[^>]*>Plot:<\/h5>(.*)<\/div>/isU',$imdb_content);
$release_date = get_match('/<h5[^>]*>Release Date:<\/h5>(.*)<\/div>/isU',$imdb_content);
$mpaa = get_match('/<a href="\/mpaa">MPAA<\/a>:<\/h5>(.*)<\/div>/isU',$imdb_content);
$run_time = get_match('/Runtime:<\/h5>(.*)<\/div>/isU',$imdb_content);

//build content


line 17 -->  $content.= '<h2>Film</h2><p>'.$name.'</p>';
    $content.= '<h2>Director</h2><p>'.$director.'</p>';
    $content.= '<h2>Plot</h2><p>'.substr($plot,0,strpos($plot,'<a')).'</p>';
    $content.= '<h2>Release Date</h2><p>'.substr($release_date,0,strpos($release_date,'<a')).'</p>';
    $content.= '<h2>MPAA</h2><p>'.$mpaa.'</p>';
    $content.= '<h2>Run Time</h2><p>'.$run_time.'</p>';
    $content.= '<h2>Full Details</h2><p><a href="'.$url.'" rel="nofollow">'.$url.'</a></p>';

    echo $content;

//gets the match content
function get_match($regex,$content)
{
    preg_match($regex,$content,$matches);
    return $matches[1];
}

//gets the data from a URL
function get_data($url)
{
    $ch = curl_init();
    $timeout = 5;
    curl_setopt($ch,CURLOPT_URL,$url);
    curl_setopt($ch,CURLOPT_RETURNTRANSFER,1);
    curl_setopt($ch,CURLOPT_CONNECTTIMEOUT,$timeout);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
?>



Comments (4)

姐不稀罕 2024-09-02 19:18:40

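You are appending content to a variable that doesn't exist. Change line 17 to an assignment:

$content = '<h2>Film</h2><p>'.$name.'</p>';

You could also change that section of code to the following, which is slightly neater: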

$content = '<h2>Film</h2><p>'.$name.'</p>'
         . '<h2>Director</h2><p>'.$director.'</p>'
         . '<h2>Plot</h2><p>'.substr($plot,0,strpos($plot,'<a')).'</p>'
      // etc

她如夕阳 2024-09-02 19:18:40

You are trying to append to the variable $content before it exists, which naturally triggers the notice.

Try replacing $content.= with $content= on line 17.


半枫 2024-09-02 19:18:40

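You are not receiving an error; you are receiving a notice, because you are trying to concatenate something onto a variable that does not exist. Remove the dot from .= on line 17, or put $content = ''; before line 17.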
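
The second option (initialising the variable first) might look roughly like the sketch below. The values of $name and $director are made-up placeholders standing in for the scraped fields, not real output from the page. Note that newer PHP versions (8.0 and later) report an undefined variable as a warning rather than a notice, but the fix is the same.

```php
<?php
// Placeholder values (assumptions for illustration only); in the original
// script these come from get_match() applied to the fetched page.
$name = 'Example Film';
$director = 'Jane Doe';

// Initialise $content so that the first .= has something to append to.
$content = '';
$content .= '<h2>Film</h2><p>' . $name . '</p>';
$content .= '<h2>Director</h2><p>' . $director . '</p>';

echo $content;
```

Keeping every line as .= has the small advantage that the lines can be reordered or removed without touching the first one.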


忆梦 2024-09-02 19:18:40

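Apart from what others have said, there is another issue with your code that needs attention: get_match returns $matches[1] without first checking the return value of preg_match. You should do something like: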

if(preg_match($regex,$content,$matches))
  return $matches[1];
else
  // return some default
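
A filled-in version of that idea could look like the following sketch. The third parameter, $default, is a hypothetical addition that is not part of the original function; an empty string is only one possible choice of fallback.

```php
<?php
// Variant of get_match() with a fallback value.
// $default is a hypothetical parameter added for illustration.
function get_match($regex, $content, $default = '')
{
    // preg_match() returns 1 on a match, 0 on no match and false on error,
    // so only read $matches[1] when a match was actually found.
    if (preg_match($regex, $content, $matches) === 1 && isset($matches[1])) {
        return $matches[1];
    }
    return $default;
}

// Made-up sample markup standing in for the fetched page.
$html = '<html><head><title>Example Film</title></head></html>';

echo get_match('/<title>(.*)<\/title>/isU', $html), "\n";
echo get_match('/<h5[^>]*>Plot:<\/h5>(.*)<\/div>/isU', $html, 'n/a'), "\n";
```

With a fallback in place, a pattern that no longer matches the page markup yields the default instead of an undefined-offset notice.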
