Why does Perl's URI complain: Can't locate object method "host" via package "URI::_generic"?

Posted on 2024-09-24 09:51:48 · 2120 characters · 8 views · 0 comments

Hi, I'm trying to get the host from a URL.

sub scrape {
my @m_error_array;
my @m_href_array;
my @href_array;
my ( $self, $DBhost, $DBuser, $DBpass, $DBname ) = @_;
my ($dbh, $query, $result, $array);
my $DNS = "dbi:mysql:$DBname:$DBhost:3306";
$dbh = DBI->connect($DNS, $DBuser, $DBpass ) or die $DBI::errstr;
if( defined( $self->{_process_image} ) && ( -e 'href_w_' . $self->{_process_image} . ".txt" ) ) {
    open  ERROR_W, "error_w_" . $self->{_process_image} . ".txt";
    open  M_HREF_W, "m_href_w_" . $self->{_process_image} . ".txt";
    open  HREF_W, "href_w_" . $self->{_process_image} . ".txt";
    @m_error_array = ( split( '|||', <ERROR_W> ) );
    @m_href_array = ( split( '|||', <M_HREF_W> ) );
    @href_array = ( split( '|||', <HREF_W> ) );
    close ( ERROR_W );
    close ( M_HREF_W );
    close ( HREF_W );
}else{
    @href_array = ( $self->{_url} );
}
my $z = 0;
while( @href_array ){
    if( defined( $self->{_x_more} ) && $z == $self->{_x_more} ) {
        last;
    }
    if( defined( $self->{_process_image} ) ) {
        $self->write( 'm_href_w', @m_href_array );
        $self->write( 'href_w', @href_array );
        $self->write( 'error_w', @m_error_array );
    }
    $self->{_link_count} = scalar @m_href_array;
    my $href = shift( @href_array );
    my $info = URI->new($href);
    my $host = $info->host;
    $host =~ s/^www\.//;
    $result = $dbh->prepare("INSERT INTO `". $host ."` (URL) VALUES ('$href')");
    if( ! $result->execute() ){
        $result = $dbh->prepare("CREATE TABLE `" . $host . "` ( `ID` INT( 255 ) NOT NULL AUTO_INCREMENT , `URL` VARCHAR( 255 ) NOT NULL , PRIMARY KEY ( `ID` )) ENGINE = MYISAM ;");
        $result->execute()
    }
    $self->{_current_page} = $href;
    my $response = $ua->get($href);
    my $responseCode = $response->code;
    print $responseCode;
}

}

Towards the end, the line my $host = $info->host; throws: Can't locate object method "host" via package "URI::_generic"

Can anyone explain this?

Regards,

Phil

Comments (2)

尾戒 2024-10-01 09:51:48

URI->new creates instances of a subclass of URI, depending on the scheme of the url you give it. Those subclasses might be URI::http, URI::file, URI::mailto, or something completely different. If URI doesn't have a specialized subclass for the kind of url you gave it, it'll create an instance of URI::_generic.
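To see this class selection in action, here is a minimal sketch that prints the class URI->new picks for a few inputs. The helper name uri_class and the sample strings are made up for illustration, and the URI module from CPAN is assumed to be installed.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper: the name of the class URI->new chose for a string.
sub uri_class {
    my ($str) = @_;
    return ref URI->new($str);
}

# An http URL, a mailto address, and two strings with no scheme at all.
for my $str ('http://www.example.com/a', 'mailto:phil@example.com', '/relative/path', '') {
    printf "%-28s => %s\n", "'$str'", uri_class($str);
}
```

Strings without a recognizable scheme, such as a relative link or an empty string, should come back as URI::_generic, which is the class named in the error message.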

Each of those URI subclasses has different methods. URI::http happens to have a host method, but most others don't. You're calling ->host on something that isn't a URI::http or similar, and therefore doesn't have a host method.

You probably expected all the strings you pass to URI->new to be http urls. That doesn't seem to be the case, so you might want to check your data. Otherwise, if you do want to handle non-http urls, you should make sure a method actually exists for that instance before calling it, for example by using ->can or ->isa.
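The ->can check could look roughly like the sketch below. The helper host_of is hypothetical (it is not part of URI), and returning nothing for unusable input is just one possible policy; the question's loop might instead log the entry to its error list and move on.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper: the host without a leading "www.", or undef when
# the URI object has no host method or the host is empty.
sub host_of {
    my ($href) = @_;
    my $uri = URI->new($href);
    return unless $uri->can('host');             # e.g. URI::_generic, URI::mailto
    my $host = $uri->host;
    return unless defined $host && length $host;
    $host =~ s/^www\.//;
    return $host;
}

for my $href ('http://www.example.com/page', 'mailto:phil@example.com', 'page.html') {
    my $host = host_of($href);
    print defined $host ? "$href -> $host\n" : "$href -> skipped (no host)\n";
}
```

Guarding with ->can keeps the loop alive when a scraped link turns out to be relative or uses a scheme without a host, instead of dying on the method call.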

深爱成瘾 2024-10-01 09:51:48

To put it differently: URI picks a class based on the scheme it finds, and if the URL is malformed, the resulting object will be of a class that doesn't have those methods.

All you need is a check like this:

if($uri->scheme ne 'http'){
    die "URL '$url' was not http\n";
}

The scheme method is there even when no scheme was found. It just returns undef.
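Because scheme can return undef, comparing it directly with ne produces an "uninitialized value" warning under use warnings. A slightly more defensive variant of the check, sketched here with a hypothetical helper is_web_url and with https accepted as well (an assumption, since the original check only mentions http), might be:

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper: true only for absolute http or https URLs.
sub is_web_url {
    my ($url) = @_;
    my $scheme = URI->new($url)->scheme;   # undef when the string has no scheme
    return (defined($scheme) && ($scheme eq 'http' || $scheme eq 'https')) ? 1 : 0;
}

for my $url ('http://example.com/', 'https://example.com/', 'ftp://example.com/', 'index.html') {
    print "$url: ", (is_web_url($url) ? "ok" : "not http(s)"), "\n";
}
```

Whether to die, as in the check above, or to skip the URL is a design choice; for a crawler working through a queue of scraped links, skipping is probably the gentler option.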
