Searching from a Perl script?

Posted 2024-10-14 04:14:26 · 1436 characters · 4 views · 0 comments

草莓酥 2024-10-21 04:14:26

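Before going any further, please be aware of the Google Terms of Service.

You agree not to access (or attempt to access) any of the Services by any means other than through the interface that is provided by Google, unless you have been specifically allowed to do so in a separate agreement with Google. You specifically agree not to access (or attempt to access) any of the Services through any automated means (including use of scripts or web crawlers) and shall ensure that you comply with the instructions set out in any robots.txt file present on the Services.

That said, there is an official API for querying web search programmatically.

The JSON/Atom Custom Search API lets you develop websites and programs that retrieve and display search results from Google Custom Search programmatically. With this API, you can use RESTful requests to get search results in either JSON or Atom format.

You can use XML::Atom::Client, or LWP plus JSON::Any, or any number of other libraries to make the REST calls.

(You may still find references to the older Google Web Search API, but it is deprecated and rate-limited.)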
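
To make the REST call above more concrete, here is a minimal sketch of how a query against the Custom Search JSON API might look from Perl. It is not taken from the original answer: the helper names (escape_param, build_search_url, extract_links, search) and the environment variables GOOGLE_API_KEY and GOOGLE_CX are made up for illustration, and it assumes you already have an API key and a search engine ID (cx). JSON::PP stands in for JSON::Any only because it ships with Perl, and LWP::UserAgent is loaded lazily so the URL-building and parsing parts can be tried without it.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Endpoint of the Custom Search JSON API.
my $ENDPOINT = 'https://www.googleapis.com/customsearch/v1';

# Percent-encode a value for use in a query string.
# Hand-rolled only to keep the sketch dependency-free; URI::Escape is the usual choice.
sub escape_param {
    my ($text) = @_;
    utf8::encode($text);    # operate on UTF-8 bytes
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord $1)/ge;
    return $text;
}

# Build the request URL; $start is the 1-based index of the first result.
sub build_search_url {
    my ($key, $cx, $query, $start) = @_;
    my @pairs = ('key' => $key, 'cx' => $cx, 'q' => $query, 'start' => $start // 1);
    my @terms;
    while (my ($name, $value) = splice @pairs, 0, 2) {
        push @terms, "$name=" . escape_param($value);
    }
    return "$ENDPOINT?" . join '&', @terms;
}

# Pull the result URLs out of a JSON response body (raw UTF-8 bytes).
sub extract_links {
    my ($json_bytes) = @_;
    my $data = decode_json($json_bytes);
    return map { $_->{link} } @{ $data->{items} // [] };
}

# Perform one request and return the result URLs (hypothetical helper).
sub search {
    my ($key, $cx, $query, $start) = @_;
    require LWP::UserAgent;    # HTTPS also needs LWP::Protocol::https installed
    my $res = LWP::UserAgent->new->get(build_search_url($key, $cx, $query, $start));
    die 'Search request failed: ' . $res->status_line . "\n" unless $res->is_success;
    return extract_links($res->decoded_content(charset => 'none'));
}

# Only hit the network when a search term and credentials are supplied.
# GOOGLE_API_KEY and GOOGLE_CX are assumed variable names, not an official convention.
if (@ARGV && $ENV{GOOGLE_API_KEY} && $ENV{GOOGLE_CX}) {
    print "$_\n" for search($ENV{GOOGLE_API_KEY}, $ENV{GOOGLE_CX}, "@ARGV");
}
```

Run with the two environment variables set and the search terms as arguments, it should print one result URL per line. Further pages can be requested by passing a larger start value.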

如果没结果 2024-10-21 04:14:26

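Take a look at the Google Custom Search API:
http://code.google.com/apis/customsearch/

If you need to search across a broader set of hosts, you would have to use the older, deprecated Web Search API, which limits the number of queries you can make per day.

Beyond that, you are left with a lot of HTML scraping and parsing.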

鲜肉鲜肉永远不皱 2024-10-21 04:14:26

Here is what a simple script could look like (and yes, it violates the TOS, so it is only a PoC and you shouldn't use it...)

use 5.010;
use strict;
use warnings;
use WWW::Mechanize;
use URI::Escape qw(uri_escape);

my $mech = WWW::Mechanize->new;

# highest result offset to fetch per query (first command-line argument;
# defaults to 0, i.e. only the first page)
my $option = shift // 0;

# you may customize your google search by editing this url (always end it with "q=" though)
my $google = 'http://www.google.co.uk/search?q=';
my @dork   = ( "this is my search one", "this is my search two" );

# start the main loop, one iteration for every google search
for my $query (@dork) {

    # restart paging at the first result for each query
    my $max = 0;

    # loop until the chosen maximum number of results is reached
    while ( $max <= $option ) {

        # escape the query so spaces and other special characters survive in the URL
        $mech->get( $google . uri_escape($query) . "&start=" . $max );

        # print every link that is neither site-relative nor pointing back at google
        for my $link ( $mech->links() ) {
            my $google_url = $link->url;
            if ( $google_url !~ /^\// && $google_url !~ /google/ ) {
                say $google_url;
            }
        }
        $max += 10;
    }
}

By the way, I wrote this a while back, so it's not exactly up to par, but it does the job, and I'm too lazy to boot Linux to find the newer version of it...
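
For anyone who wants to check the paging and filtering logic of the script without sending any requests, the two decisions it makes can be pulled out into small functions. This is only an illustrative sketch: page_offsets and is_external_result are names I made up, and the sample URLs are invented.

```perl
use strict;
use warnings;

# Offsets for the "&start=" parameter: 0, 10, 20, ... up to and including $limit.
sub page_offsets {
    my ($limit, $step) = @_;
    $step ||= 10;
    my @offsets;
    for (my $offset = 0; $offset <= $limit; $offset += $step) {
        push @offsets, $offset;
    }
    return @offsets;
}

# Same test as in the script: drop site-relative links and anything mentioning "google".
sub is_external_result {
    my ($url) = @_;
    return $url !~ m{^/} && $url !~ /google/;
}

# Invented sample links, standing in for what $mech->links() might return.
my @sample = (
    '/search?q=x',
    'http://www.google.co.uk/preferences',
    'http://example.org/page',
);

print "$_\n" for grep { is_external_result($_) } @sample;    # prints http://example.org/page
print join(',', page_offsets(25)), "\n";                     # prints 0,10,20
```

Note that the filter is crude: it also throws away any legitimate result whose URL happens to contain the string "google".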
