How do I download link targets from a website using Perl?

Posted 2024-09-09 00:00:58

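I just wrote a script that grabs links from a website and saves them to a text file.

Now I'm working on my regex so that it picks out, from the text file, the links whose URL contains php?dl=:

For example: www.example.com/site/admin/a_files.php?dl=33931

It's pretty much the address you get when you hover over a dl button on the site. From there you can click to download or "right click, save".

I'm wondering how to achieve this from the script: download the content at the given address, which will be a *.txt file. All from the script, of course.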


ㄟ。诗瑗 2024-09-16 00:00:58

Make WWW::Mechanize your new best friend.

Here's why:

It can identify links on a webpage that match a specific regex (/php\?dl=/ in this case)
It can follow those links through the follow_link method
It can get the targets of those links and save them to file

All this without needing to save your wanted links in an intermediate file! Life's sweet when you have the right tool for the job...

Example

use strict;
use warnings;
use WWW::Mechanize;

my $url  = 'http://www.example.com/';
my $mech = WWW::Mechanize->new();

$mech->get ( $url );

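# Match on the link URL (url_regex), not on the visible link text
my @linksOfInterest = $mech->find_all_links ( url_regex => qr/php\?dl=/ );

my $fileNumber = 1;

foreach my $link (@linksOfInterest) {

    # ':content_file' writes the response body straight to the named file
    $mech->get ( $link->url_abs, ':content_file' => "file".($fileNumber++).".txt" );
    $mech->back();
}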

逐鹿 2024-09-16 00:00:58

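You can download the file with LWP::UserAgent:

use LWP::UserAgent;
my $ua = LWP::UserAgent->new();
my $response = $ua->get($url, ':content_file' => 'file.txt');
die $response->status_line unless $response->is_success;

Or, if you need a filehandle, fetch the URL without ':content_file' instead and open the in-memory response body:

my $response = $ua->get($url);
open my $fh, '<', $response->content_ref or die $!;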
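
For the case in the question, where the links are already saved in a text file, the filtering and download steps might be combined roughly as follows. This is only a sketch under stated assumptions: the helper names filter_download_links and download_all, the file1.txt, file2.txt, ... naming scheme, and the one-URL-per-line input format are all made up for illustration, and lines without a scheme are assumed to be plain http.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the lines that contain "php?dl=".
sub filter_download_links {
    my @lines = @_;
    my @links;
    for my $line (@lines) {
        $line =~ s/^\s+|\s+$//g;              # trim whitespace and newline
        next unless $line =~ /php\?dl=/;      # the pattern from the question
        # Assumption: a line with no scheme is reachable over plain http
        $line = "http://$line" unless $line =~ m{^https?://}i;
        push @links, $line;
    }
    return @links;
}

# Hypothetical helper: save each link target as file1.txt, file2.txt, ...
sub download_all {
    my @links = @_;
    require LWP::UserAgent;                   # loaded only when downloading
    my $ua = LWP::UserAgent->new();
    my $n  = 1;
    for my $link (@links) {
        my $file     = 'file' . $n++ . '.txt';
        my $response = $ua->get($link, ':content_file' => $file);
        warn "Failed to fetch $link: ", $response->status_line, "\n"
            unless $response->is_success;
    }
    return;
}

# Usage: perl download_links.pl links.txt   (script and file names are assumptions)
if (my $list = shift @ARGV) {
    open my $in, '<', $list or die "Cannot open $list: $!";
    my @links = filter_download_links(<$in>);
    close $in;
    download_all(@links);
}
```

Keeping the regex filter in its own function means it can be checked against a few sample lines before any request is sent.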

燕归巢 2024-09-16 00:00:58

Old question, but when I'm doing quick scripts, I often use "wget" or "curl" and pipe. This isn't cross-system portable, perhaps, but if I know my system has one or the other of these commands, it's generally good.

For example:

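#! /usr/bin/env perl
use strict;
use warnings;

# List-form pipe open runs curl without a shell; -s hides the progress meter
open my $fp, '-|', 'curl', '-s', 'http://www.example.com/'
    or die "Cannot run curl: $!";
while (<$fp>) {
  print;
}
close $fp or warn "curl exited with status $?";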
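
The pipe above prints the page to standard output. To save a download link's target to a file, as the question asks, one option is to let curl write the file itself. The sketch below assumes curl is installed; the helper name curl_command and the script name fetch.pl are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: build the curl argument list for one download.
sub curl_command {
    my ($url, $file) = @_;
    # -s: silent, -L: follow redirects, -o: write the body to the named file
    return ('curl', '-s', '-L', '-o', $file, $url);
}

# Usage: perl fetch.pl 'http://www.example.com/site/admin/a_files.php?dl=33931' out.txt
if (@ARGV == 2) {
    my @cmd = curl_command(@ARGV);
    # List-form system() bypasses the shell, so the "?" in the URL needs no quoting
    system(@cmd) == 0 or die "curl failed with status $?";
}
```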
