How do I use Perl LWP to crawl past a welcome page?

Posted 2024-12-11 07:07:45 | 835 characters | 0 views | 0 comments

I'm trying to crawl this page using Perl LWP:

http://livingsocial.com/cities/86/deals/138811-hour-long-photo-session-cd-and-more

I had code that used to handle LivingSocial, but it seems to have stopped working. The idea was to request the page once, extract the cookies from the response, set them on the UserAgent, and then request the page twice more. Doing this used to get you past the welcome page:

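# First request: only used to collect the cookies the site sets.
$response = $browser->get($url);
$cookie_jar->extract_cookies($response);
# Attach the jar so later requests send those cookies back.
$browser->cookie_jar($cookie_jar);
# Request the page twice more.
$response = $browser->get($url);
$response = $browser->get($url);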

This has stopped working for normal LivingSocial pages, but it still seems to work for LivingSocial Escapes pages. For example:

http://livingsocial.com/escapes/148029-cook-islands-hotel-+-airfare

Any tips on how to get past the welcome page?
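
For reference, the cookie-based approach described above can be written as a self-contained sketch. It assumes an in-memory cookie jar is attached when the agent is created, in which case LWP::UserAgent stores cookies from each response and sends them on later requests without a separate extract_cookies call. The helper names make_browser and fetch_repeatedly and the User-Agent string are hypothetical, chosen only for illustration. Note that LWP does not execute JavaScript, so this only helps when the welcome page is controlled by cookies alone.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Build a user agent with an in-memory cookie jar. With a jar attached,
# LWP keeps cookies from every response and sends them on later requests.
sub make_browser {
    return LWP::UserAgent->new(
        cookie_jar => {},             # empty hash ref => temporary HTTP::Cookies jar
        agent      => 'Mozilla/5.0',  # assumption: a browser-like User-Agent string
    );
}

# Request $url $attempts times with the same agent; return the last response.
sub fetch_repeatedly {
    my ($browser, $url, $attempts) = @_;
    my $response;
    for (1 .. $attempts) {
        $response = $browser->get($url);
    }
    return $response;
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my $response = fetch_repeatedly(make_browser(), $ARGV[0], 3);
    print $response->status_line, "\n";
    print length($response->decoded_content // ''), " characters of content\n";
}
```

Run it as `perl fetch.pl URL` (the script name is arbitrary). If the final response still contains the welcome page, the page is probably not gated by cookies alone.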

Comments (1)

寄居人 2024-12-18 07:07:45

It looks like this page only works with a JavaScript-enabled browser (which LWP::UserAgent is not). You could try WWW::Mechanize::Firefox instead:

use WWW::Mechanize::Firefox;
my $mech = WWW::Mechanize::Firefox->new();
$mech->get($url);

Note that you must have Firefox and the mozrepl extension installed for this module to work.
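
A slightly fuller sketch of the same idea, assuming Firefox is already running with mozrepl enabled. It wraps the calls above in a hypothetical helper, fetch_rendered_html, and reads the page back with the module's title and content methods, which return the document as Firefox currently holds it.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize::Firefox;

# Load $url in Firefox (via mozrepl) and return the page title and HTML.
# fetch_rendered_html is an illustrative name, not part of the module.
sub fetch_rendered_html {
    my ($url) = @_;
    my $mech = WWW::Mechanize::Firefox->new();
    $mech->get($url);
    # content() returns the HTML of the document as loaded in the browser.
    return ($mech->title, $mech->content);
}

# Only drive the browser when a URL is given on the command line.
if (@ARGV) {
    my ($title, $html) = fetch_rendered_html($ARGV[0]);
    print "Title: $title\n";
    print $html;
}
```

Because the page is loaded in a real browser, any welcome-page logic that runs in JavaScript is executed before the HTML is read back.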
