WWW::Mechanize (Perl): login only works after relaunching the script

Posted on 2024-08-26 14:18:05

I'm trying to log in to a website automatically using Perl with WWW::Mechanize.

What I do is:

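use strict;
use warnings;
use WWW::Mechanize;
use HTTP::Cookies;

my $bot = WWW::Mechanize->new();
# Persist cookies (including session cookies) to cookies.txt
$bot->cookie_jar(
        HTTP::Cookies->new(
            file           => "cookies.txt",
            autosave       => 1,
            ignore_discard => 1,
        )
);

my $response = $bot->get( 'http://blah.foo/login' );

# Fill in and submit the first form on the login page
$bot->form_number(1);

$bot->field( usern => 'user' );
$bot->field( pass => 'pass' );
$response = $bot->click();

print $response->content();

# Request the home page with the cookies obtained above
$response = $bot->get( 'http://blah.foo' );

print $response->content();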

The login itself works, but when I then load the home page, it tells me that I am not logged in.

As you can see, I store the cookies in a file. If I relaunch the script without the login part, the page says that I am logged in...

Does anyone understand this strange behaviour?

Edit: In fact, I noticed that the same problem occurs with some web browsers on certain platforms. The page says "Not logged in", but reloading the page is enough to be logged in.

In the script, I tried fetching the page twice in a row, but that did not help. The only thing that works is launching the script twice.

With curl, it worked when I sent the last request twice.

Comments (1)

皓月长歌 2024-09-02 14:18:05

Some websites I have seen don't set or handle their session cookies correctly on every page, so they fail if you access their pages in an "unexpected" order. For example, the login page, the login handler page, or some popup content page may expect to see a session cookie that has already been set by a normal page of the site.

This sounds like your problem, because it works the second time, when the cookie is already set before you fetch the page.
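One way to check this guess is to print the cookie jar after every request and see which page actually sets the session cookie. The following is only a diagnostic sketch: the helper name `cookie_lines` is made up for illustration, and the script simply fetches whatever URLs are passed on the command line, in order, with a single in-memory jar.

```perl
use strict;
use warnings;
use WWW::Mechanize;
use HTTP::Cookies;

# Hypothetical helper: return one "domain path name=value" string
# per cookie currently held in the jar.
sub cookie_lines {
    my ($jar) = @_;
    my @lines;
    # scan() calls the callback once per cookie; the first five
    # arguments are version, name, value, path and domain.
    $jar->scan(sub {
        my (undef, $name, $value, $path, $domain) = @_;
        push @lines, "$domain $path $name=$value";
    });
    my @sorted = sort @lines;
    return @sorted;
}

# autocheck => 0 so that a failed request is reported instead of dying.
my $bot = WWW::Mechanize->new(
    autocheck  => 0,
    cookie_jar => HTTP::Cookies->new,   # in-memory jar, nothing written to disk
);

# Fetch the URLs given on the command line, in order, sharing one jar.
for my $url (@ARGV) {
    my $response = $bot->get($url);
    print "GET $url -> ", $response->status_line, "\n";
    print "    cookie: $_\n" for cookie_lines( $bot->cookie_jar );
}
```

Running it as, say, `perl trace_cookies.pl http://blah.foo/login http://blah.foo` (the script name is arbitrary) and comparing the output with a run that starts at the home page should show whether the session cookie only appears after a "normal" page has been visited.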

I have worked around this kind of problem by simulating a more typical browser session in my script, fetching some "normal" pages before going to the actual login part:

$www->get('http://www.example.com');         # Homepage
$www->get('http://www.example.com/account'); # Authenticated section front page
# Now everything is set up, proceed with account login...
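Put together with the code from the question, the workaround might look roughly like the sketch below. It is an assumption-laden illustration, not a confirmed fix: `login_with_warmup` is a hypothetical helper, the URLs and the field names `usern` and `pass` are the placeholders from the question, and which pages need to be visited first depends entirely on the site.

```perl
use strict;
use warnings;
use WWW::Mechanize;
use HTTP::Cookies;

# Hypothetical helper: visit some "normal" pages first, then submit the
# login form, then fetch the page that requires a logged-in session.
sub login_with_warmup {
    my ($bot, %args) = @_;

    # Warm-up requests, so the site can set its session cookie.
    $bot->get($_) for @{ $args{warmup} };

    # Load the login page and submit its first form.
    $bot->get( $args{login_url} );
    $bot->submit_form(
        form_number => 1,
        fields      => $args{fields},
    );

    # Finally request the target page with the same cookie jar.
    return $bot->get( $args{target} );
}

# Only runs when a username and password are given on the command line.
if ( @ARGV == 2 ) {
    my ($user, $pass) = @ARGV;

    my $bot = WWW::Mechanize->new(
        autocheck  => 1,    # die on HTTP errors
        cookie_jar => HTTP::Cookies->new(
            file           => 'cookies.txt',
            autosave       => 1,
            ignore_discard => 1,
        ),
    );

    my $response = login_with_warmup(
        $bot,
        warmup    => ['http://blah.foo/'],          # placeholder home page
        login_url => 'http://blah.foo/login',
        fields    => { usern => $user, pass => $pass },
        target    => 'http://blah.foo/',
    );
    print $response->decoded_content;
}
```

Keeping the sequence in one function makes it easy to try different warm-up pages until the site accepts the login within a single run.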