jsoup posting and cookie
I'm trying to use jsoup to log in to a site and then scrape information, but I'm running into a problem. I can log in successfully and create a Document from index.php, but I cannot get other pages on the site. I know I need to store a cookie after the POST and then send it when I open another page on the site. But how do I do this? The following code logs me in and gets index.php:
Document doc = Jsoup.connect("http://www.example.com/login.php")
        .data("username", "myUsername",
              "password", "myPassword")
        .post();
I know I could use Apache HttpClient to do this, but I don't want to.
Comments (6)
When you log in to the site, it probably sets an authorised session cookie that must be sent on subsequent requests to maintain the session.
You can get the cookie like this:
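A minimal sketch of this step, assuming the login URL and form fields from the question. The cookie name `SESSIONID` is an assumption, so check which cookie the site actually sets. The class name `LoginCookie` and the helper `loginRequest` are made up for illustration.

```java
import java.io.IOException;
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class LoginCookie {
    // Builds the login POST without sending it (hypothetical helper).
    // The URL and field names are taken from the question.
    static Connection loginRequest(String username, String password) {
        return Jsoup.connect("http://www.example.com/login.php")
                .data("username", username, "password", password)
                .method(Connection.Method.POST);
    }

    public static void main(String[] args) throws IOException {
        // execute() returns the whole response (headers, cookies, body),
        // whereas post() returns only the parsed Document.
        Connection.Response res = loginRequest("myUsername", "myPassword").execute();

        // The page returned after login (index.php in the question).
        Document doc = res.parse();
        System.out.println(doc.title());

        // "SESSIONID" is an assumed cookie name; the real one depends on the site.
        String sessionId = res.cookie("SESSIONID");
        System.out.println("Session cookie: " + sessionId);
    }
}
```

Printing `res.cookies()` once is a quick way to find out what the session cookie is really called.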
And then send it on the next request like:
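A sketch of the follow-up request, again with `SESSIONID` as an assumed cookie name. The page URL `otherPage`, the class name `SendCookie`, and the helper `withSession` are hypothetical placeholders.

```java
import java.io.IOException;
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class SendCookie {
    // Prepares a request that carries the session cookie (hypothetical helper).
    static Connection withSession(String url, String sessionId) {
        // "SESSIONID" is an assumed name; use the cookie name the site sets.
        return Jsoup.connect(url).cookie("SESSIONID", sessionId);
    }

    public static void main(String[] args) throws IOException {
        // In real use this value comes from the login response.
        String sessionId = args.length > 0 ? args[0] : "value-from-login-response";

        // "otherPage" is a placeholder for a page that requires the session.
        Document doc2 = withSession("http://www.example.com/otherPage", sessionId).get();
        System.out.println(doc2.title());
    }
}
```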
Where the code was:
I was having difficulties until I changed it to:
Now it is working flawlessly.
Here is what you can try...
Now save all your cookies and make a request to the other page you want.
Making a request to another page:
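The steps above can be sketched as follows. This variant keeps every cookie from the login response instead of picking one by name, so the cookie name does not need to be known. The login URL and form fields come from the question. The `otherPage` URL, the class name `AllCookies`, and the helper `withCookies` are assumptions for illustration.

```java
import java.io.IOException;
import java.util.Map;
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class AllCookies {
    // Prepares a request that sends every saved cookie (hypothetical helper).
    static Connection withCookies(String url, Map<String, String> cookies) {
        return Jsoup.connect(url).cookies(cookies);
    }

    public static void main(String[] args) throws IOException {
        // Log in and keep the full response so the cookies are accessible.
        Connection.Response res = Jsoup.connect("http://www.example.com/login.php")
                .data("username", "myUsername", "password", "myPassword")
                .method(Connection.Method.POST)
                .execute();

        // Save all cookies set by the login response.
        Map<String, String> cookies = res.cookies();

        // Request another page with those cookies; "otherPage" is a placeholder.
        Document page = withCookies("http://www.example.com/otherPage", cookies).get();
        System.out.println(page.title());
    }
}
```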
Ask if you need further help.
Why reconnect?
If there are any cookies needed to avoid a 403 status, I send them.