Jsoup cookies for HTTPS scraping

Posted on 2024-11-30 19:26:00

I am experimenting with this site to learn Jsoup and Android: I want to scrape my username from the welcome page. I am using the following code:

Connection.Response res = Jsoup.connect("http://www.mikeportnoy.com/forum/login.aspx")
    .data("ctl00$ContentPlaceHolder1$ctl00$Login1$UserName", "username", "ctl00$ContentPlaceHolder1$ctl00$Login1$Password", "password")
    .method(Method.POST)
    .execute();
String sessionId = res.cookie(".ASPXAUTH");

Document doc2 = Jsoup.connect("http://www.mikeportnoy.com/forum/default.aspx")
    .cookie(".ASPXAUTH", sessionId)
    .get();

My cookie (.ASPXAUTH) always ends up null. If I delete this cookie in a web browser, I lose my session, so I am sure it is the correct cookie. In addition, if I change the code to

.cookie(".ASPXAUTH", "jkaldfjjfasldjf")  // using the real cookie value, of course

I am able to scrape my login name from this page, which also makes me think I have the right cookie. So why does my cookie come back null? Are my username and password field names incorrect? Is it something else?

Thanks.

Comments (3)

舟遥客 2024-12-07 19:26:00

I know I'm kinda late by 10 months here. But a good option using Jsoup is to use this easy peasy piece of code:

// This will get you the response.
Connection.Response res = Jsoup
    .connect("url")
    .data("loginField", "[email protected]", "passField", "pass1234")
    .method(Method.POST)
    .execute();

// This will get you the cookies.
Map<String, String> cookies = res.cookies();

// And this is the easiest way I've found to remain in session.
Document doc = Jsoup.connect("url").cookies(cookies).get();

Though I'm still having trouble connecting to SOME websites, I connect to a whole lot of them with the same basic piece of code. Oh, and before I forget: what I figure my problem is, is SSL certificates. You have to manage them properly, in a way I still haven't quite figured out.
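
If a single cookie such as .ASPXAUTH comes back null, it can help to look at which cookie names the login response actually set before using any of them. Here is a minimal sketch in plain Java, not part of Jsoup: `CookieCheck` and `requireCookie` are hypothetical names, and the map stands in for whatever `res.cookies()` returned.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical helper, not part of Jsoup. */
final class CookieCheck {

    /**
     * Returns the value of the named cookie, or fails with a message that
     * lists the cookie names that were actually present.
     */
    static String requireCookie(Map<String, String> cookies, String name) {
        String value = cookies.get(name);
        if (value == null) {
            throw new IllegalStateException(
                    "No cookie named '" + name + "'; cookies present: "
                            + new TreeSet<>(cookies.keySet()));
        }
        return value;
    }

    public static void main(String[] args) {
        // Stand-in for res.cookies(); the name and value here are made up.
        Map<String, String> cookies = new LinkedHashMap<>();
        cookies.put(".ASPXAUTH", "example-token");
        System.out.println(requireCookie(cookies, ".ASPXAUTH"));
    }
}
```

With Jsoup this would be called as `requireCookie(res.cookies(), ".ASPXAUTH")`. If it throws, the error message shows what the server sent back instead, which is usually a hint that the login POST was not accepted.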

滥情哥ㄟ 2024-12-07 19:26:00

I always do this in two steps (like a normal human):

  1. Read login page (by GET, read cookies)
  2. Submit form and cookies (by POST, without cookie manipulation)

Example:

Connection.Response response = Jsoup.connect("http://www.mikeportnoy.com/forum/login.aspx")
        .method(Connection.Method.GET)
        .execute();

response = Jsoup.connect("http://www.mikeportnoy.com/forum/login.aspx")
        .data("ctl00$ContentPlaceHolder1$ctl00$Login1$UserName", "username")
        .data("ctl00$ContentPlaceHolder1$ctl00$Login1$Password", "password")
        .cookies(response.cookies())
        .method(Connection.Method.POST)
        .execute();

Document homePage = Jsoup.connect("http://www.mikeportnoy.com/forum/default.aspx")
        .cookies(response.cookies())
        .get();

And always pass the cookies from the previous request to the next one using

         .cookies(response.cookies())
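
One thing to watch with this pattern: as far as I know, `response.cookies()` only holds the cookies set by that response (and by any redirects it followed), so a cookie set by the first GET is not passed on to the third request unless the server sends it again. If that matters for your site, you can keep one map and merge each step into it. A minimal sketch in plain Java; `CookieMerge` and `merge` are hypothetical names, and the cookie names and values in `main` are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of Jsoup. */
final class CookieMerge {

    /**
     * Combines the cookie maps of several steps into one new map.
     * When a name appears more than once, the value from the later step wins.
     */
    @SafeVarargs
    static Map<String, String> merge(Map<String, String>... steps) {
        Map<String, String> jar = new LinkedHashMap<>();
        for (Map<String, String> step : steps) {
            jar.putAll(step);
        }
        return jar;
    }

    public static void main(String[] args) {
        // Stand-ins for the cookies() maps of the GET and POST responses.
        Map<String, String> fromGet = new LinkedHashMap<>();
        fromGet.put("ASP.NET_SessionId", "abc123");
        Map<String, String> fromPost = new LinkedHashMap<>();
        fromPost.put(".ASPXAUTH", "example-token");

        System.out.println(merge(fromGet, fromPost).keySet());
    }
}
```

In the example above that would mean keeping the GET response in its own variable and passing `merge(getResponse.cookies(), postResponse.cookies())` to `.cookies(...)` on the last request.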

SSL is not the important part here. If you do have problems with certificates, call this method to skip the SSL checks; it accepts every certificate and every hostname.

public static void trustEveryone() {
    try {
        HttpsURLConnection.setDefaultHostnameVerifier(new HostnameVerifier() {
            public boolean verify(String hostname, SSLSession session) {
                return true;
            }
        });

        SSLContext context = SSLContext.getInstance("TLS");
        context.init(null, new X509TrustManager[]{new X509TrustManager() {
            public void checkClientTrusted(X509Certificate[] chain, String authType) throws CertificateException { }

            public void checkServerTrusted(X509Certificate[] chain, String authType) throws CertificateException { }

            public X509Certificate[] getAcceptedIssuers() {
                return new X509Certificate[0];
            }
        }}, new SecureRandom());
        HttpsURLConnection.setDefaultSSLSocketFactory(context.getSocketFactory());
    } catch (Exception e) { // should never happen
        e.printStackTrace();
    }
}
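
Note that the method above changes the defaults for every HttpsURLConnection in the JVM, and they stay changed. If you only need this for a couple of requests, one option is to remember the previous defaults and put them back afterwards. A rough sketch using only JDK classes; `ScopedTrustAll` and `install` are hypothetical names, and it assumes your HTTP calls go through HttpsURLConnection and pick up its defaults.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.security.cert.X509Certificate;
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocketFactory;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

/** Hypothetical helper: trust-all defaults that are undone by close(). */
final class ScopedTrustAll implements AutoCloseable {
    private final SSLSocketFactory previousFactory;
    private final HostnameVerifier previousVerifier;

    private ScopedTrustAll(SSLSocketFactory factory, HostnameVerifier verifier) {
        this.previousFactory = factory;
        this.previousVerifier = verifier;
    }

    /** Installs trust-all defaults and returns a handle that restores the old ones. */
    static ScopedTrustAll install() {
        // Remember the JVM-wide defaults before replacing them.
        ScopedTrustAll scope = new ScopedTrustAll(
                HttpsURLConnection.getDefaultSSLSocketFactory(),
                HttpsURLConnection.getDefaultHostnameVerifier());
        try {
            TrustManager[] trustAll = { new X509TrustManager() {
                @Override public void checkClientTrusted(X509Certificate[] chain, String authType) { }
                @Override public void checkServerTrusted(X509Certificate[] chain, String authType) { }
                @Override public X509Certificate[] getAcceptedIssuers() { return new X509Certificate[0]; }
            } };
            SSLContext context = SSLContext.getInstance("TLS");
            context.init(null, trustAll, new SecureRandom());
            HttpsURLConnection.setDefaultSSLSocketFactory(context.getSocketFactory());
            HttpsURLConnection.setDefaultHostnameVerifier((hostname, session) -> true);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not set up the trust-all SSL context", e);
        }
        return scope;
    }

    /** Puts back the defaults that were active before install(). */
    @Override
    public void close() {
        HttpsURLConnection.setDefaultSSLSocketFactory(previousFactory);
        HttpsURLConnection.setDefaultHostnameVerifier(previousVerifier);
    }
}
```

Usage would be `try (ScopedTrustAll scope = ScopedTrustAll.install()) { ... }` around the requests that need it; the old socket factory and hostname verifier come back when the block ends.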

溺深海 2024-12-07 19:26:00

What if you try fetching and passing all the cookies, without assuming anything, like this: "Sending POST request with username and password and save session cookie"

If you still have problems, try looking into this: "Issues with passing cookies to GET request (after POST)"
