How do I manage cookies with Jsoup?

Posted on 2024-12-29 14:25:24

Is there a simple cookie manager in Jsoup that stores cookies by host? The example in this thread is quite lacking.

Comments (2)

别在捏我脸啦 2025-01-05 14:25:24

I didn't find a standard solution that works with Jsoup. Here's my simple cookie handling using a HashMap. It's probably missing a lot of functionality, but I hope it will work well enough for my basic crawler:

// Imports needed by the enclosing class
import java.net.URL;
import java.util.HashMap;
import java.util.Map;

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Cookies seen so far: host name -> (cookie name -> cookie value)
private static final Map<String, Map<String, String>> host2cookies = new HashMap<>();

// Fetches a page, sending the cookies stored for its host and saving the ones it receives.
// Returns {final URL, page HTML}.
public static String[] DownloadPage(URL url) throws Exception {
    Connection con = Jsoup.connect(url.toString()).timeout(600000); // timeout in milliseconds
    loadCookiesByHost(url, con);

    Document doc = con.get();
    url = con.response().url(); // final URL, which may differ from the requested one after redirects

    storeCookiesByHost(url, con);

    return new String[]{url.toString(), doc.html()};
}

// Attaches every cookie stored for the URL's host to the outgoing request.
private static void loadCookiesByHost(URL url, Connection con) {
    try {
        Map<String, String> cookies = host2cookies.get(url.getHost());
        if (cookies != null) {
            con.cookies(cookies);
        }
    } catch (Throwable t) {
        // TODO: move to a logger
        System.err.println(t + ":: Error loading cookies to: " + url);
    }
}

// Merges the cookies from the response into the map kept for the URL's host.
private static void storeCookiesByHost(URL url, Connection con) {
    try {
        host2cookies
            .computeIfAbsent(url.getHost(), host -> new HashMap<>())
            .putAll(con.response().cookies());
    } catch (Throwable t) {
        // TODO: move to a logger
        System.err.println(t + ":: Error saving cookies from: " + url);
    }
}

入画浅相思 2025-01-05 14:25:24

The Connection.Base interface has everything you need to know about how jsoup deals with cookies.

Essentially, it will let you get and set them on each connection, but beyond that it's up to you to "manage" them.
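
Since the "managing" part is left to the caller, here is a rough sketch of what a small host-keyed cookie jar could look like using only the JDK. HostCookieJar, save, and load are hypothetical names made up for this illustration, not part of jsoup. It treats host names as case-insensitive and ignores cookie paths, expiry, and domain matching, so take it as a starting point rather than a complete cookie implementation.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper (not part of jsoup): a thread-safe cookie jar keyed by host.
final class HostCookieJar {
    // host -> (cookie name -> cookie value)
    private final Map<String, Map<String, String>> byHost = new ConcurrentHashMap<>();

    // Merge cookies received from a host; later values overwrite earlier ones.
    public void save(String host, Map<String, String> received) {
        byHost.computeIfAbsent(normalize(host), h -> new ConcurrentHashMap<>())
              .putAll(received);
    }

    // Read-only snapshot of the cookies known for a host; empty if there are none.
    public Map<String, String> load(String host) {
        Map<String, String> cookies = byHost.get(normalize(host));
        if (cookies == null) {
            return Collections.emptyMap();
        }
        return Collections.unmodifiableMap(new HashMap<>(cookies));
    }

    // Assumption: host names are compared case-insensitively.
    private static String normalize(String host) {
        return host.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        HostCookieJar jar = new HostCookieJar();
        jar.save("Example.com", Map.of("sid", "abc"));
        jar.save("example.com", Map.of("lang", "en"));
        System.out.println(jar.load("example.com").size());      // prints 2
        System.out.println(jar.load("other.example").isEmpty()); // prints true
    }
}
```

With jsoup, the map returned by load(url.getHost()) could be passed to Connection.cookies(Map) before the request, and the map from Connection.Response.cookies() handed to save afterwards.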
