Is it possible to open multiple connections to multiple sites using only one thread?

Posted on 2024-12-07 13:18:28

Update

I already use a FixedThreadPool. Currently, each thread opens one connection to one site. What I want is something asynchronous:

1. Send a request to a server.
2. Move on to the next request without waiting for the first one to complete.
3. Once a connection is established, notify another thread that it is ready for download.

I think this will speed up execution, because fewer threads would be needed to open the same number of connections as now, or more.

In the current approach, each thread sits idle while it waits for its connection to be established. With the new approach, the thread would always be doing useful work.
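The steps above can be sketched with java.nio non-blocking channels and a single Selector. This is only a minimal sketch under some assumptions: the class name SingleThreadConnector, the method connectAll and the ready queue are hypothetical names, not part of the crawler. It works at the TCP level only, so the HTTP request (and TLS, if needed) would still have to be handled by whoever takes the channel from the queue. Building an InetSocketAddress from a host name also resolves DNS, which blocks.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.nio.channels.UnresolvedAddressException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: one thread starts many connects and collects the ones that succeed.
final class SingleThreadConnector {

    /**
     * Starts a non-blocking connect to every target, then waits on one Selector
     * until each attempt has succeeded, failed, or the timeout has elapsed.
     * Connected channels are placed on the returned queue for downloader threads.
     */
    static BlockingQueue<SocketChannel> connectAll(List<InetSocketAddress> targets,
                                                   long timeoutMillis) throws IOException {
        BlockingQueue<SocketChannel> ready = new LinkedBlockingQueue<>();
        Set<SocketChannel> pending = new HashSet<>();
        try (Selector selector = Selector.open()) {
            // Steps 1 and 2: start every connect without waiting for any of them.
            for (InetSocketAddress target : targets) {
                SocketChannel channel = SocketChannel.open();
                try {
                    channel.configureBlocking(false);
                    if (channel.connect(target)) {
                        ready.add(channel); // connected immediately (can happen locally)
                    } else {
                        channel.register(selector, SelectionKey.OP_CONNECT);
                        pending.add(channel);
                    }
                } catch (IOException | UnresolvedAddressException e) {
                    channel.close(); // this target failed right away; skip it
                }
            }
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            // Step 3: react to whichever connection completes next.
            while (!pending.isEmpty()) {
                long remaining = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remaining <= 0) {
                    break; // overall timeout reached
                }
                selector.select(remaining);
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (!key.isValid() || !key.isConnectable()) {
                        continue;
                    }
                    SocketChannel channel = (SocketChannel) key.channel();
                    try {
                        if (channel.finishConnect()) {
                            pending.remove(channel);
                            key.cancel();       // stop watching this channel
                            ready.add(channel); // hand it over for download
                        }
                    } catch (IOException e) {
                        pending.remove(channel); // connection refused or otherwise failed
                        channel.close();
                    }
                }
            }
            // Anything still pending did not connect in time.
            for (SocketChannel channel : pending) {
                channel.close();
            }
        }
        return ready;
    }
}
```

In a crawler, the downloader threads would presumably call take() on a shared queue while the connector thread keeps running, instead of receiving the queue only after all attempts have finished as in this simplified version.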

The Question

I want to know whether there is a way to open connections to multiple sites with only one thread.

I'm writing a web crawler. I already use one thread per connection, but beyond a certain number of threads this no longer helps, because the threads compete more and more for processor time.

I want to increase the rate at which pages are downloaded. Is this possible? How?

This code opens a connection and does some processing. It is executed by the threads that open connections:

/*
 * Open connection to a server
 */
boolean openConnection(Link link) throws Exception {
    // set the connection parameters
    HttpURLConnection conn = (HttpURLConnection) new URL(link.getOriginalURL().getURL()).openConnection();
    conn.setRequestProperty("User-Agent", ROBOT_NAME);
    conn.setInstanceFollowRedirects(true);
    conn.setConnectTimeout(READ_TIMEOUT);
    conn.setReadTimeout(READ_TIMEOUT);
    link.setConnection(conn);
    //open the connection
    conn.connect();        
    //check the server answer
    if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {
        return false;
    }
    // analyse the URL the connection was redirected to
    urlAnalyzer.fillURL(link.getRedirectedURL(), getRedirectedURL(link.getConnection()));
    return true;
}

This starts the connection openers, each one in its own thread:

/*
 * Start the execution of the connection openers     
 */
private void executeConnectionOpeners() {
    LOGGER.info("Starting connection openers.");
    /* Execution */
    NameThreadFactory ntf = new NameThreadFactory("Connection Opener");
    crawlerOpenerExecutor = Executors.newFixedThreadPool(nOpeners, ntf);
    for (int i = 0; i < nOpeners; i++) {
        crawlerOpenerExecutor.submit(new ConnectionOpener(this));
    }
    /* End of execution */
    LOGGER.info(nOpeners + " connection openers created and running.");
}
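
For comparison with openConnection above, here is a rough sketch of the same idea at the HTTP level using java.net.http.HttpClient (available since Java 11), whose sendAsync method returns immediately with a CompletableFuture. The class AsyncFetcher, the method fetchStatuses and the "ExampleBot" user agent are made-up names for illustration. Note that HttpClient manages its own internal threads, so this is not strictly "one thread", but the calling thread never waits on an individual connection.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: issues many HTTP requests from one calling thread.
final class AsyncFetcher {

    /**
     * Sends a GET to every URL without blocking between requests and returns
     * a map of URL to HTTP status code for the requests that completed.
     * Failed requests (timeouts, refused connections) are simply left out.
     */
    static Map<String, Integer> fetchStatuses(List<String> urls, Duration timeout) {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .connectTimeout(timeout)
                .build();
        Map<String, Integer> statuses = new ConcurrentHashMap<>();
        List<CompletableFuture<Void>> inFlight = new ArrayList<>();
        for (String url : urls) {
            HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                    .header("User-Agent", "ExampleBot") // placeholder for ROBOT_NAME
                    .timeout(timeout)
                    .GET()
                    .build();
            // sendAsync returns at once; the callback runs when the response arrives.
            inFlight.add(client.sendAsync(request, HttpResponse.BodyHandlers.discarding())
                    .thenAccept(response -> statuses.put(url, response.statusCode()))
                    .exceptionally(error -> null)); // ignore failures in this sketch
        }
        // Wait for all requests here only so the method can return a complete result.
        CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0])).join();
        return statuses;
    }
}
```

A real crawler would probably use a body handler that keeps the page content and would pass each response on to the processing stage from the callback, rather than joining on all requests at the end.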
