How to use HTTP, SOCKS4 and SOCKS5 proxies in Java?

Posted on 2024-08-18 06:17:28

I want to screen-scrape a website, and for that I want to use HTTP, SOCKS4 and SOCKS5 proxies. My questions are as follows:

  1. Is it possible to use these proxies from Java without any external API? For instance, is it possible to send a request with HttpURLConnection through these proxies?

  2. If that is not possible, what other external APIs can I use?

  3. I have been doing this with the headless browser provided by HtmlUnit, but it takes a long time to load even simple web pages. Could you suggest other APIs (if any) that provide a headless browser that loads pages quickly? I don't need to open pages that contain heavy AJAX or JavaScript code. I just need to click a form button through the headless browser.

Comments (3)

泪意 2024-08-25 06:17:28

Is it possible to use these proxies through Java without using any other external API? For instance, Is it possible to send a request through HttpURLConnection through these proxies?

Yes, you can configure proxies in three ways: with (global) system properties, with the Proxy class, or with a ProxySelector. The latter two options have been available since Java 5 and are more flexible. Have a look at Java Networking and Proxies, as mentioned by jarnbjo, for all the details.
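As a rough illustration of the system-property route, here is a minimal sketch. The class and method names (SystemPropertyProxies, useHttpProxy, useSocksProxy) and the host/port values are made up for this example. The property names are the standard JDK networking properties. socksProxyVersion is assumed to be the way to ask for SOCKS4 instead of the default SOCKS5, so check the networking properties documentation for your JDK version.

```java
public class SystemPropertyProxies {

    /** Routes HTTP and HTTPS traffic of the whole JVM through one HTTP proxy. */
    static void useHttpProxy(String host, int port) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", String.valueOf(port));
        System.setProperty("https.proxyHost", host);
        System.setProperty("https.proxyPort", String.valueOf(port));
    }

    /**
     * Routes TCP sockets of the whole JVM through a SOCKS proxy.
     * version is 4 or 5 (assumption: the JDK reads it from socksProxyVersion).
     */
    static void useSocksProxy(String host, int port, int version) {
        if (version != 4 && version != 5) {
            throw new IllegalArgumentException("SOCKS version must be 4 or 5");
        }
        System.setProperty("socksProxyHost", host);
        System.setProperty("socksProxyPort", String.valueOf(port));
        System.setProperty("socksProxyVersion", String.valueOf(version));
    }

    public static void main(String[] args) {
        // Hypothetical local SOCKS4 proxy; replace with a real host and port.
        useSocksProxy("127.0.0.1", 1080, 4);
        System.out.println("SOCKS proxy: "
                + System.getProperty("socksProxyHost") + ":"
                + System.getProperty("socksProxyPort")
                + " (version " + System.getProperty("socksProxyVersion") + ")");
    }
}
```

These settings are global, so every connection opened afterwards in the same JVM is affected. That is the main reason the Proxy class and ProxySelector are usually the more flexible choice.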

I was doing it by using a headless browser provided by HtmlUnit but it takes time to load even simple webpages, so could you please suggest me other APIs (if any) that provide headless browsers that are fast in loading webpages. I don't want to open webpages that contain heavy AJAX or Javascript code. I just need to click on the forms button through the headless browser.

Unfortunately, the first alternatives I can think of are either HtmlUnit based (like JWebUnit or WebTest) or slower (Selenium, WebDriver - that you can run in headless mode). But maybe you could try HttpUnit if you don't need advanced JavaScript support.

草莓味的萝莉 2024-08-25 06:17:28

Yes, that is possible. You can find the configuration options for different network proxies here.

墨落成白 2024-08-25 06:17:28

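You can set up per-connection proxies. Here are examples with the Java 11 HttpClient and the legacy HttpURLConnection:

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.SocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;

public class ProxyExamples {

    // Java 11+ HttpClient: the proxy is chosen by a ProxySelector attached to the client.
    // Caveat: the JDK HttpClient reportedly uses only Proxy.Type.HTTP entries; a SOCKS entry
    // may be silently ignored. Verify on your JDK, or use legacyJavaHttp below for SOCKS.
    public static void java11Http(String url) throws Exception {
        ProxySelector proxySelector = new ProxySelector() {
            @Override
            public List<Proxy> select(URI uri) {
                // Same proxy for every URI; 127.0.0.1:1234 is a placeholder address.
                return List.of(new Proxy(Proxy.Type.SOCKS, new InetSocketAddress("127.0.0.1", 1234)));
            }
            @Override
            public void connectFailed(URI uri, SocketAddress sa, IOException ioe) {
                ioe.printStackTrace();
            }
        };

        HttpClient client = HttpClient.newBuilder()
                .proxy(proxySelector)
                .build();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(url))
                .build();

        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }

    // Legacy HttpURLConnection: the proxy is passed directly to openConnection.
    private static void legacyJavaHttp(String url) {
        SocketAddress proxyAddr = new InetSocketAddress("127.0.0.1", 1234);
        Proxy pr = new Proxy(Proxy.Type.SOCKS, proxyAddr);

        try {
            HttpURLConnection con = (HttpURLConnection) URI.create(url).toURL().openConnection(pr);
            con.setConnectTimeout(5000); // milliseconds
            con.setReadTimeout(5000);    // milliseconds
            con.connect();
            System.out.println(con.getResponseMessage());
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
}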

You can use either SOCKS or HTTP proxying.
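To make the choice between the two concrete, here is a small sketch of a helper that builds a java.net.Proxy from a string such as one read from a config file. The class name ProxyFactory, the method proxyFor and the "http"/"socks" labels are invented for this example. Only Proxy, Proxy.Type and InetSocketAddress come from the JDK. Note that Proxy.Type has only DIRECT, HTTP and SOCKS, so there is no separate value for SOCKS4 versus SOCKS5.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;

public class ProxyFactory {

    /**
     * Builds a Proxy for the given kind ("http" or "socks", case-insensitive).
     * The kind labels are a convention of this sketch, not a JDK format.
     */
    static Proxy proxyFor(String kind, String host, int port) {
        Proxy.Type type;
        if ("http".equalsIgnoreCase(kind)) {
            type = Proxy.Type.HTTP;
        } else if ("socks".equalsIgnoreCase(kind)) {
            type = Proxy.Type.SOCKS;
        } else {
            throw new IllegalArgumentException("Unknown proxy kind: " + kind);
        }
        return new Proxy(type, new InetSocketAddress(host, port));
    }

    public static void main(String[] args) {
        // Placeholder addresses; nothing is connected here.
        Proxy http = proxyFor("http", "127.0.0.1", 8080);
        Proxy socks = proxyFor("socks", "127.0.0.1", 1234);
        System.out.println(http.type() + " and " + socks.type());
    }
}
```

The resulting Proxy can be passed to URL.openConnection(Proxy), as in the HttpURLConnection example above.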

You can read more on Java proxying here: https://docs.oracle.com/javase/8/docs/technotes/guides/net/proxies.html
