来自 URL 的输入流

发布于 2024-11-27 21:59:23 字数 320 浏览 4 评论 0原文

如何从 URL 获取 InputStream?

例如,我想获取 URL www.somewebsite.com/a.txt 处的文件,并通过 servlet 将其作为 Java 中的 InputStream 读取。

我已经尝试过

InputStream is = new FileInputStream("wwww.somewebsite.com/a.txt");

,但得到的是一个错误:

java.io.FileNotFoundException

How do I get an InputStream from a URL?

for example, I want to take the file at the url wwww.somewebsite.com/a.txt and read it as an InputStream in Java, through a servlet.

I've tried

InputStream is = new FileInputStream("wwww.somewebsite.com/a.txt");

but what I got was an error:

java.io.FileNotFoundException

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(7

百变从容 2024-12-04 21:59:23

使用 java.net .URL#openStream() 具有正确的 URL(包括协议!)。例如,

InputStream input = new URL("http://www.somewebsite.com/a.txt").openStream();
// ...

另请参阅:

Use java.net.URL#openStream() with a proper URL (including the protocol!). E.g.

InputStream input = new URL("http://www.somewebsite.com/a.txt").openStream();
// ...

See also:

奢欲 2024-12-04 21:59:23

尝试:

final InputStream is = new URL("http://wwww.somewebsite.com/a.txt").openStream();

Try:

final InputStream is = new URL("http://wwww.somewebsite.com/a.txt").openStream();
故人的歌 2024-12-04 21:59:23

(a) www.somewebsite.com/a.txt 不是“文件 URL”。它根本不是一个 URL。如果您将 http:// 放在其前面,它将是一个 HTTP URL,这显然就是您的意图。

(b) FileInputStream 用于文件,而不是 URL。

(c) 从任何 URL 获取输入流的方法是通过 URL.openStream(),URL.getConnection().getInputStream(),< /code> 这是等效的,但您可能有其他原因来获取 URLConnection 并首先使用它。

(a) wwww.somewebsite.com/a.txt isn't a 'file URL'. It isn't a URL at all. If you put http:// on the front of it it would be an HTTP URL, which is clearly what you intend here.

(b) FileInputStream is for files, not URLs.

(c) The way to get an input stream from any URL is via URL.openStream(), or URL.getConnection().getInputStream(), which is equivalent but you might have other reasons to get the URLConnection and play with it first.

酒几许 2024-12-04 21:59:23

纯Java:

 urlToInputStream(url,httpHeaders);

我使用这种方法取得了一些成功。它处理重定向,并且可以将可变数量的HTTP 标头作为Map 传递。它还允许从 HTTP 重定向到 HTTPS

private InputStream urlToInputStream(URL url, Map<String, String> args) {
    HttpURLConnection con = null;
    InputStream inputStream = null;
    try {
        con = (HttpURLConnection) url.openConnection();
        con.setConnectTimeout(15000);
        con.setReadTimeout(15000);
        if (args != null) {
            for (Entry<String, String> e : args.entrySet()) {
                con.setRequestProperty(e.getKey(), e.getValue());
            }
        }
        con.connect();
        int responseCode = con.getResponseCode();
        /* By default the connection will follow redirects. The following
         * block is only entered if the implementation of HttpURLConnection
         * does not perform the redirect. The exact behavior depends to 
         * the actual implementation (e.g. sun.net).
         * !!! Attention: This block allows the connection to 
         * switch protocols (e.g. HTTP to HTTPS), which is <b>not</b> 
         * default behavior. See: https://stackoverflow.com/questions/1884230 
         * for more info!!!
         */
        if (responseCode < 400 && responseCode > 299) {
            String redirectUrl = con.getHeaderField("Location");
            try {
                URL newUrl = new URL(redirectUrl);
                return urlToInputStream(newUrl, args);
            } catch (MalformedURLException e) {
                URL newUrl = new URL(url.getProtocol() + "://" + url.getHost() + redirectUrl);
                return urlToInputStream(newUrl, args);
            }
        }
        /*!!!!!*/

        inputStream = con.getInputStream();
        return inputStream;
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

完整的调用示例

private InputStream getInputStreamFromUrl(URL url, String user, String passwd) throws IOException {
        String encoded = Base64.getEncoder().encodeToString((user + ":" + passwd).getBytes(StandardCharsets.UTF_8));
        Map<String,String> httpHeaders=new Map<>();
        httpHeaders.put("Accept", "application/json");
        httpHeaders.put("User-Agent", "myApplication");
        httpHeaders.put("Authorization", "Basic " + encoded);
        return urlToInputStream(url,httpHeaders);
    }

Pure Java:

 urlToInputStream(url,httpHeaders);

With some success I use this method. It handles redirects and one can pass a variable number of HTTP headers asMap<String,String>. It also allows redirects from HTTP to HTTPS.

private InputStream urlToInputStream(URL url, Map<String, String> args) {
    HttpURLConnection con = null;
    InputStream inputStream = null;
    try {
        con = (HttpURLConnection) url.openConnection();
        con.setConnectTimeout(15000);
        con.setReadTimeout(15000);
        if (args != null) {
            for (Entry<String, String> e : args.entrySet()) {
                con.setRequestProperty(e.getKey(), e.getValue());
            }
        }
        con.connect();
        int responseCode = con.getResponseCode();
        /* By default the connection will follow redirects. The following
         * block is only entered if the implementation of HttpURLConnection
         * does not perform the redirect. The exact behavior depends to 
         * the actual implementation (e.g. sun.net).
         * !!! Attention: This block allows the connection to 
         * switch protocols (e.g. HTTP to HTTPS), which is <b>not</b> 
         * default behavior. See: https://stackoverflow.com/questions/1884230 
         * for more info!!!
         */
        if (responseCode < 400 && responseCode > 299) {
            String redirectUrl = con.getHeaderField("Location");
            try {
                URL newUrl = new URL(redirectUrl);
                return urlToInputStream(newUrl, args);
            } catch (MalformedURLException e) {
                URL newUrl = new URL(url.getProtocol() + "://" + url.getHost() + redirectUrl);
                return urlToInputStream(newUrl, args);
            }
        }
        /*!!!!!*/

        inputStream = con.getInputStream();
        return inputStream;
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

Full example call

private InputStream getInputStreamFromUrl(URL url, String user, String passwd) throws IOException {
        String encoded = Base64.getEncoder().encodeToString((user + ":" + passwd).getBytes(StandardCharsets.UTF_8));
        Map<String,String> httpHeaders=new Map<>();
        httpHeaders.put("Accept", "application/json");
        httpHeaders.put("User-Agent", "myApplication");
        httpHeaders.put("Authorization", "Basic " + encoded);
        return urlToInputStream(url,httpHeaders);
    }
作妖 2024-12-04 21:59:23

您的原始代码使用 FileInputStream,它用于访问文件系统托管文件。

您使用的构造函数将尝试在当前工作目录(系统属性 user.dir 的值)的 www.somewebsite.com 子文件夹中查找名为 a.txt 的文件。您提供的名称将使用 File 类解析为文件。

URL 对象是解决这个问题的通用方法。您可以使用 URL 来访问本地文件,也可以使用网络托管资源。除了 http:// 或 https:// 之外,URL 类还支持 file:// 协议,因此您可以开始使用。

Your original code uses FileInputStream, which is for accessing file system hosted files.

The constructor you used will attempt to locate a file named a.txt in the www.somewebsite.com subfolder of the current working directory (the value of system property user.dir). The name you provide is resolved to a file using the File class.

URL objects are the generic way to solve this. You can use URLs to access local files but also network hosted resources. The URL class supports the file:// protocol besides http:// or https:// so you're good to go.

叶落知秋 2024-12-04 21:59:23

InputStream input = URI.create("http://www.somewebsite.com/a.txt").toURL().openStream();

增强whiskeysierra 的答案java.net.URL.URL(String) 构造函数是已弃用Java 20

InputStream input = URI.create("http://www.somewebsite.com/a.txt").toURL().openStream();

Enhancing whiskeysierra's answer, the java.net.URL.URL(String) constructor is deprecated since Java 20.

在风中等你 2024-12-04 21:59:23

这是读取给定网页内容的完整示例。
该网页是从 HTML 表单中读取的。我们使用标准 InputStream 类,但使用 JSoup 库可以更轻松地完成。

<dependency>
    <groupId>javax.servlet</groupId>
    <artifactId>javax.servlet-api</artifactId>
    <version>3.1.0</version>
    <scope>provided</scope>

</dependency>

<dependency>
    <groupId>commons-validator</groupId>
    <artifactId>commons-validator</artifactId>
    <version>1.6</version>
</dependency>  

这些是 Maven 依赖项。我们使用 Apache Commons 库来验证 URL 字符串。

package com.zetcode.web;

import com.zetcode.service.WebPageReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet(name = "ReadWebPage", urlPatterns = {"/ReadWebPage"})
public class ReadWebpage extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {

        response.setContentType("text/plain;charset=UTF-8");

        String page = request.getParameter("webpage");

        String content = new WebPageReader().setWebPageName(page).getWebPageContent();

        ServletOutputStream os = response.getOutputStream();
        os.write(content.getBytes(StandardCharsets.UTF_8));
    }
}

ReadWebPage servlet 读取给定网页的内容并将其以纯文本格式发送回客户端。读取页面的任务被委托给WebPageReader

package com.zetcode.service;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.stream.Collectors;
import org.apache.commons.validator.routines.UrlValidator;

public class WebPageReader {

    private String webpage;
    private String content;

    public WebPageReader setWebPageName(String name) {

        webpage = name;
        return this;
    }

    public String getWebPageContent() {

        try {

            boolean valid = validateUrl(webpage);

            if (!valid) {

                content = "Invalid URL; use http(s)://www.example.com format";
                return content;
            }

            URL url = new URL(webpage);

            try (InputStream is = url.openStream();
                    BufferedReader br = new BufferedReader(
                            new InputStreamReader(is, StandardCharsets.UTF_8))) {

                content = br.lines().collect(
                      Collectors.joining(System.lineSeparator()));
            }

        } catch (IOException ex) {

            content = String.format("Cannot read webpage %s", ex);
            Logger.getLogger(WebPageReader.class.getName()).log(Level.SEVERE, null, ex);
        }

        return content;
    }

    private boolean validateUrl(String webpage) {

        UrlValidator urlValidator = new UrlValidator();

        return urlValidator.isValid(webpage);
    }
}

WebPageReader 验证 URL 并读取网页内容。
它返回一个包含页面 HTML 代码的字符串。

<!DOCTYPE html>
<html>
    <head>
        <title>Home page</title>
        <meta charset="UTF-8">
    </head>
    <body>
        <form action="ReadWebPage">

            <label for="page">Enter a web page name:</label>
            <input  type="text" id="page" name="webpage">

            <button type="submit">Submit</button>

        </form>
    </body>
</html>

最后,这是包含 HTML 表单的主页。
这是摘自我关于此主题的教程

Here is a full example which reads the contents of the given web page.
The web page is read from an HTML form. We use standard InputStream classes, but it could be done more easily with JSoup library.

<dependency>
    <groupId>javax.servlet</groupId>
    <artifactId>javax.servlet-api</artifactId>
    <version>3.1.0</version>
    <scope>provided</scope>

</dependency>

<dependency>
    <groupId>commons-validator</groupId>
    <artifactId>commons-validator</artifactId>
    <version>1.6</version>
</dependency>  

These are the Maven dependencies. We use Apache Commons library to validate URL strings.

package com.zetcode.web;

import com.zetcode.service.WebPageReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet(name = "ReadWebPage", urlPatterns = {"/ReadWebPage"})
public class ReadWebpage extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {

        response.setContentType("text/plain;charset=UTF-8");

        String page = request.getParameter("webpage");

        String content = new WebPageReader().setWebPageName(page).getWebPageContent();

        ServletOutputStream os = response.getOutputStream();
        os.write(content.getBytes(StandardCharsets.UTF_8));
    }
}

The ReadWebPage servlet reads the contents of the given web page and sends it back to the client in plain text format. The task of reading the page is delegated to WebPageReader.

package com.zetcode.service;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.stream.Collectors;
import org.apache.commons.validator.routines.UrlValidator;

public class WebPageReader {

    private String webpage;
    private String content;

    public WebPageReader setWebPageName(String name) {

        webpage = name;
        return this;
    }

    public String getWebPageContent() {

        try {

            boolean valid = validateUrl(webpage);

            if (!valid) {

                content = "Invalid URL; use http(s)://www.example.com format";
                return content;
            }

            URL url = new URL(webpage);

            try (InputStream is = url.openStream();
                    BufferedReader br = new BufferedReader(
                            new InputStreamReader(is, StandardCharsets.UTF_8))) {

                content = br.lines().collect(
                      Collectors.joining(System.lineSeparator()));
            }

        } catch (IOException ex) {

            content = String.format("Cannot read webpage %s", ex);
            Logger.getLogger(WebPageReader.class.getName()).log(Level.SEVERE, null, ex);
        }

        return content;
    }

    private boolean validateUrl(String webpage) {

        UrlValidator urlValidator = new UrlValidator();

        return urlValidator.isValid(webpage);
    }
}

WebPageReader validates the URL and reads the contents of the web page.
It returns a string containing the HTML code of the page.

<!DOCTYPE html>
<html>
    <head>
        <title>Home page</title>
        <meta charset="UTF-8">
    </head>
    <body>
        <form action="ReadWebPage">

            <label for="page">Enter a web page name:</label>
            <input  type="text" id="page" name="webpage">

            <button type="submit">Submit</button>

        </form>
    </body>
</html>

Finally, this is the home page containing the HTML form.
This is taken from my tutorial about this topic.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文