How to load a large web page into a string

Posted 2024-12-03 09:21:24, 995 characters, 0 views, 0 comments

I'm new to Java and Android, but not to programming or HTTP. This HTTP GET method, mostly copied from other examples that use the Apache HTTP classes, retrieves only the first few K of a large web page. I checked that the page has no lines longer than 8192 bytes (could that even matter?), but for pages of around 40K I get back maybe 6K, maybe 20K. The number of bytes read does not seem to have a simple relationship with the total page size, with the page size modulo 8192, or with the page content.

Any ideas folks?

Thanks!

public static String myHttpGet(String url) throws Exception {
    BufferedReader in = null;
    try {
        HttpClient client = getHttpClient();
        HttpGet request = new HttpGet();
        request.setURI(new URI(url));
        HttpResponse response = client.execute(request);
        in = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));

        StringBuffer sbuffer = new StringBuffer("");
        String line = "";

        while ((line = in.readLine()) != null) {
            sbuffer.append(line + "\n");
        }
        in.close();

        String result = sbuffer.toString();
        return result;
    } finally {
        if (in != null) {
            try {
                in.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}

凉城 2024-12-10 09:21:24

No need to write your own HttpEntity-to-String code; try EntityUtils instead:

// this uses the charset the server encoded the entity in
String result = EntityUtils.toString(entity);
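
For readers who want to see roughly what reading a whole response body into a string involves, here is a minimal sketch using only `java.io`. It is not the Apache implementation: the class `StreamToString` and the helper `readAll` are hypothetical names, and the caller is assumed to already know the charset. In the snippet above, `entity` is presumably `response.getEntity()` from the question's code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Apache HttpClient.
final class StreamToString {

    // Reads the stream to its end in fixed-size chunks and decodes it
    // with the given charset. Unlike a readLine() loop, this keeps the
    // original line terminators and does not depend on line length.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[8192];
        // try-with-resources closes the reader (and the stream) exactly once.
        try (Reader reader = new InputStreamReader(in, charset)) {
            int n;
            // read() returns -1 only at end of stream; a short read is not EOF.
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a response body; a real caller would pass the
        // entity's content stream instead (assumption).
        byte[] page = "<html>\n<body>hello</body>\n</html>".getBytes(StandardCharsets.UTF_8);
        String text = readAll(new ByteArrayInputStream(page), StandardCharsets.UTF_8);
        System.out.println(text);
    }
}
```

Passing the charset explicitly avoids relying on the platform default, which is what the one-argument `InputStreamReader` constructor in the question does.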

蒗幽 2024-12-10 09:21:24

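It looks like the problem only occurs with pages from a certain website whose name starts with Goo... I don't have this problem with large pages from other sites. So the code is probably fine.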

About the author: 感性
