How to load a large web page into a string
I'm new to Java and Android, but not to programming or HTTP. This HTTP GET method, mostly copied from other examples that use the Apache HTTP classes, only retrieves the first few K of a large web page. I checked that the page has no lines longer than 8192 bytes (is that even a possible cause?), but for pages of around 40K I get back maybe 6K, maybe 20K. The number of bytes read does not seem to have a simple relationship with the total page size, with the page size modulo 8192, or with the page content.
Any ideas folks?
Thanks!
public static String myHttpGet(String url) throws Exception {
    BufferedReader in = null;
    try {
        HttpClient client = getHttpClient();
        HttpGet request = new HttpGet();
        request.setURI(new URI(url));
        HttpResponse response = client.execute(request);
        in = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));
        StringBuffer sbuffer = new StringBuffer("");
        String line = "";
        while ((line = in.readLine()) != null) {
            sbuffer.append(line + "\n");
        }
        in.close();
        String result = sbuffer.toString();
        return result;
    } finally {
        if (in != null) {
            try {
                in.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
Comments (2)
No need to write your own HttpEntity-to-String code; try EntityUtils instead, for example EntityUtils.toString(response.getEntity()).
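For readers without the Apache classes at hand, here is a rough JDK-only sketch of the same idea: drain the whole response stream into a string in fixed-size chunks, with an explicit charset, instead of going line by line. The class and method names (StreamToString, readFully) are made up for illustration and are not part of any library; the sketch assumes Java 7 or later and that the caller knows the page's charset.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Hypothetical helper for illustration; not part of Apache HttpClient.
class StreamToString {
    // Reads the stream until end-of-stream and decodes it with the given charset.
    // Line endings are preserved exactly as received.
    static String readFully(InputStream in, Charset charset) throws IOException {
        StringBuilder out = new StringBuilder();
        char[] buffer = new char[4096];
        // try-with-resources closes the reader (and the underlying stream) once.
        try (Reader reader = new InputStreamReader(in, charset)) {
            int n;
            while ((n = reader.read(buffer)) != -1) {
                out.append(buffer, 0, n);
            }
        }
        return out.toString();
    }
}
```

With the code from the question, this would be called on response.getEntity().getContent(). Passing the charset explicitly avoids depending on the platform default, which the InputStreamReader in the question falls back to.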
It looks as if the problem is with pages from a certain website starting Goo... I'm not having this problem with large pages from other sites. So the code is probably OK.
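One way to check whether a particular site really sends a short body is to count the raw bytes received and compare them with the length the server announced. The sketch below uses only the JDK; TruncationCheck, countBytes and verdict are hypothetical names. It assumes the announced length comes from the Content-Length header (with the Apache classes, HttpEntity.getContentLength(), which is negative when the length is unknown), and that the body is not compressed, since a compressed body would make the two numbers incomparable.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical diagnostic helper; names are illustrative only.
class TruncationCheck {
    // Drains the stream and returns the number of raw bytes read.
    static long countBytes(InputStream in) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;
        }
        return total;
    }

    // A negative declaredLength means the server did not announce a length.
    static String verdict(long declaredLength, long received) {
        if (declaredLength < 0) {
            return "unknown";
        }
        return received < declaredLength ? "truncated" : "complete";
    }
}
```

If the count matches the announced length but the resulting string is still short, the loss is more likely in the decoding or string-building step than in the transfer.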