Reading a GZIPInputStream line by line
I have a file in .gz format. The Java class for reading this file is GZIPInputStream.
However, this class doesn't extend Java's BufferedReader class, so I am not able to read the file line by line. I need something like this:
reader = new MyGZInputStream( some constructor of GZInputStream)
reader.readLine()...
I thought of creating my own class that extends Java's Reader or BufferedReader class and uses a GZIPInputStream as one of its fields.
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.Reader;
import java.util.zip.GZIPInputStream;

public class MyGZFilReader extends Reader {

    private GZIPInputStream gzipInputStream = null;
    char[] buf = new char[1024];

    @Override
    public void close() throws IOException {
        gzipInputStream.close();
    }

    public MyGZFilReader(String filename)
            throws FileNotFoundException, IOException {
        gzipInputStream = new GZIPInputStream(new FileInputStream(filename));
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        // TODO Auto-generated method stub
        return gzipInputStream.read((byte[])buf, off, len);
    }
}
But this doesn't work when I use it like this:
    BufferedReader in = new BufferedReader(
            new MyGZFilReader("F:/gawiki-20090614-stub-meta-history.xml.gz"));
    System.out.println(in.readLine());
Can someone advise how to proceed?
Comments (5)
The basic setup of decorators is like this:
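As a sketch of what such a chain of decorators can look like, using only JDK classes: each stream wraps the previous one, and the last wrapper supplies readLine(). The names GzipLines and open are illustrative, and UTF-8 in main is only an assumption about the file, not something known about it.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

// Illustrative helper; the class and method names are not from the original post.
final class GzipLines {

    /** Opens a gzip-compressed text file as a BufferedReader using the given encoding. */
    static BufferedReader open(String filename, Charset encoding) throws IOException {
        InputStream fileStream = new FileInputStream(filename);           // compressed bytes
        try {
            InputStream gzipStream = new GZIPInputStream(fileStream);     // decompressed bytes
            Reader decoder = new InputStreamReader(gzipStream, encoding); // bytes to chars
            return new BufferedReader(decoder);                           // adds readLine()
        } catch (IOException e) {
            // The gzip header could not be read, so release the file handle.
            fileStream.close();
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java GzipLines <file.gz>");
            return;
        }
        // UTF-8 is an assumption here; pass whatever encoding the file really uses.
        try (BufferedReader in = open(args[0], StandardCharsets.UTF_8)) {
            System.out.println(in.readLine());
        }
    }
}
```

Closing the BufferedReader closes the whole chain down to the FileInputStream, so the caller only has to manage the outermost object.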
The key issue in this snippet is the value of encoding. This is the character encoding of the text in the file. Is it "US-ASCII", "UTF-8", "SHIFT-JIS", "ISO-8859-9", …? There are hundreds of possibilities, and the correct choice usually cannot be determined from the file itself. It must be specified through some out-of-band channel.
For example, maybe it's the platform default. In a networked environment, however, this is extremely fragile. The machine that wrote the file might sit in the neighboring cubicle, but have a different default file encoding.
Most network protocols use a header or other metadata to explicitly note the character encoding.
In this case, it appears from the file extension that the content is XML. XML includes the "encoding" attribute in the XML declaration for this purpose. Furthermore, XML should really be processed with an XML parser, not as text. Reading XML line-by-line seems like a fragile, special case.
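To illustrate the parser route, here is a minimal sketch that feeds the decompressed bytes to the JDK's StAX parser instead of reading lines. Because the parser receives a byte stream rather than a Reader, it can pick up the encoding from the XML declaration itself. GzipXmlElements and countElements are made-up names, and counting elements is only a stand-in for whatever processing is actually needed.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Illustrative helper; the class and method names are not from the original post.
final class GzipXmlElements {

    /** Counts the start tags with the given local name in a gzip-compressed XML file. */
    static int countElements(String filename, String localName)
            throws IOException, XMLStreamException {
        try (InputStream file = new FileInputStream(filename);
             InputStream gzip = new GZIPInputStream(file)) {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // DTD processing is not needed for this sketch, so switch it off.
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            // A byte stream (not a Reader) lets the parser detect the encoding.
            XMLStreamReader xml = factory.createXMLStreamReader(gzip);
            try {
                int count = 0;
                while (xml.hasNext()) {
                    if (xml.next() == XMLStreamConstants.START_ELEMENT
                            && localName.equals(xml.getLocalName())) {
                        count++;
                    }
                }
                return count;
            } finally {
                xml.close(); // frees parser resources; the streams are closed by try-with-resources
            }
        }
    }
}
```

A call such as GzipXmlElements.countElements(path, "page") would then stream through the file without loading it into memory; the element name "page" is only a guess at what such a dump contains.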
Failing to explicitly specify the encoding is against the second commandment. Use the default encoding at your peril!
You can use the following method in a util class, and use it whenever necessary...
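One possible shape for such a utility method, using only the JDK, is sketched below; it is not necessarily the method this comment had in mind. GzipUtil and lines are hypothetical names, and the caller is expected to supply the encoding. It returns a lazy Stream of lines so that large files do not have to fit in memory.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;
import java.util.zip.GZIPInputStream;

// Hypothetical utility class; the names are illustrative only.
final class GzipUtil {

    private GzipUtil() {
        // static helpers only
    }

    /**
     * Returns the lines of a gzip-compressed text file as a lazy Stream.
     * The caller must close the Stream, which closes the underlying file.
     */
    static Stream<String> lines(Path file, Charset encoding) throws IOException {
        InputStream raw = Files.newInputStream(file);
        try {
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(new GZIPInputStream(raw), encoding));
            return reader.lines().onClose(() -> {
                try {
                    reader.close();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            // The file is not valid gzip, so release the handle before rethrowing.
            raw.close();
            throw e;
        }
    }
}
```

It is meant to be used in a try-with-resources statement, for example try (Stream<String> s = GzipUtil.lines(path, StandardCharsets.UTF_8)) { s.limit(5).forEach(System.out::println); }, so that the file is closed even if processing stops early.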
Here it is in one line:
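A sketch of the nested, single-statement form: all the wrappers are constructed inside one try-with-resources header. The wrapper class FirstLine and its method are illustrative, and UTF-8 is an assumption about the file's encoding.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

// Illustrative wrapper so the statement can be compiled and run on its own.
final class FirstLine {

    /** Returns the first line of a gzip-compressed text file, or null if it is empty. */
    static String firstLine(String filename) throws IOException {
        // UTF-8 is assumed; substitute the file's real encoding.
        try (BufferedReader br = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new FileInputStream(filename)), StandardCharsets.UTF_8))) {
            return br.readLine();
        }
    }
}
```

For the file from the question this would be called as FirstLine.firstLine("F:/gawiki-20090614-stub-meta-history.xml.gz").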