How do I get the filename of the file inside a gzip in Java?
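import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPInputStream;

int BUFFER_SIZE = 4096;
byte[] buffer = new byte[BUFFER_SIZE];
// try-with-resources closes both streams even if an exception is thrown
try (InputStream input = new GZIPInputStream(new FileInputStream("a_gunzipped_file.gz"));
     OutputStream output = new FileOutputStream("current_output_name")) {
    // read() returns -1 once the end of the compressed stream is reached
    int n = input.read(buffer, 0, BUFFER_SIZE);
    while (n >= 0) {
        output.write(buffer, 0, n);
        n = input.read(buffer, 0, BUFFER_SIZE);
    }
} catch (IOException e) {
    System.out.println("error: \n\t" + e.getMessage());
}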
Using the above code I can successfully extract a gzip's contents, although the extracted file's name will, as expected, always be current_output_name
(I know that's because I declared it that way in the code). My problem is that I don't know how to get the file's filename while it is still inside the archive.
Though java.util.zip provides ZipEntry, I couldn't use it on gzip files.
Any alternatives?
Comments (5)
I kind of agree with Michael Borgwardt's reply, but it is not entirely true: the gzip file specification includes an optional file name stored in the header of the .gz file. Sadly, there is no way (as far as I know) of getting that name in current Java (1.6). As seen in the header-reading method (readHeader) of the GZIPInputStream implementation in OpenJDK, they skip over the file name instead of reading it.
I have modified the GZIPInputStream class to get the optional filename out of the gzip archive (I'm not sure whether I am allowed to do that) (download the original version from here). You only need to add a member String filename; to the class and change the header-reading code so that it stores the name instead of skipping it.
It worked for me.
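As an alternative to patching GZIPInputStream, the header can be parsed by hand. The following is only a minimal sketch, assuming the header layout described in RFC 1952 (fixed 10-byte header, then the optional FEXTRA field, then the optional zero-terminated FNAME field). The class name GzipHeaderName and the method readOriginalName are hypothetical names chosen for illustration; they are not part of the JDK.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: reads the optional FNAME field of a gzip header. */
final class GzipHeaderName {
    private static final int FEXTRA = 0x04; // FLG bit 2: extra field present
    private static final int FNAME = 0x08;  // FLG bit 3: original file name present

    /** Returns the name stored in the header, or null if the header has none. */
    static String readOriginalName(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        // Fixed part of the header: ID1 ID2 CM FLG MTIME(4 bytes) XFL OS
        if (in.readUnsignedByte() != 0x1f || in.readUnsignedByte() != 0x8b) {
            throw new IOException("not a gzip stream");
        }
        if (in.readUnsignedByte() != 8) {
            throw new IOException("unsupported compression method");
        }
        int flags = in.readUnsignedByte();
        in.readFully(new byte[6]); // skip MTIME, XFL and OS
        if ((flags & FEXTRA) != 0) {
            // XLEN is a little-endian 16-bit length, followed by XLEN bytes
            int xlen = in.readUnsignedByte() | (in.readUnsignedByte() << 8);
            in.readFully(new byte[xlen]);
        }
        if ((flags & FNAME) == 0) {
            return null; // the compressor did not store a name
        }
        // The name is ISO 8859-1 text terminated by a zero byte
        ByteArrayOutputStream name = new ByteArrayOutputStream();
        for (int b = in.readUnsignedByte(); b != 0; b = in.readUnsignedByte()) {
            name.write(b);
        }
        return new String(name.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println(readOriginalName(in));
        }
    }
}
```

Because the name is optional, a null result is normal: files written by java.util.zip.GZIPOutputStream, for instance, carry no name. In that case the caller has to fall back on the name of the .gz file itself.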
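Apache Commons Compress offers two options for obtaining the filename:
With metadata (the sample code targeted Java 7+): read the name stored in the gzip header, which GzipCompressorInputStream exposes through getMetaData().
With "the convention": derive the name from the name of the .gz file itself, which is what GzipUtils.getUncompressedFilename does.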
Actually, the GZIP file format allows the original filename to be specified in the header of each member. When the FLG.FNAME bit is set in the header flags, the header carries the original name as a zero-terminated string. I do not see a way to do this in the Java libraries, though.
http://www.gzip.org/zlib/rfc-gzip.html#specification
Following the answers above, here is an example that creates a file "myTest.csv.gz" containing a file "myTest.csv". Note that you can't change the internal file name, and you can't add more files to the .gz file.
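A rough sketch of how such a file could be produced with only the JDK follows. It assumes the RFC 1952 layout (header with the FNAME flag, raw deflate data, then CRC-32 and uncompressed size in little-endian order) and writes the header by hand, since java.util.zip.GZIPOutputStream does not, as far as I know, offer a way to set the name. NamedGzipWriter and its methods are hypothetical names used for illustration only.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

/** Hypothetical helper: writes a single-member gzip stream that stores a file name. */
final class NamedGzipWriter {
    static void write(OutputStream out, String originalName, byte[] data) throws IOException {
        // Header: magic bytes, CM=8 (deflate), FLG=FNAME, MTIME=0, XFL=0, OS=255 (unknown)
        out.write(new byte[] {0x1f, (byte) 0x8b, 8, 0x08, 0, 0, 0, 0, 0, (byte) 0xff});
        // FNAME field: ISO 8859-1 bytes followed by a zero terminator
        out.write(originalName.getBytes(StandardCharsets.ISO_8859_1));
        out.write(0);

        // Body: raw deflate data (nowrap = true, so no zlib wrapper is added)
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        DeflaterOutputStream body = new DeflaterOutputStream(out, deflater);
        body.write(data);
        body.finish(); // flushes the compressed data without closing 'out'
        deflater.end();

        // Trailer: CRC-32 and length of the uncompressed data, both little-endian
        CRC32 crc = new CRC32();
        crc.update(data);
        writeIntLE(out, (int) crc.getValue());
        writeIntLE(out, data.length);
    }

    private static void writeIntLE(OutputStream out, int value) throws IOException {
        out.write(value & 0xff);
        out.write((value >>> 8) & 0xff);
        out.write((value >>> 16) & 0xff);
        out.write((value >>> 24) & 0xff);
    }

    public static void main(String[] args) throws IOException {
        byte[] csv = "id,name\n1,example\n".getBytes(StandardCharsets.UTF_8);
        try (OutputStream out = new FileOutputStream("myTest.csv.gz")) {
            write(out, "myTest.csv", csv);
        }
    }
}
```

The result is still a single compressed stream, so the point about not being able to add more files applies here too. The sketch holds the whole payload in memory for brevity; a streaming version would update the CRC while copying.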
Gzip is purely compression. There is no archive, it's just the file's data, compressed.
The convention is for gzip to append .gz to the filename, and for gunzip to remove that extension. So, logfile.txt becomes logfile.txt.gz when compressed, and again logfile.txt when it's decompressed. If you rename the file, the name information is lost.
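The naming convention can be applied in code when picking the output name. This is a small sketch under the assumption that only the .gz and .tgz suffixes matter; GzipNameConvention and uncompressedName are hypothetical names, not JDK APIs.

```java
import java.util.Locale;

/** Hypothetical helper: derives the output name the way gunzip's convention does. */
final class GzipNameConvention {
    static String uncompressedName(String gzName) {
        String lower = gzName.toLowerCase(Locale.ROOT);
        // ".tgz" is the customary shorthand for ".tar.gz"
        if (lower.endsWith(".tgz") && gzName.length() > 4) {
            return gzName.substring(0, gzName.length() - 4) + ".tar";
        }
        // Plain case: strip the trailing ".gz"
        if (lower.endsWith(".gz") && gzName.length() > 3) {
            return gzName.substring(0, gzName.length() - 3);
        }
        // No known suffix: nothing to recover from the name
        return gzName;
    }
}
```

With this, uncompressedName("logfile.txt.gz") gives "logfile.txt", while a renamed file such as "renamed.bin" comes back unchanged, which is exactly the case where the name information is lost.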